How SIP makes phone calls happen

Posted on May 31, 2020

Every phone call consists of two components, signaling and media. Signaling does the work of establishing, maintaining, and tearing down the call. Media is the actual call audio. On a VoIP connection, media is broken up into digital packets for easy transportation between endpoints (phones). Signaling is the foundation of your phone calls. There’s much that has to go right for it to work. There are a few different signaling protocols, but we are going to be looking at Session Initiation Protocol (SIP) in this post.

SIP is a protocol used to establish connections between two or more endpoints. SIP signaling has a few jobs. First, it sets up the call: when you dial a number, your phone system sends a SIP packet to your carrier. That SIP packet contains all of the data necessary to create the call to your new prospect. The SIP packet responsible for creating the call is known as the INVITE. Your carrier treats the INVITE as notification of an intended call and performs a quick LRN (Location Routing Number) lookup, if applicable, to find which network currently serves the number in the Request URI of the SIP packet. The LRN procedure is critical for North American Numbering Plan (NANP) phone numbers in these days of number portability, because a number can be on one carrier at 9 a.m. and on another a few hours later. Within about half a second your carrier has figured out where your call needs to go and forwards the INVITE, along with other information, toward the carrier serving the number you dialed.
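
To make the INVITE a little more concrete, here is a minimal Python sketch of what such a packet might look like as text. The function name build_invite, the example.com addresses, the phone numbers, and the Call-ID, branch, and tag values are all made up for illustration; a real phone system generates these itself and adds more headers than shown here.

```python
def build_invite(caller, callee, local_host, call_id, branch, tag, sdp=""):
    """Return a minimal SIP INVITE as text (illustrative, not a full stack)."""
    headers = [
        # Request line: the Request URI names the number being dialed.
        f"INVITE sip:{callee} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_host};branch={branch}",
        "Max-Forwards: 70",
        f"From: <sip:{caller}>;tag={tag}",
        f"To: <sip:{callee}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{caller.split('@')[0]}@{local_host}>",
    ]
    if sdp:
        headers.append("Content-Type: application/sdp")
    headers.append(f"Content-Length: {len(sdp.encode('utf-8'))}")
    # Headers end with a blank line; the optional SDP body follows.
    return "\r\n".join(headers) + "\r\n\r\n" + sdp


if __name__ == "__main__":
    print(build_invite(
        caller="+15551230001@pbx.example.com",      # hypothetical caller
        callee="+15551230002@carrier.example.com",  # hypothetical dialed number
        local_host="192.0.2.10",
        call_id="a84b4c76e66710",
        branch="z9hG4bK776asdhds",
        tag="1928301774",
    ))
```

The first line of the output is the part your carrier reads to decide where the call needs to go.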

When your INVITE reaches the destination carrier, that carrier typically sends back a provisional response (a ‘1xx’ in SIP), which means, “Hold on a sec, I’m looking for that number on my network, and while I do you won’t be billed.”
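
SIP groups its responses by the first digit of the status code, which is why a whole family of provisional responses can be written as ‘1xx’. A small Python sketch of that grouping follows; the function names are hypothetical.

```python
RESPONSE_CLASSES = {
    1: "provisional",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
    6: "global failure",
}


def response_class(status_code):
    """Map a SIP status code (100-699) to its response class."""
    if not 100 <= status_code <= 699:
        raise ValueError(f"not a SIP status code: {status_code}")
    return RESPONSE_CLASSES[status_code // 100]


def is_final(status_code):
    """Anything outside the 1xx range ends the wait for an answer."""
    return response_class(status_code) != "provisional"
```

For example, response_class(180) is "provisional" (the phone is ringing), while response_class(200) is "success".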

When your INVITE arrives at its destination, a few lines of information are riding along in the body of the SIP packet. These lines are written in the Session Description Protocol (SDP), and they make the introduction in terms of how the media or “meat” of the call should be set up, including the media ports to use and the audio codecs the sender prefers.
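
As a rough illustration, the Python sketch below pulls the media port and the codec preference order out of an SDP body. The sample SDP, its addresses and port, and the function name parse_audio_offer are invented for the example, and the parser only handles a single audio line.

```python
SAMPLE_SDP = """v=0
o=- 1234 1234 IN IP4 192.0.2.10
s=-
c=IN IP4 192.0.2.10
t=0 0
m=audio 49170 RTP/AVP 0 8 18
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:18 G729/8000
"""


def parse_audio_offer(sdp):
    """Return the audio port and the codecs in the sender's preferred order."""
    port = None
    payload_types = []
    names = {}
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m=audio "):
            # m=audio <port> <transport> <payload type> <payload type> ...
            fields = line[2:].split()
            port = int(fields[1])
            payload_types = fields[3:]
        elif line.startswith("a=rtpmap:"):
            # a=rtpmap:<payload type> <codec name>/<clock rate>
            pt, encoding = line[len("a=rtpmap:"):].split(None, 1)
            names[pt] = encoding.split("/")[0]
    # Fall back to the bare payload number when no rtpmap line names it.
    codecs = [names.get(pt, pt) for pt in payload_types]
    return {"port": port, "codecs": codecs}


if __name__ == "__main__":
    print(parse_audio_offer(SAMPLE_SDP))
```

The order of the numbers on the m= line is the sender's order of preference, so the first codec listed is the one it would most like to use.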

When your call is finally answered, whether by a person, voicemail, or an IVR robot, a ‘200’ response is sent back to your system indicating that your call has been answered. The response also carries its own SDP parameters that state, “Here’s how I’m willing to talk,” to finalize negotiation of the call. And guess what, there’s one more step in establishing a call. The calling party (you) sends back an acknowledgement (‘ACK’) confirming that it received the ‘200’. This ensures that you didn’t drop off in the middle of call negotiation.
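
The “Here’s how I’m willing to talk” step can be sketched as picking a codec both sides understand. This Python example is a simplification: it assumes the answering side keeps the caller's preference order and settles on a single codec, and the function name choose_codec is hypothetical. Real answers can list more than one codec.

```python
def choose_codec(offered, supported):
    """Pick the first offered codec that the answering side also supports.

    offered   -- codec names from the caller, in order of preference
    supported -- codec names the answering side is able to use
    Returns the chosen codec name, or None if there is no overlap.
    """
    supported_upper = {name.upper() for name in supported}
    for codec in offered:
        if codec.upper() in supported_upper:
            return codec
    return None
```

With an offer of PCMU, PCMA, G729 and an answering side that only supports G729 and PCMA, the result is PCMA, because it comes first in the caller's list. If nothing overlaps, there is no way to carry the audio and the call cannot be set up as offered.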

So, in about a second (not including how long it took for the other end to pick up the phone), the call is established: the INVITE goes out with Request URI (Uniform Resource Identifier), To, From, and P-Asserted-Identity or Remote-Party-ID information, the destination is found, and call parameters are negotiated and established. At this point in the call, media takes over. Once signaling has done its job, media flows between the established ports as a series of digital packets. And now it’s a question of quality. Phone calls are a real-time communication medium. Unlike other Internet-based connections that also transport information in a series of bite-sized packets, if an audio packet gets dropped because of a temporarily dropped connection or lag, it can’t be resent. A resent packet would arrive late and out of order, and your syllables would be scrambled.

To boost your quality, it’s best to stay away from Internet service aggregators. These are companies that resell service from a bunch of providers, depending on who is cheapest at the moment. Instead, you can cut out a transmission step and deal directly with the end provider. It’s the same principle we talked about in this post about audio codec transcoding.

While media does its thing, SIP has its feet up but still checks in every once in a while to make sure the call is still happening. These check-ins are governed by Session Timers. You can set the length of Session Timers in your phone system programming. Session Timers are important because they make sure calls are ended if your network experiences any hiccups or if a phone becomes unresponsive and leaves a call hanging. They do this by sending a re-INVITE at pre-programmed intervals. If the call is still on, the other end may send back a ‘100’ and then answers with a ‘200’, and the side that sent the re-INVITE responds with an ‘ACK’. If any of those legs don’t happen, the SIP stack tells your phone system that the transaction has failed. At this point your system MUST send a ‘BYE’ to officially end the call dialog.
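
Here is a rough Python sketch of that check-in logic. It assumes the refresh goes out halfway through the negotiated session interval, which is the commonly recommended timing as far as I know, and it follows the simple rule described above: a refresh that succeeds keeps the call up, anything else ends it. The function names and return strings are hypothetical, and real SIP stacks treat some failure responses more gently than this.

```python
def next_refresh_at(session_interval_s, last_refresh_s=0.0):
    """Time (in seconds) at which to send the next re-INVITE.

    Assumes the refresh is sent once half the session interval has elapsed.
    """
    return last_refresh_s + session_interval_s / 2


def after_refresh(final_status):
    """Decide what to do once the re-INVITE transaction finishes.

    final_status -- final response code, or None if nothing came back
    """
    if final_status is not None and 200 <= final_status < 300:
        return "send ACK, keep call"
    return "send BYE, end call"
```

With a Session Timer of 1800 seconds, next_refresh_at(1800) puts the first check-in at 900 seconds into the call. If that check-in gets no answer at all, after_refresh(None) says to send the ‘BYE’.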

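PRO TIP: If your calls are being dropped at 32 seconds, you’ll often find that your network is either not sending the ‘ACK’ or isn’t delivering the ‘ACK’ to your phone system. The side that answered keeps resending its ‘200’ while it waits for the ‘ACK’, and with default SIP timers it gives up after 32 seconds.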
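
The 32-second figure falls out of SIP's default timers: the answering side resends its ‘200’ at intervals that start at T1 (500 ms by default), double each time up to T2 (4 seconds), and stop after 64 times T1. The Python sketch below works through that arithmetic; the function name is hypothetical and the timer values are the protocol defaults, which a phone system may override.

```python
T1 = 0.5  # default SIP round-trip estimate, in seconds
T2 = 4.0  # default cap on the retransmission interval, in seconds


def ok_retransmit_times(t1=T1, t2=T2):
    """Return (resend times, give-up time) for an unacknowledged '200'.

    Times are seconds after the '200' was first sent.
    """
    give_up = 64 * t1
    times = []
    interval = t1
    elapsed = interval
    while elapsed < give_up:
        times.append(elapsed)
        # The wait doubles after each resend until it reaches the cap.
        interval = min(interval * 2, t2)
        elapsed += interval
    return times, give_up


if __name__ == "__main__":
    times, give_up = ok_retransmit_times()
    print("resends at:", times)
    print("gives up at:", give_up, "seconds")
```

If a packet capture shows the same ‘200’ repeated on this kind of schedule followed by a ‘BYE’ at about the 32-second mark, the missing ‘ACK’ is the place to start looking.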

Now that you know how phone calls work using SIP, you can better understand how to manipulate your calls using the signaling fields for things like marketing (for starters). And that’s when your phone system transforms from a phone system into an empowering business tool.