Signaling and media: How SIP makes phone calls happen

Posted on February 12, 2014 by

If you hang out around the Internet doing research on phones long enough, you’re going to hear the terms “signaling” and “media”. It’s pretty obvious what each term means, but how do they make calls happen?


Every phone call consists of two components.

Signaling does the work of establishing, maintaining, and tearing down the call. Media is the actual call audio. On a VoIP connection, media is broken up into digital packets for easy transportation between endpoints (phones) based on parameters agreed upon during signaling (more on that later).

Signaling is the foundation of your phone calls, and there’s a lot that has to go right for it to work. There are a few different signaling protocols, but for this post I’ll be talking about SIP, not only because, as a SIP trunking provider, that’s where our expertise lies, but also because of the prom queen-like popularity of SIP trunking these days. As you may have read, SIP stands for Session Initiation Protocol, and it’s a technology used to establish connections between two or more endpoints.

SIP signaling has a few jobs.

First, call signaling sets up the call. When you dial a number, your phone system sends a SIP packet to your carrier. That SIP packet contains all the data necessary to create the call to your new prospect.

Sample SIP packet:

Via: SIP/2.0/UDP;branch=z9hG4bK3iqjm610fo1gaj8et4s1.1
From: ;tag=SDq7hif01-gK0a4605ac
Call-ID: SDq753ba286740f59780e6hif01-eb35742ad7223
Max-Forwards: 69
Contact: sip:17024797000@
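
To make the structure of those lines concrete, here is a minimal Python sketch that splits the sample above into header names and values. The function name parse_sip_headers is made up for illustration, the sample is copied exactly as printed above (including the parts where addresses are missing), and a real SIP stack handles much more than this (repeated headers, folded lines, compact header names).

```python
def parse_sip_headers(raw: str) -> dict:
    """Split 'Name: value' lines into a dict (hypothetical helper).

    Simplification: if a header repeats, the last value wins.
    """
    headers = {}
    for line in raw.splitlines():
        # Split on the first colon only, so values like 'sip:1702...' stay intact.
        name, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or not a header line
        headers[name.strip()] = value.strip()
    return headers


# The sample from this post, reproduced as printed.
sample = """Via: SIP/2.0/UDP;branch=z9hG4bK3iqjm610fo1gaj8et4s1.1
From: ;tag=SDq7hif01-gK0a4605ac
Call-ID: SDq753ba286740f59780e6hif01-eb35742ad7223
Max-Forwards: 69
Contact: sip:17024797000@"""

headers = parse_sip_headers(sample)
print(headers["Call-ID"])
print(headers["Max-Forwards"])
```

Each header is just a name, a colon, and a value, which is why SIP traces are readable in a plain text editor.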

The SIP packet responsible for creating the call is known as the INVITE. Your carrier treats the INVITE as notification of an intended call and performs a quick LRN (Location Routing Number) lookup, if applicable, to find the carrier that currently serves the number you requested in the “Request” portion of the SIP packet. The LRN lookup is critical for NANP phone numbers in these days of number portability, as a number can be on one carrier at 9 AM and on another by lunchtime. Without it, your phone call might be sent to the carrier that originally hosted the number, even if it no longer does. If that happens, whoever answers will say, “sorry, wrong number” and hang up in your face: sales call over.

Within about half a second your carrier has figured out where your call needs to go and has sent your SIP packet, INVITE and all, to the number you dialed.

When your INVITE reaches the destination carrier, that carrier typically sends back a provisional response (‘1xx’ in SIP), which means, “Hold on a sec, I’m looking for that number on my network, and while I do you won’t be billed.”

When your INVITE arrives at its destination, a few lines of information riding along with your SIP packet, called Session Description Protocol (SDP), make the introduction for how the media, or “meat,” of the call should be set up, including the media ports to use and the audio codecs the sender is partial to.
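
For a feel of what those SDP lines look like, here is a small Python sketch that pulls the audio port and codec list out of an SDP body. The SDP text below is an invented example (the IP address, port, and codec list are assumptions, not taken from a real call), and parse_audio_offer is a hypothetical helper name. It only reads the m=audio line and the a=rtpmap attributes and ignores everything else.

```python
# Invented SDP body for illustration only; 192.0.2.10 is a documentation address.
EXAMPLE_SDP = """v=0
o=- 0 0 IN IP4 192.0.2.10
s=-
c=IN IP4 192.0.2.10
t=0 0
m=audio 49170 RTP/AVP 0 8 18
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:18 G729/8000"""


def parse_audio_offer(sdp: str) -> dict:
    """Return the audio port and the offered codecs in order of preference."""
    port = None
    payload_types = []
    codec_names = {}
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m=audio "):
            # m=<media> <port> <transport> <payload type> <payload type> ...
            fields = line[2:].split()
            port = int(fields[1])
            payload_types = fields[3:]
        elif line.startswith("a=rtpmap:"):
            # a=rtpmap:<payload type> <codec name>/<clock rate>
            payload_type, encoding = line[len("a=rtpmap:"):].split(" ", 1)
            codec_names[payload_type] = encoding
    # Fall back to the bare payload number if no rtpmap line names it.
    codecs = [codec_names.get(pt, pt) for pt in payload_types]
    return {"port": port, "codecs": codecs}


offer = parse_audio_offer(EXAMPLE_SDP)
print(offer)
```

The order of payload types on the m= line is the sender's order of preference, which is the “partial to” part of the negotiation.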

When your call is finally answered, whether by a person, voicemail, or an IVR robot, a ‘200’ response is sent back to your system indicating that your call has been answered. The response also carries additional SDP parameters that state, “Here’s how I’m willing to talk” to finalize negotiations of the call.

And guess what, there’s one more step in establishing a call. The calling party (you) sends back an acknowledgement (‘ACK’) that you’ve received the ‘200’. This ensures that you didn’t fall off mid-negotiation.

PRO TIP: If your calls are being dropped at 32 seconds, you’ll often find that your network is either not sending the ‘ACK’ or it isn’t delivering the ‘ACK’ to your phone system.
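
Why 32 seconds? A likely explanation, assuming the default timers from the SIP specification (RFC 3261), is that the answering side keeps resending its ‘200’ until it sees an ‘ACK’, and gives up after 64 times a base timer called T1, which defaults to half a second. The Python sketch below works through that arithmetic. The function name is hypothetical, and real systems may be configured with different timer values.

```python
T1 = 0.5  # seconds; RFC 3261 default base timer (assumed, may be tuned)
T2 = 4.0  # seconds; RFC 3261 default cap on the retransmit interval

GIVE_UP_AFTER = 64 * T1  # how long an unacknowledged '200' keeps being resent


def retransmit_schedule(t1: float = T1, t2: float = T2) -> list:
    """Seconds after the first '200' at which it is resent while no ACK arrives.

    The interval starts at t1 and doubles each time until it reaches t2.
    """
    give_up = 64 * t1
    times = []
    elapsed, interval = 0.0, t1
    while elapsed + interval < give_up:
        elapsed += interval
        times.append(elapsed)
        interval = min(interval * 2, t2)
    return times


print(GIVE_UP_AFTER)
print(retransmit_schedule())
```

So if you see the same ‘200’ repeated in a packet capture at widening intervals and then a ‘BYE’ around the 32 second mark, a missing ‘ACK’ is the first thing to check.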

So in about a second (not including how long it took for the other end to pick up the phone), the call is established: the INVITE goes out with Request-URI, To, From, and P-Asserted-Identity or Remote-Party-ID information, the destination is found, and call parameters are negotiated and established.
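
The setup steps above can be sketched as a simple check on the order of messages. This is a deliberately simplified Python illustration: call_established is a made-up function, messages are reduced to bare strings, and it ignores retransmissions, failures, and everything a real SIP stack tracks.

```python
def call_established(messages: list) -> bool:
    """True if the messages follow INVITE, zero or more 1xx, 200, ACK."""
    if len(messages) < 3 or messages[0] != "INVITE":
        return False
    if messages[-2:] != ["200", "ACK"]:
        return False
    # Anything between the INVITE and the 200 must be a provisional 1xx.
    provisional = messages[1:-2]
    return all(
        len(m) == 3 and m.isdigit() and m.startswith("1") for m in provisional
    )


print(call_established(["INVITE", "100", "180", "200", "ACK"]))
print(call_established(["INVITE", "100", "200"]))
```

The second call fails the check because the ‘ACK’ never arrives, which is the situation described in the PRO TIP above.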

Then media takes over.

Once signaling does its job, media flows between the established ports as a series of digital packets. And now it’s a question of quality.

Phone calls are a real-time communication medium. So unlike other Internet-based connections that also transport information in a series of bite-sized packets, if an audio packet gets dropped because of a temporarily dropped connection or lag, it can’t be resent. A resent packet would arrive late and out of order, and then your syllables will be bledscram.

To boost your quality, stay away from Internet service aggregators, the ones reselling service from a bunch of providers depending on who’s cheaper. Cut out a transmission step and deal directly with the end provider. It’s the same principle we talked about in this post about audio codec transcoding.

While media does its thing, SIP has its feet up but still checks in every once in a while to make sure the call is still happening.

These check-ins are called Session Timers. You can set the length of Session Timers in your phone system programming. Session Timers are important because they make sure calls are ended in the event your network experiences any hiccups or your phone becomes unresponsive and fails to end calls correctly. They do this by sending a re-INVITE at preprogrammed intervals. If the call is still on, the other end sends back a ‘100’ and then a ‘200’, to which the side that sent the re-INVITE responds with ‘ACK’. If any of those legs don’t happen, the SIP stack tells your phone system the transaction has failed, at which point your system MUST send a ‘BYE’ to officially end the call dialog.
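
To put rough numbers on those intervals, here is a short Python sketch based on the recommendations in the Session Timers specification (RFC 4028) as I understand them: the refreshing side sends its refresh once half the session interval has passed, and the other side hangs up shortly before the interval expires if no refresh arrived, with a margin of 32 seconds or one third of the interval, whichever is smaller. The function name and the 1800 second value are examples only; your phone system may use different settings.

```python
def session_timer_plan(session_expires: int) -> dict:
    """Seconds into the interval at which each side acts (hypothetical helper).

    session_expires is the negotiated session interval in seconds.
    """
    # Refresher: send the refresh at the halfway point.
    refresh_at = session_expires / 2
    # Other side: send BYE slightly before expiry if no refresh was seen.
    bye_margin = min(32, session_expires / 3)
    return {
        "refresh_at": refresh_at,
        "bye_at": session_expires - bye_margin,
    }


plan = session_timer_plan(1800)  # 1800 seconds is an example value
print(plan)
```

With an 1800 second interval, the refresh goes out at the 15 minute mark, which leaves plenty of time to retry before the session is considered dead.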

PRO TIP: Set up Session Timers between your PBX and carrier, not your PBX and your phones. Some phones are buggy and get confused by Session Timers to the point where they drop calls.

Now that you know how phone calls work using SIP, you can better understand how to manipulate your calls using the signaling fields for things like marketing (for starters). And that’s when your phone system goes from being a phone system to being an empowering business tool.