3
EVOLUTION OF VoIP SIGNALING PROTOCOLS¹
This chapter reviews the existing and emerging VoIP signaling and call control
protocols. In PSTN networks, ISUP (ISDN user part) and TCAP (transaction
capabilities application part) messages of the SS7 protocol [1] are commonly
used for call control and interworking of services.
The first generation (released in 1996) of VoIP signaling and media control
protocols, such as ITU-T’s H.225/H.245, defined under ITU-T’s H.323 umbrella
protocol [2], was intended to offer LAN-based real-time VoIP services. These
protocols already had the proper ingredients (such as support of ISUP
messaging for call control) to support interworking with PSTN networks as
well. Consequently, there was a flurry of networking activities to deliver
VoIP services over LANs or within enterprises and to offer long-haul
(inter-LATA and international) transport of VoIP. The latter is also known as
cheap and wireless quality long-distance voice service over wireline network
using IP. However, the telecom service providers found the following two
problems with version 1 of the H.323 protocol:
a. Many of the desired and advanced PSTN-domain call features and services
could not be easily implemented using H.323v1 because of its lack of
openness (i.e., all of the procedures are internally defined), and
b. Scalable implementation was neither feasible nor cost-effective because it
needed call-stateful proxies.
1 The ideas and viewpoints presented here belong solely to Bhumip Khasnabish, Massachusetts,
USA.
Implementing Voice over IP. Bhumip Khasnabish
Copyright © 2003 John Wiley & Sons, Inc.
ISBN: 0-471-21666-6
adaptation/translation gateways (MGs). It also recommends a protocol called
MGCP (media gateway control protocol, RFC 2705, 1999), which was the
result of a merger of SGCP (simple gateway control protocol) and IPDC (IP
device control) protocol. MGCP supports PSTN evolution by allowing inter-
working with circuit-switched networks and devices (analog and digital POTS
phones) via the following predefined endpoints: (a) access and residential GWs,
and integrated network access server and VoIP GWs; (b) GWs supporting
ISUP and multifrequency-type trunks; and (c) announcement servers and net-
work access servers.
In order to provide seamless interoperability of call and service control
between PSTN and next-generation (packet-based) network domains, the
MGC needs to exchange control messages reliably and securely with the SS7
network via the signaling gateway (SG), which can use the SCTP protocol
(RFC 2960), as discussed later. Note that in the PSTN, the call control and
signaling intelligence reside in the SS7 network.
MGCP is currently enjoying the widespread approval of cable TV (CATV)-
based VoIP service providers (e.g., see PKT-SP-EC-MGCP-I04-011221.pdf at
www.packetcable.com/specifications/). Both IETF and ITU-T’s study group 9
(Integrated Broadband Cable and Television Networks Study Group) are con-
sidering approval of the extensions of MGCP (MGCP v2, RFC 2705-bis, etc.).
MGCP is also evolving to ITU-T’s H.248 recommendation [6,7] and IETF’s
Megaco (media gateway control) protocol (RFCs 3054, 3015, and 2805).
SWITCH-BASED VERSUS SERVER-BASED VoIP
For switch-based VoIP services, interworking with the existing PSTN switches,
networks, and terminals is desirable. In such scenarios, H.225 and H.245 are
well-established signaling and media control protocols under the H.323
umbrella protocol. Note that H.323 defines IP-PSTN GWs, call controller or
GK, terminal equipment (TE), and multipoint control units (MCUs) as the
messages exchanged between the TE/GW and the GK. TEs can use RAS for
discovering a GK or to register/deregister with a GK. A GK uses the RAS
messages to monitor the endpoints within a zone and to manage the associated
resources.
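The RAS interactions described above can be sketched as a toy gatekeeper that
tracks which endpoints are registered in its zone. This is only an
illustration: the message abbreviations (GRQ/GCF, RRQ/RCF, URQ/UCF/URJ) follow
H.225 RAS, but the Gatekeeper class, its handle method, and the endpoint
names are hypothetical, and real RAS messages are ASN.1-encoded rather than
plain strings.

```python
class Gatekeeper:
    """Toy gatekeeper (hypothetical) tracking endpoints registered in a zone."""

    def __init__(self, zone):
        self.zone = zone
        self.registered = set()

    def handle(self, message, endpoint):
        """Return the RAS reply for a request sent by `endpoint`."""
        if message == "GRQ":      # Gatekeeper Request (discovery)
            return "GCF"          # Gatekeeper Confirm
        if message == "RRQ":      # Registration Request
            self.registered.add(endpoint)
            return "RCF"          # Registration Confirm
        if message == "URQ":      # Unregistration Request
            if endpoint in self.registered:
                self.registered.remove(endpoint)
                return "UCF"      # Unregistration Confirm
            return "URJ"          # Unregistration Reject
        raise ValueError(f"unsupported RAS message: {message}")


gk = Gatekeeper("zone-a")
# A terminal discovers the gatekeeper, registers, then deregisters.
replies = [gk.handle(m, "te-1") for m in ("GRQ", "RRQ", "URQ")]
print(replies)
```

Keeping the registration set inside the gatekeeper mirrors the idea that the
GK, not the endpoints, holds the view of the zone.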
H.245 defines in-band media and conference control protocols for call
parameter exchange and negotiation. These parameters include audiovisual
mode and channel, bit rate, data integrity, delay, and so on. They provide a
set of control functions for multiparty multimedia conferencing, and can also
determine the master/slave relationship between parties to open/close logical
channels between the endpoints. In Figure 2-7 I showed the functions and rel-
ative positions of H.225 and H.245 with reference to ISO’s open system inter-
connection (OSI) stack [1]. Figure 3-2 shows the protocol sequence for estab-
lishment of a real-time H.323 voice communication session from one PSTN
phone to another over an IP network. Note that in this diagram, ARQ stands
for Admission Request, ACF for Admission Confirm, LRQ for Location
Request, and LCF for Location Confirm. Ingress and egress gateways are
indicated by IGW and EGW, respectively. Ingress and egress gatekeepers are
indicated by IGK and EGK, respectively.
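The abbreviations above can be put in context with a small sketch of one
commonly described inter-zone ordering of these messages. It is not a
transcription of Figure 3-2: the step list, the Setup/Alerting/Connect call
signaling entries, and the describe helper are assumptions made for
illustration.

```python
# Expansions given in the text for the RAS abbreviations.
EXPANSIONS = {
    "ARQ": "Admission Request",
    "ACF": "Admission Confirm",
    "LRQ": "Location Request",
    "LCF": "Location Confirm",
}

# Assumed ordering (sender, receiver, message) for a gateway-to-gateway call.
STEPS = [
    ("IGW", "IGK", "ARQ"),       # ingress gateway asks permission to call
    ("IGK", "EGK", "LRQ"),       # ingress gatekeeper locates the far end
    ("EGK", "IGK", "LCF"),
    ("IGK", "IGW", "ACF"),
    ("IGW", "EGW", "Setup"),     # H.225 call signaling toward the egress GW
    ("EGW", "EGK", "ARQ"),       # egress gateway asks its own gatekeeper
    ("EGK", "EGW", "ACF"),
    ("EGW", "IGW", "Alerting"),
    ("EGW", "IGW", "Connect"),
]


def describe(steps):
    """Render each step as 'sender -> receiver: message (expansion)'."""
    lines = []
    for sender, receiver, message in steps:
        label = message
        if message in EXPANSIONS:
            label = f"{message} ({EXPANSIONS[message]})"
        lines.append(f"{sender} -> {receiver}: {label}")
    return lines


lines = describe(STEPS)
print("\n".join(lines))
```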
SESSION INITIATION PROTOCOL (SIP)
SIP (IETF’s RFC 3261) refers to a suite of call setup and media mapping
protocols for multimedia (including voice) communications over a wide area net-
Figure 3-1 Network elements and their interconnection using a LAN in an H.323 zone.
Note that the PBX (PSTN) is outside the scope of H.323 and is shown to demonstrate
the interoperability of H.323 with PSTN.
Figure 3-2 Message exchange for setting up an H.323-based VoIP session from one
PSTN phone to another over an IP network.
SIP architectural elements include (a) user agents (UA): client (UAC) or
server (UAS) and (b) network servers: redirect, proxy, or registrar. An end
device in SIP includes both a UAC and a UAS; hence, a call
participant (end device) may either generate or receive requests. SIP requests
can traverse many proxy servers. Each proxy server may receive a request and
then forward it to the next-hop server, which may be another proxy server or
the destination UA server. A SIP server may act as a redirect server as well. A
redirect server informs the client about the next-hop server so that the client
can contact it directly.
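The difference between proxying and redirecting can be sketched with two small
functions. This is a toy model under stated assumptions: the function names
and server names are hypothetical, and 302 (Moved Temporarily) is used as one
example of a SIP redirection response.

```python
def route_via_proxies(proxies, destination):
    """Hop-by-hop path of a request forwarded through a chain of proxies."""
    return ["UAC", *proxies, destination]


def route_via_redirect(redirect_server, next_hop):
    """Messages exchanged when a redirect server points the client onward."""
    return [
        ("UAC", redirect_server, "INVITE"),
        (redirect_server, "UAC", "302 Moved Temporarily"),  # names next hop
        ("UAC", redirect_server, "ACK"),
        ("UAC", next_hop, "INVITE"),   # client contacts the next hop directly
    ]


proxy_path = route_via_proxies(["proxy-1", "proxy-2"], "UAS")
redirect_msgs = route_via_redirect("redirect-1", "proxy-2")
print(" -> ".join(proxy_path))
for sender, receiver, message in redirect_msgs:
    print(f"{sender} -> {receiver}: {message}")
```

In the proxy case the request itself travels through every server; in the
redirect case the server only answers, and the client sends a fresh request.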
Figure 3-3 shows the message exchange for a SIP-based call setup. Note that
the number of messages that need to be exchanged to set up a SIP session is
smaller than that for an H.323 session (Fig. 3-2). As of 2001, both software-
based (running in a PC) and hardware-based SIP and IP phones were available.
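A basic SIP call setup between two user agents can be sketched as a short
message list. This is not a transcription of Figure 3-3; it assumes the
simplest case with no proxies, and the session_established helper is a
hypothetical check written for this illustration.

```python
# Assumed minimal exchange (sender, message) between caller and callee.
CALL_SETUP = [
    ("UAC", "INVITE"),
    ("UAS", "180 Ringing"),
    ("UAS", "200 OK"),
    ("UAC", "ACK"),
]


def session_established(messages):
    """True if INVITE, 200 OK, and ACK occur in that order."""
    remaining = iter([message for _, message in messages])
    # `wanted in remaining` consumes the iterator, so order is enforced.
    return all(wanted in remaining for wanted in ("INVITE", "200 OK", "ACK"))


print(len(CALL_SETUP), session_established(CALL_SETUP))
```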
For call routing over a large IP network, SIP may use the TRIP (telephony
routing over IP, a work-in-progress in IETF’s IPTel WG, RFC 2871) protocol
to locate the server to which a call should be routed. For routing a call to a
PSTN terminal (POTS phone), it may be necessary to use the ENUM (electronic
numbering, RFC 2916) protocol. ENUM converts an E.164 telephony address to an
IP address (using an enhanced DNS server) and vice versa.
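The first step of an ENUM lookup, turning an E.164 number into a DNS domain
name, can be sketched as follows. Under RFC 2916 the digits are reversed,
separated by dots, and placed under e164.arpa. The function name and the
sample number are hypothetical, and the DNS query itself is not performed
here.

```python
def e164_to_enum_domain(number, suffix="e164.arpa"):
    """Map an E.164 number such as '+1-555-123-4567' to its ENUM domain."""
    digits = [ch for ch in number if ch in "0123456789"]
    if not digits:
        raise ValueError("no digits found in E.164 number")
    # Reverse the digits so the most specific part comes first, as in DNS.
    return ".".join(reversed(digits)) + "." + suffix


domain = e164_to_enum_domain("+1-555-123-4567")
print(domain)
```

A resolver would then query this domain to learn how the number can be
reached over IP.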