IP packets, and consequently VoIP packets, can be secured using IPsec. IPsec has two modes: transport mode and tunnel mode. In either mode, packets can be protected by the Authentication Header (AH), the Encapsulating Security Payload (ESP) header, or both. Tunnel mode generates an additional IP header, which allows IPsec to be used for virtual private networks (VPNs). IP packets or data-link frames can also be tunneled over a variety of other protocols; the examples discussed in this section are L2TP, GRE, MPLS, and PPPoE.

Tunneling protocols and IPsec share some characteristics. They all encapsulate the original packet or frame in another protocol, and the added header increases the size of the original packet, which increases bandwidth needs. The extra bandwidth can be critical for voice, because voice packets are small and are sent at a high rate. The larger the additional headers, the more extra bandwidth VoIP packets need.
IPsec and tunneling protocols add headers of different sizes. IPsec overhead depends on which headers are used (AH, ESP, or both), on the encryption and authentication algorithms used in these headers, and on the IPsec mode (transport or tunnel). Because AH supports only authentication while ESP supports both authentication and encryption, ESP is used more often. With Data Encryption Standard (DES) or Triple DES (3DES) for encryption and Message Digest 5 (MD5) or Secure Hash Algorithm 1 (SHA-1) for authentication, the ESP header adds 30 to 37 bytes in transport mode. With Advanced Encryption Standard (AES) for encryption and AES-extended cipher block chaining (AES-XCBC) for authentication, ESP adds 38 to 53 bytes in transport mode. ESP with DES or 3DES requires the payload to be rounded up to a multiple of 8 bytes (0 to 7 bytes of padding), while ESP with AES requires the payload to be rounded up to a multiple of 16 bytes (0 to 15 bytes of padding). Tunnel mode adds another 20 bytes for the additional IP header.

L2TP or GRE adds 24 bytes to the original PPP frame, MPLS adds 4 bytes to the original IP packet, and PPPoE adds an extra 8-byte PPPoE header between the Ethernet frame and the IP packet. The accompanying figure summarizes header sizes for different protocols.

The example figure shows a company with two sites. The headquarters site is separated from the branch site by an untrusted network, and the routers that connect the sites over this network use IPsec. IPsec ESP tunnel mode is used with 3DES encryption and SHA-1 authentication. The IP phones at each site use the G.729 codec with a default packetization period of 20 ms. RTP header compression is not enabled. During a voice call between the headquarters and the branch site, every 20 ms the IP phones encapsulate 20 bytes of digitized voice into RTP, UDP, and IP, resulting in IP packets of 60 bytes.
When the routers send these VoIP packets to the untrusted network, they encapsulate each packet in another IP header and protect it with an ESP header. This process adds 54 bytes to each original VoIP packet: 20 bytes for the extra IP header, 4 bytes of padding to reach a payload size of 64 bytes, and 30 bytes for the ESP header. The resulting IPsec packet is 114 bytes, almost twice the size of the original 60-byte VoIP packet.
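The arithmetic in this example can be checked with a short sketch. It assumes the simplified model used in the text: the original packet is padded to a multiple of the cipher block size (8 bytes for 3DES), then a 30-byte ESP header and a 20-byte outer IP header are added. The function names are hypothetical and chosen only for illustration, and the bandwidth figures cover one direction of a call at Layer 3 only, without any data-link overhead.

```python
def esp_tunnel_packet_size(inner_packet_bytes, block_bytes=8,
                           esp_header_bytes=30, outer_ip_bytes=20):
    """Return (padding, total size) in bytes for an ESP tunnel-mode packet.

    Simplified model taken from the text: pad the inner packet up to a
    multiple of the cipher block size, then add the ESP header and the
    additional (outer) IP header.
    """
    padding = (-inner_packet_bytes) % block_bytes
    total = inner_packet_bytes + padding + esp_header_bytes + outer_ip_bytes
    return padding, total


def bandwidth_kbps(packet_bytes, packetization_period_ms):
    """Bandwidth in kbps for one direction of a call, excluding Layer 2."""
    packets_per_second = 1000 / packetization_period_ms
    return packet_bytes * 8 * packets_per_second / 1000


# 20 bytes of G.729 voice plus 40 bytes of IP, UDP, and RTP headers.
voip_packet = 20 + 40
padding, ipsec_packet = esp_tunnel_packet_size(voip_packet)

print(padding, ipsec_packet)               # 4 114
print(bandwidth_kbps(voip_packet, 20))     # 24.0
print(bandwidth_kbps(ipsec_packet, 20))    # 45.6
```

Under these assumptions, the call grows from 24 kbps to 45.6 kbps per direction once IPsec is applied, before any data-link overhead is counted.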
When you design networks for VoIP, it is crucial to know the total bandwidth of a VoIP call that takes place across a link. This information is needed to determine the capacity of physical links and to properly deploy call admission control (CAC) and quality of service (QoS). CAC limits the number of concurrent voice calls, which avoids oversubscription of the link and the quality degradation that oversubscription causes. QoS gives priority to voice packets, avoiding the excessive queuing delays that also degrade voice quality. The accompanying figure represents a simple VoIP network.

To calculate the total bandwidth of a VoIP call, perform these steps, which are summarized in the figure:

Step 1 Gather required packetization information: First, determine the bandwidth of the codec that is used to digitize the analog signals. The codec bandwidth is specified in kilobits per second and is usually in the range of approximately 8 to 64 kbps. You also need the packetization period (specified in milliseconds) or the packetization size (specified in bytes). If you have the codec bandwidth and one of these two values, you can calculate the remaining value.

Step 2 Gather required information about the link: Next, determine the amount of overhead that is added per packet on each link. The overhead depends on whether compressed RTP (cRTP) is used, which data-link protocol is in use, and how much overhead that data-link protocol adds per packet. IP, UDP, and RTP overhead is 40 bytes unless cRTP is used. If cRTP is used, the overhead is 2 bytes (the default) or 4 bytes. Make sure to include the overhead (in bytes) of the data-link protocol that is used. Finally, you must know whether any other features that cause additional overhead are in use and how much overhead those features add. Such features include VLANs, security features such as IPsec, and any special tunneling applications.
Step 3 Calculate the packetization size or period: Depending on the voice device, you might know either the packetization period or the packetization size (determined in Step 1). Calculate the missing value from the known value and the codec bandwidth, also noted in Step 1. The packetization size is expressed in bytes; the packetization period is expressed in milliseconds.

Step 4 Add together the packetization size and all headers and trailers: Add the overhead of IP, UDP, and RTP (or cRTP), the data-link protocol, and any other protocols that you noted in Step 2 to the voice payload (packetization size), which you determined in either Step 1 or Step 3. All values must be in bytes.

Step 5 Calculate the packet rate: Calculate how many packets are sent per second by taking the reciprocal of the packetization period. Because the packet rate is specified in packets per second, make sure to convert the packetization period from milliseconds to seconds.

Step 6 Calculate the total bandwidth: Multiply the total size of the packet or frame by the packet rate to calculate the total bandwidth. Because the packet size is specified in bytes and the bandwidth is specified in kilobits per second, you need to convert bytes to kilobits.

Based on this procedure, you can calculate the bandwidth that is used by a VoIP call on a specific link. For planning the capacity of physical links, consider the maximum number of calls that are likely to be made at once and the bandwidth that is needed for applications other than VoIP. In addition, you must ensure that enough bandwidth is available for call setup and call teardown signaling. Although signaling messages need relatively little bandwidth, you should not forget to provision the bandwidth for signaling protocols (especially in your QoS