the mu-law method of quantization that is used in the digital telecommunication systems of North America and Japan. A similar method of quantization was adopted as the a-law algorithm for use in Europe and much of the rest of the world. Following the idea of using smaller quantization steps at lower amplitudes than at higher amplitudes, mu-law and a-law provide a quasi-logarithmic scale. The voltage range has 16 segments (segments 0 to 7 positive and segments 0 to 7 negative). Each segment has 16 steps, for a total of 256 points on the scale. Starting with segment 0, which is closest to zero amplitude, the segments grow larger toward the maximum amplitudes, and the size of the steps increases with them. Within a segment, however, the steps are uniform in size. The result of using mu-law and a-law is a more accurate value for smaller amplitudes and a more uniform signal-to-quantization-noise ratio (SQR) across the input range. The ITU Telecommunication Standardization Sector (ITU-T) specifies both a-law and mu-law companding in Recommendation G.711.
Note
By convention, when the PSTN communicates between a mu-law country and an a-law country, the mu-law country must change its signaling to accommodate the a-law country.
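The segment-and-step structure described above can be sketched in Python. This is a minimal illustration in the style of common software mu-law encoders, assuming signed 16-bit linear input; the names `mulaw_encode`, `mulaw_decode`, `BIAS`, and `CLIP` are assumptions of this sketch and do not come from the text. Each 8-bit code word holds 1 sign bit, 3 segment bits (segments 0 to 7), and 4 step bits (16 steps per segment), which gives the 256 points on the scale.

```python
BIAS = 0x84   # 132: offset added before the segment search (assumed 16-bit formulation)
CLIP = 32635  # largest magnitude that still fits in 15 bits after BIAS is added


def mulaw_encode(sample: int) -> int:
    """Encode one signed 16-bit linear sample as an 8-bit mu-law code word."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # The position of the highest set bit selects the segment (0 = quietest).
    segment = magnitude.bit_length() - 8
    # The next four bits select one of 16 equal steps inside that segment.
    step = (magnitude >> (segment + 3)) & 0x0F
    # mu-law code words are transmitted with all bits inverted.
    return ~(sign | (segment << 4) | step) & 0xFF


def mulaw_decode(code: int) -> int:
    """Expand an 8-bit mu-law code word back to a 16-bit linear value."""
    code = ~code & 0xFF
    sign = code & 0x80
    segment = (code >> 4) & 0x07
    step = code & 0x0F
    magnitude = (((step << 3) + BIAS) << segment) - BIAS
    return -magnitude if sign else magnitude


if __name__ == "__main__":
    for value in (0, 10, 1000, 20000, -20000):
        code = mulaw_encode(value)
        print(value, hex(code), mulaw_decode(code))
```

In this formulation the step size is 8 in segment 0 and doubles with each segment up to 1024 in segment 7, so quiet samples are reproduced with a much smaller absolute error than loud ones.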
2.2 Digitizing and Packetizing Voice
2.2.7 Common Voice Codec Characteristics
Data compression reduces the size of data so that it requires less bandwidth on transmission channels. Most compression schemes take advantage of the fact that data streams contain a lot of repetition. For example, while a 7-bit ASCII code represents alphanumeric characters, a compression scheme can use a 3-bit code to represent the eight most common letters. In voice, long stretches of silence can be replaced by a value that indicates how long the silence lasts. Similarly, in graphic compression techniques, a value can replace white space in an image by indicating the amount of white space that is replaced.
Early Plain Old Telephone Service (POTS) worked wholly on an analog infrastructure. Long-distance calling was challenging, primarily because of signal attenuation and line noise. Periodic amplification solved the attenuation problem to some extent but also amplified the noise. When telephone companies converted their trunk lines to digital and used pulse code modulation (PCM) to digitize the signals, these problems virtually disappeared.
The basic PCM technique is to use a coder-decoder (codec) to sample the amplitude of a voice signal 8000 times per second and then store each amplitude value as 8 bits of data. The resulting bit rate is:
8000 samples/second × 8 bits/sample = 64,000 bits/second
This rate is the basis for the entire telephone system digital hierarchy.
Differential (or delta) pulse code modulation (DPCM) encodes the PCM values as differences between the current value and the previous value. For audio, this type of encoding reduces the number of bits required per sample by about 25 percent compared to PCM. Adaptive DPCM (ADPCM) is a variant of DPCM that varies the size of the quantization step to allow further reduction of the required bandwidth for a given signal-to-noise ratio. To a large extent, ADPCM has replaced PCM.
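The PCM bit-rate arithmetic and the idea behind DPCM can be sketched in a few lines of Python. This is a simplified illustration, not a telephony-grade codec: it stores exact differences and omits the quantization and prediction stages that real DPCM and ADPCM coders use. The function names are assumptions of this sketch.

```python
SAMPLE_RATE_HZ = 8000   # samples per second, as stated in the text
BITS_PER_SAMPLE = 8     # bits stored per PCM sample, as stated in the text


def pcm_bit_rate(sample_rate_hz: int = SAMPLE_RATE_HZ,
                 bits_per_sample: int = BITS_PER_SAMPLE) -> int:
    """Return the PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample


def dpcm_encode(samples: list[int]) -> list[int]:
    """Replace each sample with its difference from the previous sample."""
    previous = 0
    differences = []
    for sample in samples:
        differences.append(sample - previous)
        previous = sample
    return differences


def dpcm_decode(differences: list[int]) -> list[int]:
    """Rebuild the samples by accumulating the differences."""
    previous = 0
    samples = []
    for difference in differences:
        previous += difference
        samples.append(previous)
    return samples


if __name__ == "__main__":
    print(pcm_bit_rate())                 # 64000
    pcm = [100, 102, 105, 103, 103]
    diffs = dpcm_encode(pcm)
    print(diffs)                          # [100, 2, 3, -2, 0]
    print(dpcm_decode(diffs) == pcm)      # True
```

Because neighboring voice samples tend to be close in value, the differences usually span a smaller range than the samples themselves, which is why they can be coded with fewer bits.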
ADPCM uses an encoding technique that reduces the data required to store each sample by transmitting only the difference between one sample and the next. An adaptive predictor algorithm predicts how the signal will change, and the prediction is usually very accurate. As samples vary, the predictor rapidly adapts to the changes. At 32 kbps per channel, ADPCM provides 48 voice channels on a T1 line, twice the 24 channels available with 64-kbps PCM. This benefits customers who use such lines to interconnect their remote offices or to connect their internal phone systems to the phone company switches.
The table in Figure shows the most popular coding techniques, by bit rate, that are standardized for telephony by the ITU-T in its G-series recommendations:
2.2.8 Selecting a Codec Using the Mean Opinion Score
The most important characteristics of a codec are the bandwidth that it requires and the quality degradation caused by the analog-to-digital conversion and compression. The mean opinion score (MOS) is a system for grading the voice quality of a telephone connection that uses a codec. The MOS is a statistical measurement of voice quality that is based on the judgment of several subscribers. Because MOS is graded by humans, the score is subjective. MOS uses a scale of 1 (bad) to 5 (excellent). An MOS of 5 represents direct conversation. The table in Figure provides a description of the five ratings. The table in Figure shows the MOS for the most popular coding techniques.
Note
A newer, more objective measurement is quickly overtaking MOS as the industry quality measurement of choice for coding algorithms. Perceptual Speech Quality Measurement (PSQM), defined in ITU-T Recommendation P.861, provides a rating on a scale of 0 to 6.5, where 0 is best and 6.5 is worst. Many vendors implement PSQM in test equipment and monitoring systems. Some PSQM test equipment converts the 0-to-6.5 scale to a 0-to-5 scale to correlate with MOS. PSQM works by comparing the transmitted speech to the original input and yielding a score. Test equipment from various vendors can now provide a PSQM score for a test voice call over a particular packet network.
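Because the MOS described in this section is an average of individual listener ratings on the 1-to-5 scale, the calculation itself is simple. The following Python sketch shows it under that assumption; the function name and the sample ratings are hypothetical and are not taken from any measurement in the text.

```python
def mean_opinion_score(ratings: list[int]) -> float:
    """Average a list of listener ratings, each on the 1 (bad) to 5 (excellent) scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    for rating in ratings:
        if not 1 <= rating <= 5:
            raise ValueError(f"rating {rating} is outside the 1-to-5 scale")
    return sum(ratings) / len(ratings)


if __name__ == "__main__":
    # Hypothetical ratings from four listeners for one test call.
    print(mean_opinion_score([4, 4, 5, 3]))  # 4.0
```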
2.2.9 A Closer Look at a DSP
Figure shows a typical DSP module that might be used in a Cisco voice-enabled router. A digital signal processor (DSP) is a specialized processor used for telephony applications: