Additionally, programmable transmit and receive gain, silence compression, and packet-loss compensation algorithms are required. Independent dynamic speech-coder selection per channel increases system flexibility.
Typically managed by a host processor, the DSP system acts as a bidirectional gateway between a telephony interface such as PCM and a digital network. After the signal from the PCM interface is processed by the echo canceller, a voice/fax classifier forwards it to the appropriate software module for further compression (Fig. 3). Fax channels are demodulated to extract the payload, which is forwarded to the packet network as a bit stream. Voice channels are compressed by one of the speech coder modules. Intervals of silence are compressed at a very high ratio for optimal bandwidth utilization. A DTMF relay preserves any tone signaling superimposed on the voice.
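The transmit path described above can be sketched as a simple per-frame dispatch. This is a minimal illustration, not the gateway's actual software: the function names, the `SILENCE_THRESHOLD` value, and the placeholder coder and demodulator are all hypothetical, the classifier decision is passed in rather than computed, and the echo canceller and DTMF relay are omitted.

```python
from enum import Enum, auto


class Kind(Enum):
    VOICE = auto()
    FAX = auto()


# Hypothetical threshold on the mean absolute amplitude of 16-bit PCM samples.
SILENCE_THRESHOLD = 100


def is_silence(frame):
    """Crude level check; real silence detectors are far more elaborate."""
    if not frame:
        return True
    return sum(abs(s) for s in frame) / len(frame) < SILENCE_THRESHOLD


def demodulate_fax(frame):
    """Placeholder for the fax demodulator: returns an opaque bit stream."""
    return bytes(s & 0xFF for s in frame)


def encode_speech(frame):
    """Placeholder for a speech coder module (no real vocoder is implemented)."""
    return bytes((s >> 8) & 0xFF for s in frame)


def transmit(frame, kind):
    """Route one echo-cancelled PCM frame to a (packet_type, payload) pair."""
    if kind is Kind.FAX:
        return ("fax", demodulate_fax(frame))
    if is_silence(frame):
        # Silence is sent as an empty descriptor instead of coded speech.
        return ("silence", b"")
    return ("voice", encode_speech(frame))
```

In this sketch a silent voice frame costs no payload bytes at all, which is the bandwidth saving that silence compression is after.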
Simultaneously, data from the host interface is processed to reconstruct the original signal (Fig. 4). Fax channels are modulated before being relayed to the PCM interface. Voice channels are decoded, and the silence intervals are filled in by the comfort noise generator. A bad-frame handler compensates for lost voice packets to minimize the disturbance at the receiving end.
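As a rough illustration of the receive side, the sketch below handles the three voice-channel cases mentioned above: a normal frame, a silence interval, and a lost packet. It is a toy model under stated assumptions. The class name, the `attenuation` and `noise_level` parameters, and the repeat-and-fade concealment strategy are hypothetical choices, not the method used by any particular vocoder, and decoding is replaced by passing in already-decoded samples.

```python
import random


class ReceivePath:
    """Toy decoder-side handling of voice, silence, and lost packets."""

    def __init__(self, frame_len=80, attenuation=0.5, noise_level=20, seed=0):
        self.frame_len = frame_len
        self.attenuation = attenuation  # hypothetical fade factor per lost frame
        self.noise_level = noise_level  # hypothetical comfort-noise amplitude
        self.last_frame = [0] * frame_len
        self.rng = random.Random(seed)

    def receive(self, packet_type, samples=None):
        if packet_type == "voice":
            # Stand-in for speech decoding: samples are assumed already decoded.
            self.last_frame = list(samples)
        elif packet_type == "silence":
            # Comfort noise: low-level random samples instead of dead silence.
            self.last_frame = [
                self.rng.randint(-self.noise_level, self.noise_level)
                for _ in range(self.frame_len)
            ]
        elif packet_type == "lost":
            # Bad-frame handling: repeat the previous frame, faded out.
            self.last_frame = [int(s * self.attenuation) for s in self.last_frame]
        else:
            raise ValueError(f"unknown packet type: {packet_type}")
        return self.last_frame
```

Repeating the previous frame at reduced level means a single lost packet produces a brief fade rather than a click, and consecutive losses decay toward silence.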
Sophisticated vocoders such as G.723.1 and G.729 are complex to implement. Unlike modems, which are built on filtering and correlation, these highly irregular DSP algorithms contain many vocoder-specific routines: stochastic codebook search, pitch prediction, parameter estimation, vector quantization, and others. Such routines mix control code with heavy mathematical computation.
Thanks to its configurable long-instruction-word (CLIW) architecture, conditional execution, and the memory destination orientation (nonload-store), the Carmel DSP core is an ideal candidate for a VoIP Gateway DSP engine (see "DSP Core's CLIW Breeds VLIW Performance," below). The CLIW architecture offers the right balance between scalar/superscalar DSPs that produce good code size (but moderate computational power) and the very-long-instruction-word (VLIW) architectures that provide good computational power (but inefficient code size).
With the CLIW, software can be designed so that control code and register initialization are performed with regular instructions (good code density), while the inner loops are written as long instructions, as in a VLIW, to minimize the MIPS count. Together with conditional execution, the CLIW offers the most powerful architecture for vocoder implementation.
For example, the code listing shows a G.729a stochastic codebook inner loop implementation on the Carmel (three cycles), compared to a more conventional DSP core (14 to 16 cycles). Note that this loop consumes 10% to 20% of the overall G.729a computational load, while it is less than 1% of the vocoder's code. The algorithm processing load on the Carmel DSP is:
| Algorithm | Processing load |
| --- | --- |
| G.711 packetized PCM | 0.2 MIPS |
| G.723.1 vocoder | 7.5 MIPS |
| G.729a vocoder | 5.0 MIPS |
| G.726/727 vocoder | 5.0 MIPS |
| V.17 G3 fax relay | 6.5 MIPS |
| V.32bis modem relay | 9.0 MIPS |
| Line echo canceller | 1.5 MIPS |
| DTMF relay | 0.3 MIPS |
| Voice/fax/data classifier | 0.4 MIPS |
| Real-time scheduler | 0.5 MIPS overhead |
| Maximum load/channel | 10.4 MIPS |
This VoIP system design, based on the Carmel DSP core, supports a multichannel fax and voice-over-packet application. Optimized for maximum channel density, it can be easily scaled to a wide variety of solutions. Running at 250 MHz, the core can handle the 24 channels of a full T1 VoIP gateway. The DSP MIPS requirements for the Carmel-based implementation are nearly 50% lower than those of other advanced DSPs.
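The channel-density claim can be checked with a little arithmetic. The sketch below assumes that a 250-MHz core provides roughly 250 MIPS, which the text does not state. It also notes that the table rows for G.711, G.723.1, the echo canceller, the DTMF relay, the classifier, and the scheduler happen to sum to the quoted 10.4 MIPS. Whether the maximum was actually derived that way is an assumption. Loads are kept in tenths of a MIPS so the sums are exact.

```python
# Per-channel loads from the table, in tenths of a MIPS.
# This particular subset is an assumption; it reproduces the 10.4-MIPS row.
LOAD_TENTHS = {
    "G.711 packetized PCM": 2,
    "G.723.1 vocoder": 75,
    "Line echo canceller": 15,
    "DTMF relay": 3,
    "Voice/fax/data classifier": 4,
    "Real-time scheduler": 5,
}

MAX_LOAD_PER_CHANNEL_TENTHS = 104  # "Maximum load/channel" row: 10.4 MIPS
T1_CHANNELS = 24
CORE_BUDGET_TENTHS = 2500  # assumption: 250 MHz is treated as 250 MIPS


def total_load_tenths(channels, per_channel_tenths):
    """Worst-case load when every channel runs at the per-channel maximum."""
    return channels * per_channel_tenths


subset_sum = sum(LOAD_TENTHS.values())
total = total_load_tenths(T1_CHANNELS, MAX_LOAD_PER_CHANNEL_TENTHS)
headroom = CORE_BUDGET_TENTHS - total

print(f"subset sum per channel: {subset_sum / 10} MIPS")
print(f"{T1_CHANNELS} channels: {total / 10} MIPS, headroom {headroom / 10} MIPS")
```

Under these assumptions, 24 channels at 10.4 MIPS each come to 249.6 MIPS, which just fits within the 250-MIPS budget.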