Some fascinating developments have reverberated within the world of audio this year (see "Multichannel Audio, DDS Keep DACs Humming" at www.electronicdesign.com, ED Online 11763). For one thing, the trend toward more channels in home-entertainment systems has somewhat reversed itself, thanks to ever-more sophisticated algorithms for "virtualizing" speakers.
A certain "spousal acceptance factor" worked against installing all of the
speakers that came with a 5.1 or 7.1 system. Lately, it's become possible to
create the impression that the audience is hearing sound from speakers that
aren't really there—hence the "virtual" moniker and a cessation of hostilities
between spouses across the land.
Also, interesting things happen when personal media players encounter the limited
bandwidths associated with wireless. At least one vendor is exploring what happens
when a cell phone becomes a media player. Not only that, cultural factors are
driving companies to serve unique submarkets around the globe.
For example, like most North Americans past a certain age, I generally keep my cell-phone ringer set to "tingle" mode. Yet younger cultures on this continent, along with other cultures elsewhere, venerate their ring tones and want everyone around them to hear when they've downloaded the latest. That leads to the need for a separate, relatively high-fidelity Class D audio amplifier just for the phone's ringtone speaker.
Finally, a number of recent chips spanning the audio signal chain embody ideas or technical specs that just weren't available last January.
ALGORITHMS GALORE
Let's start by looking at that intersection of personal audio with wireless.
The latest standard is MPEG-4 aacPlus, also known as HE AAC, or just AAC Plus.
It combines the Advanced Audio Coding (AAC) standard that Apple uses in iPods
with Coding Technologies' Spectral Band Replication (SBR) and Parametric Stereo
(PS) technologies.
SBR and PS allow much lower bit rates than AAC alone (Fig. 1). That's
particularly important in wireless applications, where streaming
audio shares the bandwidth with two-way communication modes. But even high-speed
landline connections can't guarantee long-term bit-rate consistency.
This has led to the use of "perceptual codecs," which use psychoacoustic algorithms to decide what information to leave out of the data stream. MP3 is a perceptual codec. But get down below 128 kbits/s (about 12:1 compression), and the music starts to sound funny.
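To put those bit rates in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the uncompressed reference is CD-style PCM (44.1 kHz, 16 bits, two channels), which the text above doesn't state; the function names are purely illustrative. Under that assumption, 128 kbits/s works out to roughly 11:1, in the same range as the figure quoted above.

```python
# Assumed uncompressed reference: CD-style PCM (not stated in the article).
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2


def pcm_bit_rate_kbps(sample_rate_hz=SAMPLE_RATE_HZ,
                      bits=BITS_PER_SAMPLE,
                      channels=CHANNELS):
    """Bit rate of uncompressed PCM audio in kbits/s."""
    return sample_rate_hz * bits * channels / 1000


def compression_ratio(coded_kbps, pcm_kbps=None):
    """How many times smaller the coded stream is than the PCM reference."""
    if pcm_kbps is None:
        pcm_kbps = pcm_bit_rate_kbps()
    return pcm_kbps / coded_kbps


# Bit rates mentioned in this article.
for rate in (128, 100, 64):
    print(f"{rate} kbits/s -> about {compression_ratio(rate):.1f}:1")
```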
Standardized in ISO/IEC 14496-3:2001/Amd.1:2003, SBR makes it possible to either
increase the audio bandwidth at a given bit rate (for music) or improve coding
efficiency at a given quality level (for speech). SBR is mainly a post-process,
though some pre-processing is performed in the encoder to guide the decoding
process.
In practice, the underlying coder handles the lower part of the spectrum, and the SBR decoder reconstructs the higher frequencies from an analysis of the lower frequencies. The encoder steers that reconstruction with guidance information, which it inserts into the encoded bitstream at a very low data rate.
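The division of labor can be illustrated with a toy sketch in Python. This is not the standardized SBR algorithm, which works on filterbank subbands with considerably more machinery. It only assumes a list of spectral magnitudes, a hypothetical crossover point, and one energy value per high band as the "guidance." All names here are made up for illustration.

```python
import math


def band_energy(bins):
    """Sum of squared magnitudes in a group of spectral bins."""
    return sum(m * m for m in bins)


def encode_guidance(spectrum, crossover, band_size):
    """Encoder side: summarize the high band as one energy per band.

    Only these few numbers (not the high-band bins themselves) would
    travel alongside the core coder's low-band data.
    """
    high = spectrum[crossover:]
    return [band_energy(high[i:i + band_size])
            for i in range(0, len(high), band_size)]


def decode_high_band(low, guidance, band_size):
    """Decoder side: copy low-band bins upward, then rescale each band
    so its energy matches the value the encoder sent."""
    high = []
    for i, target in enumerate(guidance):
        # Reuse low-band bins in order, wrapping around if the high
        # band is wider than the low band (a toy stand-in for SBR's
        # transposition step).
        patch = [low[(i * band_size + k) % len(low)]
                 for k in range(band_size)]
        energy = band_energy(patch)
        gain = math.sqrt(target / energy) if energy > 0 else 0.0
        high.extend(m * gain for m in patch)
    return high


# Illustrative magnitudes: four low-band bins, four high-band bins.
spectrum = [8.0, 6.0, 5.0, 4.0, 2.0, 1.5, 1.0, 0.5]
crossover, band_size = 4, 2

low = spectrum[:crossover]                      # handled by the core coder
guidance = encode_guidance(spectrum, crossover, band_size)
rebuilt = low + decode_high_band(low, guidance, band_size)
print(guidance)
print(rebuilt)
```

The rebuilt high band matches the original band energies but not necessarily the fine detail, which is the trade the technique makes for a lower bit rate.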
SBR can enhance the efficiency of perceptual audio codecs by about 30%. When SBR is used with MP3, stereo at 64 kbits/s is said to sound as good as conventional MP3 does at 100 kbits/s or higher.
In aacPlus, PS takes SBR a step further. The PS encoder extracts a parametric
representation of the stereo image of an audio signal and transmits a portion
of it along with the monaural signal in the bit stream. Based on the parametric
stereo information, the decoder then can regenerate the stereo image.
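A much-simplified Python sketch of the idea follows. It assumes a single parameter per frame, the left/right level difference in dB, whereas real PS works per frequency band and carries further cues such as inter-channel correlation. The function names and sample values are hypothetical.

```python
import math


def ps_encode(left, right):
    """Downmix one frame to mono and extract a single stereo parameter:
    the left/right level difference in dB."""
    mono = [(l + r) / 2 for l, r in zip(left, right)]
    e_left = sum(l * l for l in left)
    e_right = sum(r * r for r in right)
    eps = 1e-12  # avoids log(0) and division by zero on silent frames
    level_diff_db = 10 * math.log10((e_left + eps) / (e_right + eps))
    return mono, level_diff_db


def ps_decode(mono, level_diff_db):
    """Regenerate a stereo frame from the mono signal and the parameter."""
    ratio = 10 ** (level_diff_db / 20)  # left/right amplitude ratio
    # Gains chosen so that g_left / g_right == ratio and
    # g_left + g_right == 2, which keeps (L + R) / 2 equal to the mono.
    g_right = 2 / (1 + ratio)
    g_left = ratio * g_right
    return [m * g_left for m in mono], [m * g_right for m in mono]


# Illustrative frame: the right channel is a quieter copy of the left.
left = [0.8, -0.4, 0.2]
right = [0.4, -0.2, 0.1]

mono, diff_db = ps_encode(left, right)
left_out, right_out = ps_decode(mono, diff_db)
print(round(diff_db, 2), left_out, right_out)
```

In this toy case the decoder recovers the original channels closely because they differ only in level. Anything a single level parameter can't describe, such as phase or timing differences, is lost, which is why the real scheme sends a richer parameter set.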