Over the years, we’ve seen tremendous improvements to the process of capturing sound in digital form and then playing it back for human ears. However, lossy compression, which cuts file sizes and ultimately makes music portable, means the lower quality of MP3 audio has become accepted as the norm by the millions who grew up listening to it.
While CDs, with their “standard-definition” audio, offer better sound quality than MP3s, we’re now seeing the development of high-resolution audio, which is claimed to deliver better quality than CDs and to give music more room to breathe.
Regardless of what path high-resolution audio takes, more of us could soon be switching from MP3 to the middle ground of standard-definition/CD-quality audio, once it’s able to match the convenience of MP3s.
Introducing High-Resolution Audio
Ever since the rise of CDs in the 1990s, people have disagreed over whether digital media can genuinely capture the key qualities of live music. Much of the criticism centers on lossy files, which have been compressed to enable faster downloads.
This is why some in the music industry are keen to make high-resolution audio a genuine option for portable listening. While this would be a significant improvement over lossy formats, it isn’t necessarily the best use of the industry’s effort. Given that many people don’t even feel the need to buy music at CD quality (standard definition), is better-than-CD-quality audio a worthwhile investment for the industry?
Understanding Audio Quality
It’s important we understand some key things about audio quality. First, there’s sample rate and bit depth (Fig. 1). The former is the number of times per second that an analog waveform gets captured (sampled) during recording. A sample captures the loudness (amplitude) of the wave. Bit depth, meanwhile, is the size of each sample in bits—the more bits per sample, the more detailed it is. Both are determined when you first record the audio. CDs currently have a 44.1-kHz sample rate and 16-bit bit depth.
1. The red line is the analog signal, and the blue dots indicate the samples captured by a digital recorder. The bit depth is shown vertically and the sample rate horizontally. (Source: Wikipedia.org)
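To make the two axes of Fig. 1 concrete, here is a minimal Python sketch of sampling and quantizing. It is an illustration only: the function name, the 1-kHz test tone, and the simple rounding scheme are assumptions made for this example, not a description of how any particular recorder works.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, bit_depth, n_samples):
    """Sample a unit-amplitude sine wave and round each sample to a signed integer code."""
    # Largest positive code for a signed sample, e.g. 32767 for 16 bits.
    max_code = 2 ** (bit_depth - 1) - 1
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz                         # sample rate sets the horizontal spacing
        amplitude = math.sin(2 * math.pi * freq_hz * t)
        codes.append(round(amplitude * max_code))      # bit depth sets the vertical resolution
    return codes

# Hypothetical example: the first eight samples of a 1-kHz tone at CD settings.
codes = sample_and_quantize(1000, 44_100, 16, 8)
print(codes)
```

Lowering `bit_depth` in this sketch leaves fewer distinct values for each sample, while lowering `sample_rate_hz` spaces the samples further apart in time.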
The other area to understand is compression, which is about bit rate. Bit rate is the number of bits written to the audio file for every second of sound. A file’s native (uncompressed) bit rate is the product of its sample rate, its bit depth, and its number of audio channels (stereo being two channels). However, a file’s actual bit rate depends on the way it is compressed and encoded.
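The native bit rate is simple arithmetic, sketched below in Python. The function name is made up for this example, and the calculation covers uncompressed audio only. The 192-kHz, 24-bit case is included because that format comes up later in the article.

```python
def native_bit_rate(sample_rate_hz, bit_depth, channels=2):
    """Uncompressed bit rate in bits per second."""
    return sample_rate_hz * bit_depth * channels

cd_bps = native_bit_rate(44_100, 16)        # CD quality, stereo
hi_res_bps = native_bit_rate(192_000, 24)   # 192-kHz, 24-bit, stereo

print(f"CD quality:      {cd_bps / 1000:.1f} kb/s")
print(f"192 kHz, 24-bit: {hi_res_bps / 1000:.1f} kb/s")
```

Dividing the result by 8 gives bytes per second, which is the figure that matters when estimating file sizes.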
Audio Compression and Common File Formats
Audio file compression is commonplace, primarily to enable large amounts of music to fit on portable devices. Digital music usually starts life as a large WAVE file with a 1411.2-kb/s bit rate and gets shrunk down to MP3 format (maximum bit rate of 320 kb/s). Other popular portable formats are Vorbis and Free Lossless Audio Codec (FLAC).
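As a rough illustration of what that shrinkage means in storage terms, the Python sketch below converts a bit rate into a file size. The four-minute track length is a hypothetical value chosen for the example, and the calculation ignores file headers and metadata.

```python
def file_size_mb(bit_rate_kbps, duration_s):
    """Approximate audio payload size in megabytes (1 MB = 1,000,000 bytes)."""
    bits = bit_rate_kbps * 1000 * duration_s
    return bits / 8 / 1_000_000

duration_s = 4 * 60  # hypothetical four-minute track

wav_mb = file_size_mb(1411.2, duration_s)  # uncompressed CD-quality WAVE
mp3_mb = file_size_mb(320, duration_s)     # MP3 at its maximum bit rate

print(f"WAVE: {wav_mb:.1f} MB, MP3: {mp3_mb:.1f} MB, ratio {wav_mb / mp3_mb:.2f}:1")
```

Under these assumptions, the uncompressed track comes to a little over 42 MB and the 320-kb/s MP3 to just under 10 MB, a reduction of roughly 4.4 to 1.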
This compression gets done using software encoders, which contain algorithms that decide what data can be taken out without having a major effect on the audio. These algorithms differ between encoders, which is why a 128-kb/s Vorbis file could sound better than a 128-kb/s MP3.
FLAC has become a popular choice for delivering high-quality audio at relatively small file sizes, thanks to its efficient lossless compression and the fact that it has no licensing costs.
Meanwhile, Apple favors its own set of audio formats. Advanced Audio Coding (AAC), which is similar to MP3, is an MPEG standard rather than an Apple invention, but Apple uses it for lossy compression, typically for music destined for one of the company’s portable products. There’s also the Apple Lossless Audio Codec (ALAC), which, like FLAC, enables lossless compression. This is aimed at Macs, where there’s typically more storage available than on a mobile device. The other Apple format to note is Audio Interchange File Format (AIFF), which is uncompressed, similar to WAVE.
The Dawn of High-Definition Audio in the Mainstream?
Having been exposed to numerous low-quality files, some audio aficionados have an intense dislike of digital or portable audio. Others are keen to see old master copies digitally resampled and released at a higher quality than what’s currently available.
Neil Young and his company Pono have been pioneers in this regard, pushing the uptake of high-definition music that uses lossless compression. Pono’s music player handles files at up to 192 kHz and 24 bits, with bit rates starting at 1411 kb/s (CD quality) and going as high as 9216 kb/s (Fig. 2).
Analog masters, currently held on tape, can be digitized at the required levels for high-resolution audio. And while a lot of music now gets recorded at high resolutions, in the early days of digital recording, the masters were taken at the same quality as CDs. This audio will, therefore, never be of the quality required by the Pono player. So if high-res audio takes off in the way Young wants it to, we will need to see more music being produced at high resolution.
There has been some traction in the music industry around high-resolution audio, including moves to create standards for it. However, the question remains: Will music fans be sold on it?
The Price of High-Resolution Audio
The tipping point will be when sufficient numbers of consumers decide the higher audio quality is worth the cost. That’s because as you increase the quality, you need more space to store the files. A typical Pono-quality file (192-kHz, 24-bit FLAC format) takes up the best part of 200 MB, whereas the same song at 44.1-kHz, 16-bit FLAC needs less than 7 MB. That’s quite a difference in the number of songs you can fit on your music player or smartphone.
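To put that difference into numbers, the short Python sketch below uses the per-song sizes quoted above. The 64-GB capacity is a hypothetical figure chosen for illustration, not a specification of any particular player, and the function name is likewise invented for the example.

```python
def songs_that_fit(capacity_gb, song_mb):
    """Whole songs that fit in the given capacity (1 GB = 1000 MB)."""
    capacity_mb = capacity_gb * 1000
    return int(capacity_mb // song_mb)

capacity_gb = 64  # hypothetical device capacity

hi_res_songs = songs_that_fit(capacity_gb, 200)  # about 200 MB per song, as quoted above
cd_songs = songs_that_fit(capacity_gb, 7)        # under 7 MB per song, as quoted above

print(f"High-resolution: {hi_res_songs} songs")
print(f"CD quality:      {cd_songs} songs")
```

With these inputs, the same device holds a few hundred high-resolution songs or several thousand CD-quality ones.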
Admittedly, concerns over storage will diminish as the cost per gigabyte and physical size of memory continue to drop. However, at least in the short term, the amount of space required to facilitate what many perceive as a small improvement in audio quality is likely to limit the uptake of high-resolution music.
Pono’s portable music player includes special hardware that is claimed to make any file sound better. However, this pushes up the cost: A Pono player will set you back $399. Then it’s another $300 for high-quality headphones, while a headphone amplifier, which enables you to hear the full quality of the Pono store’s high-resolution music, is $200 more. A hi-res album from the store is around $20.
2. The Pono high-resolution music player uses 192-kHz, 24-bit files with bit rates from 1411 kb/s up to 9216 kb/s.
History suggests that audiophiles are happy to spend more money on high-quality sound systems. However, whether that remains true for high-resolution audio will depend on whether they can hear an improvement.
What’s the Benefit of High-Resolution Audio?
The difference between standard-definition and high-definition video becomes more evident as you use larger displays. Audio is different. A higher bit depth cuts down quantization error, and the main beneficiaries of 24-bit audio are the sound engineers, who can take advantage of the greater headroom to reduce noise and create more “space” in the overall music. To the end listener, however, the increase in bit depth is hardly noticeable, while the advantages of higher sample rates are almost non-existent. Why is this?
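For the bit-depth half of that question, a rough calculation helps. The Python sketch below uses the standard textbook estimate of signal-to-quantization-noise ratio for an ideal uniform quantizer driven by a full-scale sine wave. That formula is not taken from this article, and real converters fall short of the ideal figures.

```python
import math

def ideal_sqnr_db(bit_depth):
    """Textbook signal-to-quantization-noise ratio for a full-scale sine wave.

    Assumes an ideal uniform quantizer; equivalent to roughly 6.02 * N + 1.76 dB.
    """
    return 20 * math.log10(2 ** bit_depth) + 10 * math.log10(1.5)

sqnr_16 = ideal_sqnr_db(16)
sqnr_24 = ideal_sqnr_db(24)

print(f"16-bit: {sqnr_16:.1f} dB")
print(f"24-bit: {sqnr_24:.1f} dB")
print(f"Gain from the extra 8 bits: {sqnr_24 - sqnr_16:.1f} dB")
```

Under these idealized assumptions, 16 bits already place the quantization noise roughly 98 dB below a full-scale signal. The further 48 dB or so offered by 24 bits is useful working margin in the studio, but it lies far below the level of the music in a finished recording.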
Nyquist’s Theorem states that to sample a signal accurately, the sampling frequency must be greater than two times the highest frequency in the source signal. A 44.1-kHz sampling rate is more than sufficient (>2X) to sample any audible frequencies, which are all below the 20-kHz limit of human hearing. Hence, 44.1 kHz became the de facto audio industry sampling-rate standard for CD-quality audio. Sample rates that exceed this only capture additional frequencies we can’t hear. This piece on Xiph.org looks at this in much more detail.
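The Nyquist limit can be checked with a few lines of Python. This is a simplified sketch: the function names are invented for the example, and the aliasing calculation assumes a pure tone sampled with no anti-aliasing filter, which real converters do include.

```python
def nyquist_limit(sample_rate_hz):
    """Highest frequency that a given sample rate can represent."""
    return sample_rate_hz / 2

def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency at which a pure tone appears after sampling without a filter."""
    nearest_multiple = round(tone_hz / sample_rate_hz) * sample_rate_hz
    return abs(tone_hz - nearest_multiple)

cd_rate = 44_100
print(nyquist_limit(cd_rate))            # upper limit for a 44.1-kHz sample rate
print(alias_frequency(20_000, cd_rate))  # a 20-kHz tone is below the limit and is preserved
print(alias_frequency(25_000, cd_rate))  # a 25-kHz tone is above the limit and folds back
```

A 20-kHz tone sits below the 22.05-kHz limit and comes through unchanged, whereas an unfiltered 25-kHz tone would fold back to 19.1 kHz. This is why content above the limit is filtered out before sampling at 44.1 kHz.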
Audio engineers will find value in higher sample rates during the mixing process, as they can help with speed adjustment and pitch correction, among other things. However, during audio mastering, the engineers will typically use filters to remove any inaudible sounds.
To the person listening to the music, therefore, the sample rate and bit depth associated with high-res audio might deliver only subtle enhancements to the listening experience. Moreover, a high-resolution album can cost $10 more than a CD-quality one. Multiply that out over an entire music collection, and it’s a significant additional investment.
A Pathway to Better Mainstream Audio Quality
This brings us back to the heart of the issue, which is the fact that consumers are happy to compress their music if it means they can listen to it on the go. They either don’t notice or don’t care that what they’re listening to doesn’t sound as good as the CD they ripped it from.
So rather than trying to promote high-resolution audio, would it not be more worthwhile for campaigners to push the use of lossless compression as a way of enabling people to listen to CD-quality music on their phones? Improved compression methods and higher network speeds will mean we can download FLAC-quality audio more quickly in the future. This will likely put an end to lossy compression and mean consumers get better-quality music on their portable devices, without paying for quality they can’t hear.
Better-quality audio files are only one part of the story, of course. The other is small, portable, high-quality hardware. Thanks to its custom data converters, the compact Pono player allegedly plays CD-quality files better than an actual CD player. This alone is a plus when it comes to audio quality. The ideal situation would be a player that improves audio in this way and lets consumers fit the large number of songs they typically carry around onto a portable device.
As far as music sound quality goes, we’re already at the peak. Spending money on high-resolution music files is a waste, either because your ears aren’t able to tell the difference, or because a lot of audio equipment isn’t able to deliver the higher-quality sound.
However, the great thing about digital audio is that there’s now a wider variety of quality levels available, meaning everyone can choose exactly how much quality they’re willing to pay for.