Requirements vary when choosing embedded CPU cores for system-on-a-chip
designs. For DVD encoding, system engineers should emphasize an
embedded CPU's level of flexibility and programmability. It's
important to select an embedded CPU that gives engineers top-level
control of the entire coding process in a high-level language
(HLL) such as C or C++.
Being equipped with a basic understanding of how the encoding
engines operate makes it relatively easy to modify an encoding
algorithm through basic reprogramming. Programmers then can get
immediate feedback on the video quality. This kind of CPU programmability
via C/C++ is critical for supporting an assortment of different
algorithms, including adaptive ones, to provide designers with
an ample range of product differentiation.
A completely hardware-based architecture, on the other hand, locks
systems designers into a single algorithm. Implementing a newer
algorithm or performing alterations to an existing one requires
a complete hardware change, thus incurring major design-engineering
costs. Even architectures that are programmable may impose similar
constraints if the microcoding isn't user-friendly.
It also is best not to get bogged down by low-level, mundane,
fixed tasks like discrete-cosine-transform (DCT) and variance
calculations. Designers should choose an embedded CPU that lets
them operate at a higher level, focusing on issues that affect
video quality and/or bit rate, such as quantization selection
or mode decision.
Quantization and Quality
Quantization (Quant) selection, in particular, plays a crucial
role in DVD video quality. It's important to have the necessary
CPU flexibility and programmability to target the proper levels
of Quant selection, not only for video quality, but also for system
differentiation. Today, there may be no single best viable algorithmic
equation that can produce an "optimum" quantization value. But
through the programmability features of an embedded CPU, system
engineers have the ability to easily incorporate tomorrow's algorithms,
allowing the product to improve as compression technology evolves.
To understand why Quant selection is so crucial to video quality,
you must understand that real-world video compression is a lossy
process. A certain amount of image detail must be sacrificed to
achieve DVD bit rates. You simply can't have high compression
without significant loss of data. But once this data is lost,
it's gone forever. Ideally, this loss is introduced where it's
least noticeable. That's what quantization is all about--it allows
you to decide how much loss you can introduce into part of the
image or picture.
In MPEG, quantization takes place in the frequency domain. An
8-by-8 block of pixels is first converted to an 8-by-8 block of
frequency coefficients by the Discrete Cosine Transform (DCT).
The DCT itself is a completely reversible process, meaning that
the resulting coefficients can be converted back to the original
8-by-8 block of pixels using an Inverse DCT (or IDCT), without
any loss apart from arithmetic rounding.
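The round trip can be sketched in C. This is a minimal illustration, not an encoder's actual transform: it uses a one-dimensional, 8-point orthonormal DCT on a single row of samples (the 8-by-8 case applies the same transform to rows and then columns), and the names dct8, idct8, cos16, and COS_TAB are assumptions made for this example. A small cosine table stands in for the math library.

```c
#include <assert.h>

/* cos(k*pi/16) for k = 0..8; other angles follow by symmetry. */
static const double COS_TAB[9] = {
    1.0, 0.9807852804, 0.9238795325, 0.8314696123, 0.7071067812,
    0.5555702330, 0.3826834324, 0.1950903220, 0.0
};

/* Returns cos(k*pi/16) for any non-negative k. */
static double cos16(int k)
{
    assert(k >= 0);
    k %= 32;
    if (k > 16)
        k = 32 - k;                                   /* cos(2*pi - t) = cos(t) */
    return (k > 8) ? -COS_TAB[16 - k] : COS_TAB[k];   /* cos(pi - t) = -cos(t) */
}

/* Scale factor that makes the 8-point transform orthonormal:
   1/(2*sqrt(2)) for the DC term, 1/2 for the others. */
static double scale(int u)
{
    return (u == 0) ? 0.5 * COS_TAB[4] : 0.5;
}

/* Forward DCT: 8 samples in, 8 frequency coefficients out. */
static void dct8(const double x[8], double X[8])
{
    for (int u = 0; u < 8; u++) {
        double sum = 0.0;
        for (int n = 0; n < 8; n++)
            sum += x[n] * cos16((2 * n + 1) * u);
        X[u] = scale(u) * sum;
    }
}

/* Inverse DCT: 8 coefficients in, 8 samples out. */
static void idct8(const double X[8], double x[8])
{
    for (int n = 0; n < 8; n++) {
        double sum = 0.0;
        for (int u = 0; u < 8; u++)
            sum += scale(u) * X[u] * cos16((2 * n + 1) * u);
        x[n] = sum;
    }
}
```

Passing a row of pixel values through dct8 and then idct8 returns the original values to within floating-point rounding, which is why the loss in MPEG has to come from the quantization step and not from the transform.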
The quantization process essentially involves dividing these coefficients
by a Quant value during encoding, and then multiplying by the
same Quant value during decoding. This does two things to the
original frequency coefficients: First, it reduces the accuracy
of the larger values. In general, this type of "loss" is very
minor, and typically unnoticeable, after conversion back to the
spatial domain. The second effect is the complete elimination
of the smaller coefficients. This is much more serious, because
it can introduce very noticeable losses in the spatial domain,
especially if it involves a lower frequency.
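Both effects show up in a few lines of C. The sketch below is simplified on purpose: it uses a single Quant value with round-to-nearest division and leaves out the per-frequency weighting that real MPEG quantizers also apply. The function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Encoder side: divide by the Quant value, rounding to nearest. */
static int quantize(int coeff, int quant)
{
    assert(quant > 0);
    int mag = (abs(coeff) + quant / 2) / quant;
    return coeff < 0 ? -mag : mag;
}

/* Decoder side: multiply by the same Quant value. */
static int dequantize(int level, int quant)
{
    return level * quant;
}

/* Quantize and reconstruct n coefficients.
   Returns how many coefficients survive as nonzero. */
static int quantize_block(const int *in, int *out, int n, int quant)
{
    int nonzero = 0;
    for (int i = 0; i < n; i++) {
        int level = quantize(in[i], quant);
        out[i] = dequantize(level, quant);
        if (level != 0)
            nonzero++;
    }
    return nonzero;
}
```

With a Quant value of 16, a coefficient of 200 is reconstructed as 208, a small error in a large value. A coefficient of 5 is reconstructed as 0, which means it has been eliminated entirely.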
Although the most noticeable distortion results from eliminating
coefficients, this also contributes most to higher compression
ratios. And there's the rub. To achieve high compression ratios,
you have to start throwing away coefficients--a lot of them. At
a bit rate of 4 Mbits/s, an average I-block uses only about six
coefficients. Some blocks need more and some need fewer. But which
ones? The trick to successful quantization is finding a way to
eliminate all coefficients that are the least noticeable to the
human eye.
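A rough calculation shows how tight the bit budget is. The sketch below assumes 720-by-480 video at 30 frames/s with 4:2:0 sampling (four luma and two chroma blocks per 16-by-16 macroblock) and ignores header and audio overhead; none of these figures come from the discussion above, and the function name is hypothetical.

```c
#include <assert.h>

/* Average bits available per 8x8 block at a given bit rate.
   Assumes 4:2:0 sampling: each 16x16 macroblock carries 4 luma and
   2 chroma blocks. Header and audio overhead are ignored. */
static double avg_bits_per_block(double bits_per_sec, int width, int height,
                                 double frames_per_sec)
{
    assert(width > 0 && height > 0 && frames_per_sec > 0.0);
    int macroblocks = ((width + 15) / 16) * ((height + 15) / 16);
    double blocks_per_sec = macroblocks * 6.0 * frames_per_sec;
    return bits_per_sec / blocks_per_sec;
}
```

Under these assumptions, avg_bits_per_block(4e6, 720, 480, 30.0) comes to roughly 16 bits per block, averaged over all picture types. Intra-coded blocks receive a larger share than that average, but the figure gives a sense of why only a handful of coefficients per block can be kept.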
Ultimately, the entire quantization algorithm boils down to a
sophisticated modeling of the human visual system. Some kinds
of data loss are more tolerable than others. While this process
may sound simple, it's highly complex. Still, the basic principle
is to throw away enough data to meet the compression required,
but do it in a way that's least noticeable. It's a tricky proposition;
there are many different ways to perform Quant selection, and
engineers are always finding better algorithms. That's the reason
this particular encoding function should be performed in software
and not in hardware.
Variable Bit Rate
Embedded CPU flexibility and programmability also play a part
in attaining video quality goals in a DVD system design. Consider
that most DVD systems are based on variable bit rate (VBR). The
instantaneous bit rate of a DVD system varies continuously with
the complexity of the image being encoded. This allows the encoded
video quality to remain relatively constant. The average bit rate
for DVD is 4.7 Mbits/s. But actual data rates can slip below 2
Mbits/s and accelerate to a maximum of over 10 Mbits/s. Because
a DVD player/recorder is a "closed-loop" system, VBR is the method
used to generate high-quality images from the DVD disc.
In contrast, a broadcast application is an open-loop system with
severe constraints on channel bandwidth and buffers. In this case,
the bit rate control of choice is Constant Bit Rate (CBR). Think
of this as the antithesis of VBR.
In the real world, VBR and CBR actually represent the endpoints
of an entire range of algorithmic possibilities. The specific
algorithm selected depends on various system parameters, and,
as in Quant selection, these algorithms are constantly evolving.
With that in mind, it's simply not smart engineering to hardwire
the encoding function with the latest algorithm. A single hardware
platform that can easily accommodate future algorithmic refinements
is not only more economical, but will also accelerate the evolution
of video-encoding technology.
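To make the range between CBR and VBR concrete, here is a minimal rate-control sketch in C. It is one simple buffer-feedback scheme chosen for illustration, not the algorithm of any particular encoder; the structure, field names, and the linear mapping are all assumptions. The only fixed point is the output range of 1 to 31, which matches the range of the MPEG quantizer scale code.

```c
#include <assert.h>

/* Hypothetical rate-control state; names are illustrative only. */
struct rate_ctl {
    long target_bits;   /* bit budget per picture */
    long fullness;      /* virtual buffer fullness, in bits */
    long buffer_size;   /* virtual buffer capacity, in bits */
};

/* Map buffer fullness linearly onto a Quant value in the range 1..31. */
static int rate_ctl_next_quant(const struct rate_ctl *rc)
{
    assert(rc->buffer_size > 0);
    long q = 1 + (30 * rc->fullness) / rc->buffer_size;
    if (q < 1)
        q = 1;
    if (q > 31)
        q = 31;
    return (int)q;
}

/* After coding a picture, account for bits produced versus the budget. */
static void rate_ctl_update(struct rate_ctl *rc, long bits_used)
{
    rc->fullness += bits_used - rc->target_bits;
    if (rc->fullness < 0)
        rc->fullness = 0;
    if (rc->fullness > rc->buffer_size)
        rc->fullness = rc->buffer_size;
}
```

In this sketch, a small buffer_size makes the Quant value react sharply to every overshoot, which pushes the output toward constant bit rate. A large buffer_size lets the Quant value stay nearly constant while the bit rate follows picture complexity, which is closer to VBR behavior. Because the policy lives in two short functions, swapping in a different one is a software change only.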
Each To Their Own Duties
In the encoder circuitry shown in Figure 1, the embedded CPU (see
"Embedded CPU Core Is Programmer Friendly," below) controls the
bit-stream generation and monitors the results of
the encoder's various subengines. These include such blocks as
motion estimation, mode decision calculations, quantization/inverse
quantization, and the variable length encoder (VLE). The CPU reads
the motion vectors from a register after motion estimation
is performed. Mode decision works the same way: the mode decision
module makes the necessary calculations and stores the results
in several registers. The CPU reads those results, and then via
software arrives at a mode decision and a quantization
value.
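The split between hardware calculation and software decision might look like the following C sketch. The register layout and names are hypothetical, and the variance comparison is a common textbook heuristic used here only as a stand-in for whatever decision rule the designer prefers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of subengine result registers. On real hardware
   these would be memory-mapped reads; the layout here is assumed. */
struct subengine_regs {
    int16_t  mv_x;           /* motion vector, horizontal component */
    int16_t  mv_y;           /* motion vector, vertical component */
    uint32_t var_original;   /* variance of the source macroblock */
    uint32_t var_residual;   /* variance of the motion-compensated residual */
};

enum mb_mode { MODE_INTRA, MODE_INTER };

/* Software mode decision: code the macroblock as inter only when the
   prediction residual has less variance than the source macroblock. */
static enum mb_mode decide_mode(const struct subengine_regs *r)
{
    assert(r != 0);
    return (r->var_residual < r->var_original) ? MODE_INTER : MODE_INTRA;
}
```

The hardware does the arithmetic on every pixel; the CPU only compares two numbers per macroblock. Changing the rule, for example by adding a bias toward one mode, touches only decide_mode.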
In this encoding application, a custom coprocessor performs variable
length encoding (VLE). This encoding operation is a reversible
and lossless procedure for coding. It assigns shorter code words
to frequent events and longer code words to less-frequent events,
thereby achieving further compression. Huffman coding is the most
often utilized form of VLE, due to its simplicity and efficiency
in reducing the number of bits necessary to encode without losing
information.
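The principle is easy to see with a toy prefix code in C. The four code words below are invented for illustration and are not taken from the MPEG tables; the function name is also hypothetical. Bits are written as '0' and '1' characters to keep the output readable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy prefix code for four symbols, most frequent first. These code
   words are illustrative and are not taken from the MPEG tables. */
static const char *const CODE[4] = { "0", "10", "110", "111" };

/* Append the code word for each symbol to out as '0'/'1' characters.
   Returns the number of bits written, or 0 if out is too small. */
static size_t vle_encode(const unsigned char *syms, size_t n,
                         char *out, size_t out_size)
{
    size_t bits = 0;
    assert(out_size > 0);
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        assert(syms[i] < 4);
        size_t len = strlen(CODE[syms[i]]);
        if (bits + len + 1 > out_size)
            return 0;
        memcpy(out + bits, CODE[syms[i]], len + 1);   /* copies the terminator too */
        bits += len;
    }
    return bits;
}
```

Encoding the eight symbols 0, 0, 1, 0, 2, 0, 0, 3 produces the 13-bit string 0010011000111, compared with 16 bits for a fixed two-bit code. Because no code word is a prefix of another, a decoder can split the string back into symbols without any separators.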
Several key attributes contributing to design flexibility differentiate
one encoder from another. They include quantization value selection,
rate control, how many frames are being stored, the number of
bidirectional or interpolated pictures (B frames) between anchors,
and whether original and/or reconstructed data is being
used for encoding.
The embedded CPU in this encoding application also supports syntax
generation for all six MPEG layers. Each layer supports either
a signal-processing or a system function (see "Two Paths For MPEG
Syntax Generation," below).
To sum things up, the ideal encoder combines programmable and
hardwired functions to achieve the best possible cost/performance
trade-off. It performs math-intensive, well-defined functions
in hardware using hardwired "subengines," while keeping programmable
those functions and decisions that allow engineers to differentiate
the end product.