Embedded CPU Core Is Programmer-Friendly
The 81-MHz, 71-MIPS 16-/32-bit TinyRISC TR4101 embedded microprocessor
core consists of a register file, system control coprocessor (CP0),
arithmetic logic unit (ALU), shifter, CBus interface, and a
computational bolt-on (CBO) interface (Figure 2). The register file
contains the general-purpose registers, supplies source operands to the
execution units, and stores results in the target registers.
CP0 processes exceptions, which include interrupts; the ALU
performs the arithmetic and logical operations that support
the encoding functions, and also does address calculations; and the
shifter performs shift operations. The CBO interface gives the
systems engineer a way to add specialized arithmetic instructions
to the microprocessor. For example, an embedded CPU can attach
a multiply-divide unit (MDU) via the CBO interface to perform
such encoding-support functions as complex rate-control calculations.
The CBus interface passes data to and from the core. Thus, systems
engineers can attach up to three tightly coupled special-purpose
coprocessors that enhance the embedded microprocessor's general-purpose
computational power. This approach makes high-performance,
application-specific hardware directly accessible to a
programmer at the instruction-set level.
The embedded CPU's code is written in C/C++, then compiled into
instructions and stored in memory. Besides handling syntax generation
for all MPEG layers, the code also handles frame control and type,
rate control, audio and system-stream multiplexing, and parts
of the mode decision process.
Two Paths For MPEG Syntax Generation
The MPEG syntax layers correspond to a hierarchical structure.
A sequence constitutes the top layer of the video-coding hierarchy,
consisting of a header and a number of groups of pictures (GOPs).
A GOP is a random-access point, meaning it's the smallest coding
unit that can be independently decoded within a sequence. It contains
a header and a number of pictures. The GOP header features time
and editing information.
The TinyRISC TR4101 embedded microprocessor core (see the figure in
"Embedded CPU Core Is Programmer-Friendly," above) supports syntax
generation for all six MPEG layers, each of which
supports either a signal-processing or a system function. The
layers are: system, sequence, GOP, picture, slice, and macroblock
(Figure 3).
The three types of pictures are intracoded (I), predictive-coded
(P), and bidirectionally predictive-coded (B). "I" pictures are
coded without reference to any other pictures; "P" pictures are
coded using motion-compensated prediction from the previous I
or P reference pictures; and "B" pictures are coded using motion
compensation from a previous and a future I or P picture. Pictures
consist of a header and one or more slices. The picture header
includes time, picture type, and coding information.
A slice provides resilience against data errors. If the encoded bit stream
is unreadable within a picture, the decoder can recover by waiting
for the next slice, without having to drop an entire picture.
Slices consist of a header and one or more macroblocks. The slice
header contains position and quantizer scale information.
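The recovery step described above can be sketched as a scan for the next slice start code. This is a minimal illustration, not the decoder's actual logic: it assumes the byte-aligned MPEG video start-code prefix 00 00 01 followed by a code byte in the range 0x01 to 0xAF for slices, and the function name find_next_slice is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the byte offset of the next slice start code
 * (00 00 01 xx, with xx in 0x01..0xAF) at or after `pos`, or -1 if the
 * buffer holds no further slice. */
long find_next_slice(const uint8_t *buf, size_t len, size_t pos)
{
    assert(buf != NULL);

    for (size_t i = pos; i + 3 < len; i++) {
        if (buf[i] == 0x00 && buf[i + 1] == 0x00 && buf[i + 2] == 0x01 &&
            buf[i + 3] >= 0x01 && buf[i + 3] <= 0xAF) {
            return (long)i;
        }
    }
    return -1;
}
```

A decoder that hits unreadable data at offset p could call find_next_slice(buf, len, p) and resume parsing at the returned offset, so only the macroblocks of the damaged slice are lost.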
A macroblock is the basic unit for motion compensation and quantizer
scale changes. In MPEG-2, a macroblock can be either field or frame
coded. Each macroblock consists of a header and six component
8-by-8 blocks: four blocks of luminance, one block of Cb chrominance,
and one block of Cr chrominance. The macroblock header contains
quantizer scale and motion compensation information. A macroblock
has a 16-pixel by 16-line section of luminance component and the
spatially corresponding 8-pixel by 8-line section of each chrominance
component.
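The macroblock layout just described might be represented as follows. This is a sketch only: the type and function names are hypothetical, the header fields are omitted, and the block numbering (luminance top-left, top-right, bottom-left, bottom-right, then Cb, then Cr) is assumed for a frame-coded macroblock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { MB_SIZE = 16, BLK_SIZE = 8, BLOCKS_PER_MB = 6 };

/* Hypothetical macroblock sample storage (header omitted):
 * 16x16 luminance plus one 8x8 block for each chrominance component. */
typedef struct {
    uint8_t y[MB_SIZE][MB_SIZE];
    uint8_t cb[BLK_SIZE][BLK_SIZE];
    uint8_t cr[BLK_SIZE][BLK_SIZE];
} Macroblock;

/* Copy block `index` (0..3 = luminance quadrants, 4 = Cb, 5 = Cr)
 * into `out` in raster order, ready for an 8x8 transform. */
void mb_get_block(const Macroblock *mb, int index,
                  uint8_t out[BLK_SIZE * BLK_SIZE])
{
    assert(mb != NULL && index >= 0 && index < BLOCKS_PER_MB);

    for (int r = 0; r < BLK_SIZE; r++) {
        for (int c = 0; c < BLK_SIZE; c++) {
            uint8_t v;
            if (index < 4) {
                int row0 = (index / 2) * BLK_SIZE;  /* 0 or 8 */
                int col0 = (index % 2) * BLK_SIZE;  /* 0 or 8 */
                v = mb->y[row0 + r][col0 + c];
            } else if (index == 4) {
                v = mb->cb[r][c];
            } else {
                v = mb->cr[r][c];
            }
            out[r * BLK_SIZE + c] = v;
        }
    }
}
```

The sample storage comes to 256 + 64 + 64 = 384 bytes per macroblock, which matches the six 64-sample blocks counted in the text.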
Blocks are the basic coding unit, and the DCT is applied at this
block level. Each block contains 64 component pixels arranged
in an 8-by-8 array. After the DCT, the resulting 8-by-8 block
of coefficients is quantized, scanned in zigzag order, grouped into
run-level pairs (the number of zero coefficients preceding each nonzero
coefficient is the "run"; the nonzero coefficient itself is
the "level"), and finally Huffman encoded. Because these operations
are very math intensive, only the Huffman coding is performed
by the CPU.
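The zigzag scan and run-level grouping can be illustrated with a short sketch. It assumes the coefficients are already quantized and uses the conventional 8-by-8 zigzag order; the names RunLevel and run_level_encode are hypothetical. As a simplification it treats all 64 coefficients alike and leaves out the Huffman stage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Conventional 8x8 zigzag order: ZIGZAG[i] is the raster index of the
 * i-th coefficient visited by the scan. */
static const uint8_t ZIGZAG[64] = {
     0,  1,  8, 16,  9,  2,  3, 10,
    17, 24, 32, 25, 18, 11,  4,  5,
    12, 19, 26, 33, 40, 48, 41, 34,
    27, 20, 13,  6,  7, 14, 21, 28,
    35, 42, 49, 56, 57, 50, 43, 36,
    29, 22, 15, 23, 30, 37, 44, 51,
    58, 59, 52, 45, 38, 31, 39, 46,
    53, 60, 61, 54, 47, 55, 62, 63
};

typedef struct {
    int run;    /* zero coefficients preceding this one */
    int level;  /* the nonzero coefficient itself */
} RunLevel;

/* Scan a quantized block (raster order) in zigzag order and emit
 * run-level pairs. Returns the number of pairs; trailing zeros produce
 * no pair (a real bit stream would signal end-of-block instead). */
int run_level_encode(const int16_t block[64], RunLevel out[64])
{
    assert(block != NULL && out != NULL);

    int n = 0;
    int run = 0;
    for (int i = 0; i < 64; i++) {
        int16_t coeff = block[ZIGZAG[i]];
        if (coeff == 0) {
            run++;
        } else {
            out[n].run = run;
            out[n].level = coeff;
            n++;
            run = 0;
        }
    }
    return n;
}
```

For a block whose only nonzero coefficients are 12 at raster index 0, -3 at index 8, and 5 at index 2, the scan visits indices 0, 1, 8, 16, 9, 2 and yields the pairs (0, 12), (1, -3), and (2, 5). Each pair would then be mapped to a variable-length code.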
The embedded microprocessor also handles the rate-control function
of the encoder. Rate-control algorithms are feedback mechanisms
that regulate the number of bits generated during the transform
coding process over a given elapsed time. They typically fall
into two groups: fixed rate and variable rate.
For a fixed data rate, the output bit rate must be constant
to ensure the encoder operates properly with a fixed-rate communications
channel, such as satellite. Equally important, the rate control must also
ensure that the decoder receiving the fixed-rate bit stream operates properly.
Over a period of time determined by the size of the encoder's
output or channel buffer, the average number of bits per macroblock
must be held below a fixed threshold to prevent the decoder's
video output from underflowing or overflowing. As a result, the
quality of the video varies inversely with the image complexity.
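One common way to build such a feedback loop is to track the fullness of a virtual output buffer and raise the quantizer scale as it fills. The sketch below illustrates that idea only; it is not the algorithm used in the encoder described here. The structure, field names, and linear mapping are assumptions, and the 1-to-31 range corresponds to a 5-bit quantizer scale code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rate-control state, all quantities in bits. */
typedef struct {
    long fullness;      /* bits currently held in the virtual buffer */
    long capacity;      /* buffer size */
    long drain_per_mb;  /* bits the channel removes per macroblock period */
} RateCtl;

/* Account for one coded macroblock and return the quantizer scale
 * (1 = finest, 31 = coarsest) to use for the next one. */
int ratectl_update(RateCtl *rc, long bits_used)
{
    assert(rc != NULL && rc->capacity > 0);

    rc->fullness += bits_used - rc->drain_per_mb;
    if (rc->fullness < 0)
        rc->fullness = 0;             /* buffer cannot hold fewer than 0 bits */
    if (rc->fullness > rc->capacity)
        rc->fullness = rc->capacity;  /* clamp; a real encoder must avoid this */

    /* Linear map: empty buffer -> 1, full buffer -> 31. */
    return (int)(1 + (30 * rc->fullness) / rc->capacity);
}
```

With a 1000-bit buffer draining 100 bits per macroblock, a macroblock that costs 600 bits moves the buffer from empty to half full and the scale from 1 to 16. Complex images therefore get coarser quantization, which is the inverse relationship between quality and complexity noted above.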
In variable data-rate mode, the instantaneous bit rate
is allowed to vary continuously in proportion to the
complexity of the image. This is also known as "constant quality"
bit-rate encoding. Variable rate control can be useful when
multiple channels are multiplexed onto a single transport
stream, or in a closed-loop system such as DVD. By knowing the
type of the source material in advance (using so-called forward-analysis
techniques), the encoder can optimize the image compression, based
on statistics and image complexity, and set priorities.
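For the multiplexed case, one simple policy is to divide the transport stream's bit budget among the channels in proportion to a per-channel complexity estimate. The sketch below shows that policy under stated assumptions: the function name, the complexity measure, and the even split when every estimate is zero are all hypothetical choices, not details from the encoder described here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: split `total_bits` among `n` channels in proportion
 * to each channel's complexity estimate. Integer division may leave a few
 * bits unassigned. If every estimate is zero, the budget is split evenly. */
void allocate_bits(const uint32_t complexity[], uint32_t share[],
                   size_t n, uint32_t total_bits)
{
    assert(complexity != NULL && share != NULL && n > 0);

    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += complexity[i];

    for (size_t i = 0; i < n; i++) {
        if (sum == 0)
            share[i] = (uint32_t)(total_bits / n);
        else
            share[i] = (uint32_t)(((uint64_t)total_bits * complexity[i]) / sum);
    }
}
```

With complexity estimates of 1, 3, and 4 and a budget of 8000 bits, the three channels receive 1000, 3000, and 4000 bits respectively.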