Most common video signals require preprocessing before encoding by video-compression
codecs, which require data to be in 4:2:0 planar format to achieve higher processing
performance. For example, broadcast standards such as NTSC and PAL may need
to be converted from an interlaced format to progressive scan, and they frequently
require the chrominance and luminance information to be reformatted as well.
In particular, video from CCD cameras is captured in an interlaced 4:2:2 interleaved format. Certain profiles of video-compression standards, however, accept input only in a progressively
scanned 4:2:0 format. In this case, interlacing artifacts must be
removed since interlaced video content can be quite challenging
for progressive encoders.
Engineers have a number of sophisticated de-interlacing algorithms to choose from, but not all applications require the highest
level of video quality. Moreover, sophisticated algorithms tend to
be compute-intensive, and developers always have a digital-signal-processor (DSP) MIPS budget to manage.
When the application doesn't require the highest video quality, resizing algorithms in hardware can be used for de-interlacing. This technique is particularly useful to save precious DSP
MIPS by offloading the 4:2:2 to 4:2:0 conversion and de-interlacing operations to other hardware. Surprisingly, resizing hardware sometimes achieves de-interlacing quality on par with
high-complexity de-interlacing algorithms after video compression is taken into account.
The simple method described in this article can be used to de-interlace video
in such applications. This technique works best when there's a significant amount of
motion in the video data frames, since still images tend to highlight its deficiencies.
LUMINANCE AND CHROMINANCE CODING
NTSC defines standard-definition (NTSC SD) resolution as 720 pixels per row,
480 pixels per column, and 30 frames per second. The information for each pixel
contains three components:
• Y is the luminance (luma) information
• Cb (U) is the blue color information
• Cr (V) is the red color information
When the NTSC standard was adopted, engineers faced both transmission bandwidth
and computing power constraints for encoding video streams. Since the human
eye is far more sensitive to luminance information than to chrominance information,
the NTSC standard lightened the load by calling for the chrominance information
to be horizontally down-sampled by half.
Each captured frame from a CCD camera has 720 by 480 Y values, 360 by 480 U
values, and 360 by 480 V values. Each value is eight bits (one byte) in the range
[0, 255], which makes each NTSC SD frame (720 + 360 + 360) × 480 = 691,200 bytes.
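The frame-size arithmetic above can be checked with a few lines of Python. This is only a sketch; the function name frame_bytes_422 is made up for illustration and assumes one byte per sample.

```python
# Frame-size arithmetic for 8-bit NTSC SD video in 4:2:2 format.
WIDTH, HEIGHT = 720, 480  # luma samples per row, rows per frame


def frame_bytes_422(width: int, height: int) -> int:
    """Bytes per frame when chroma is halved horizontally only."""
    luma = width * height
    chroma = (width // 2) * height  # size of one chroma plane (U or V)
    return luma + 2 * chroma


print(frame_bytes_422(WIDTH, HEIGHT))  # 691200
```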
The Y/U/V components in the captured frame are typically interleaved, usually
in YUV 4:2:2 format. There are two ways to organize the data, but in the interest
of simplicity, assume the data is organized in UYVY interleaved 4:2:2 format
(Fig. 1).
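As a rough illustration of the UYVY layout, the sketch below separates one interleaved row into its Y, U, and V samples. It assumes the usual UYVY byte order, in which every four bytes hold U0, Y0, V0, Y1 for a pair of horizontally adjacent pixels; the helper name split_uyvy_row is hypothetical.

```python
def split_uyvy_row(row: bytes) -> tuple:
    """Split one UYVY 4:2:2 row into (Y, U, V) byte strings.

    Assumes each 4-byte group is U0 Y0 V0 Y1, covering two pixels.
    """
    if len(row) % 4:
        raise ValueError("UYVY row length must be a multiple of 4 bytes")
    y = row[1::2]  # luma sits at every odd byte offset
    u = row[0::4]  # Cb, shared by each pixel pair
    v = row[2::4]  # Cr, shared by each pixel pair
    return y, u, v


# Two pixel pairs: (U=10, Y=1, V=20, Y=2) and (U=11, Y=3, V=21, Y=4)
row = bytes([10, 1, 20, 2, 11, 3, 21, 4])
y, u, v = split_uyvy_row(row)
print(list(y), list(u), list(v))  # [1, 2, 3, 4] [10, 11] [20, 21]
```

A full NTSC SD row in this layout occupies 1,440 bytes: 720 luma bytes plus 360 bytes each for U and V.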
As previously mentioned, most encoders require input video to
be in YUV 4:2:0 format. There are two distinctions between 4:2:2
interleaved data and 4:2:0 planar data.
First, in 4:2:0 format, chroma information is also down-sampled vertically by half.
That is, for each NTSC SD frame, the U and V components each contain 360
× 240 bytes instead of 360 × 480 bytes. Each NTSC SD frame in 4:2:0 format is
518,400 bytes [(720 × 480) + (360 × 240 × 2)]. The additional chroma down-sampling
is required to balance real-time performance with acceptable picture quality.
Second, efficient implementations of video-compression standards often require
that luma and chroma components be separated in memory, because the encoding
algorithms may process them in different ways. Figure
2 shows the NTSC SD video frame in 4:2:0 planar format.
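Both differences, the extra vertical chroma down-sampling and the planar layout, can be sketched in software as follows. This is a minimal illustration, not the hardware method discussed in this article. It assumes UYVY input, averages each pair of vertically adjacent chroma samples (a deliberately simple filter chosen for the example), and writes the planes in Y, U, V order, which is an assumption about the layout. The function name uyvy_to_planar_420 is hypothetical.

```python
def uyvy_to_planar_420(frame: bytes, width: int, height: int) -> bytes:
    """Convert a UYVY 4:2:2 frame to planar 4:2:0 (Y plane, then U, then V).

    Chroma rows are averaged in pairs; this is an illustrative filter only.
    """
    if width % 2 or height % 2:
        raise ValueError("width and height must be even")
    stride = width * 2  # bytes per UYVY row
    if len(frame) != stride * height:
        raise ValueError("frame size does not match width and height")

    y_plane = bytearray()
    u_plane = bytearray()
    v_plane = bytearray()
    for r in range(0, height, 2):
        top = frame[r * stride:(r + 1) * stride]
        bottom = frame[(r + 1) * stride:(r + 2) * stride]
        # Luma is kept at full resolution for both rows.
        y_plane += top[1::2]
        y_plane += bottom[1::2]
        # Average vertically adjacent chroma samples, rounding to nearest.
        u_plane += bytes((a + b + 1) // 2
                         for a, b in zip(top[0::4], bottom[0::4]))
        v_plane += bytes((a + b + 1) // 2
                         for a, b in zip(top[2::4], bottom[2::4]))
    return bytes(y_plane + u_plane + v_plane)


# A 691,200-byte NTSC SD frame shrinks to 518,400 bytes.
sd_frame = bytes(720 * 2 * 480)
print(len(uyvy_to_planar_420(sd_frame, 720, 480)))  # 518400
```

Note that averaging adjacent rows of an interlaced frame mixes chroma from two different fields, so a real pipeline would normally de-interlace first or filter each field separately.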
INTERLACED ARTIFACTS
Interlaced scanning captures a picture in two passes, with one pass capturing
every even line and the other capturing every odd line. The two captures are
separated by a small difference in time and then merged together to form a complete
frame.
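The relationship between fields and frames can be made concrete with a short sketch. It assumes a frame is held as a list of rows with lines numbered from zero, so the even field holds rows 0, 2, 4, and so on; the names split_fields and weave_fields are hypothetical.

```python
def split_fields(rows: list) -> tuple:
    """Return (even-line field, odd-line field) from a list of frame rows."""
    return rows[0::2], rows[1::2]


def weave_fields(even_field: list, odd_field: list) -> list:
    """Interleave two fields line by line to rebuild a full frame."""
    if len(even_field) != len(odd_field):
        raise ValueError("fields must have the same number of lines")
    frame = []
    for even_row, odd_row in zip(even_field, odd_field):
        frame.append(even_row)
        frame.append(odd_row)
    return frame


rows = ["line0", "line1", "line2", "line3"]
even, odd = split_fields(rows)
print(even, odd)                       # ['line0', 'line2'] ['line1', 'line3']
print(weave_fields(even, odd) == rows)  # True
```

Because the two fields are captured at slightly different times, weaving them back together reproduces the original frame exactly only when nothing in the scene has moved between the two captures.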