Most common video signals require preprocessing before they can be encoded by video-compression codecs, which require data to be in 4:2:0 planar format to achieve higher processing performance. For example, broadcast standards such as NTSC and PAL may need to be converted from an interlaced format to progressive scan, and they frequently require the chrominance and luminance information to be reformatted as well.
In particular, video from CCD cameras is captured in an interlaced 4:2:2 interleaved format. Certain profiles of video-compression standards, however, accept input only in a progressively scanned 4:2:0 format. In this case, interlacing artifacts must be removed since interlaced video content can be quite challenging for progressive encoders.
Engineers have a number of sophisticated de-interlacing algorithms to choose from, but not all applications require the highest level of video quality. Moreover, sophisticated algorithms tend to be compute-intensive, and developers always have a digital-signal-processor (DSP) MIPS budget to manage.
When the application doesn't require the highest video quality, resizing algorithms in hardware can be used for de-interlacing. This technique is particularly useful to save precious DSP MIPS by offloading the 4:2:2 to 4:2:0 conversion and de-interlacing operations to other hardware. Surprisingly, resizing hardware sometimes achieves de-interlacing quality on par with high-complexity de-interlacing algorithms after video compression is taken into account.
The simple method described in this article can be used to de-interlace video in such applications. The technique works best when there's a significant amount of motion in the video frames, since still images tend to highlight its deficiencies.
LUMINANCE AND CHROMINANCE CODING
NTSC defines standard-definition (NTSC SD) resolution as 720 pixels per row, 480 pixels per column, and nominally 30 frames per second (29.97 in practice). The information for each pixel contains three components:
• Y is the luminance (luma) information
• Cb (U) is the blue-difference chrominance (chroma) information
• Cr (V) is the red-difference chrominance (chroma) information
When the NTSC standard was adopted, engineers faced both transmission-bandwidth and computing-power constraints for encoding video streams. Since the human eye is far more sensitive to luminance information than to chrominance information, the NTSC standard lightened the load by calling for the chrominance information to be horizontally down-sampled by half.
Each captured frame from a CCD camera has 720 by 480 Y values, 360 by 480 U values, and 360 by 480 V values. Each value is eight bits (one byte) in the range [0, 255], which makes each NTSC SD frame (720 + 360 + 360) X 480 = 691,200 bytes.
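The frame-size arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name frame_bytes and its parameters are made up for illustration, and it assumes one byte per sample with both chroma components sub-sampled by the same factors.

```python
def frame_bytes(width: int, height: int,
                chroma_h_div: int = 2, chroma_v_div: int = 1) -> int:
    """Return the size in bytes of one 8-bit Y/U/V frame.

    chroma_h_div and chroma_v_div are the horizontal and vertical
    chroma down-sampling factors (hypothetical parameter names).
    The defaults describe horizontally halved chroma.
    """
    luma = width * height
    chroma = (width // chroma_h_div) * (height // chroma_v_div)
    return luma + 2 * chroma  # one Y plane plus U and V


# NTSC SD with chroma halved horizontally only
print(frame_bytes(720, 480))  # 691200
```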
The Y/U/V components in the captured frame are typically interleaved, usually in YUV 4:2:2 format. There are two ways to organize the data, but in the interest of simplicity, assume the data is organized in UYVY interleaved 4:2:2 format (Fig. 1).
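As a rough illustration of the UYVY layout, the sketch below separates an interleaved frame into its three components. It assumes the usual UYVY byte order (U0 Y0 V0 Y1 for each pair of pixels) and an even frame width. The function name split_uyvy is hypothetical, and on a real system this work would normally be handed to DMA or dedicated hardware rather than done in software.

```python
def split_uyvy(frame: bytes, width: int, height: int):
    """Split a UYVY 4:2:2 interleaved frame into Y, U and V byte strings.

    Assumes every 4-byte group holds U0 Y0 V0 Y1, i.e. two pixels
    that share one U sample and one V sample.
    """
    if width % 2:
        raise ValueError("width must be even for 4:2:2 data")
    if len(frame) != width * height * 2:
        raise ValueError("frame size does not match width x height")
    u = frame[0::4]  # every 4th byte, starting at byte 0
    y = frame[1::2]  # every 2nd byte, starting at byte 1
    v = frame[2::4]  # every 4th byte, starting at byte 2
    return y, u, v
```

For an NTSC SD frame, the returned Y data holds 720 X 480 bytes, and the U and V data hold 360 X 480 bytes each.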
As previously mentioned, most encoders require input video to be in YUV 4:2:0 format. There are two distinctions between 4:2:2 interleaved data and 4:2:0 planar data.
In 4:2:0 format, chroma information is further down-sampled vertically by half as well. That is, for each NTSC SD frame, the U and V components each contain 360 X 240 bytes instead of 360 X 480 bytes. Each NTSC SD frame in 4:2:0 format is 518,400 bytes [(720 X 480) + (360 X 240 X 2)]. The additional chroma down-sampling is required to balance real-time performance with acceptable picture quality.
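One simple way to picture the vertical chroma down-sampling is to average each pair of adjacent chroma rows. The sketch below does this for a single U or V plane. Averaging is an assumption made here for illustration, since real converters may drop rows or apply a longer filter, and halve_chroma_rows is a hypothetical name. With interlaced input, adjacent rows come from different fields, so this naive averaging blends the two fields.

```python
def halve_chroma_rows(plane: bytes, width: int, height: int) -> bytes:
    """Halve the height of one chroma plane by averaging row pairs.

    Averaging with rounding is an illustrative choice, not the only
    way to down-sample.
    """
    if height % 2 or len(plane) != width * height:
        raise ValueError("plane must be width x height with even height")
    out = bytearray()
    for row in range(0, height, 2):
        top = plane[row * width:(row + 1) * width]
        bottom = plane[(row + 1) * width:(row + 2) * width]
        # Rounded average of vertically adjacent samples
        out.extend((a + b + 1) // 2 for a, b in zip(top, bottom))
    return bytes(out)
```

Applied to a 360 X 480 chroma plane, this yields the 360 X 240 plane that 4:2:0 calls for.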
Efficient implementations of video-compression standards also often require that luma and chroma components are separated in memory, because the encoding algorithms may process them in different ways. Figure 2 shows the NTSC SD video frame in 4:2:0 planar format.
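To make the planar arrangement concrete, the sketch below computes where each plane would start in a single contiguous buffer. It assumes the planes are stored in Y, U, V order (the arrangement commonly called I420). Other plane orders exist, and planar_420_layout is a hypothetical helper name.

```python
def planar_420_layout(width: int, height: int) -> dict:
    """Return (offset, size) in bytes for each plane of a 4:2:0 frame.

    Assumes 8-bit samples and Y, U, V plane order in one buffer.
    """
    y_size = width * height
    c_size = (width // 2) * (height // 2)  # chroma halved both ways
    return {
        "y": (0, y_size),
        "u": (y_size, c_size),
        "v": (y_size + c_size, c_size),
    }


layout = planar_420_layout(720, 480)
print(layout["u"])  # (345600, 86400)
```

The V plane's offset plus its size gives the 518,400-byte total for an NTSC SD frame.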
INTERLACED ARTIFACTS
Interlace scanning involves scanning a picture twice, with one scan capturing every even line and one scan capturing every odd line. The two captures are separated by a small difference in time and then merged together to form a complete frame.
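The relationship between the two fields and the merged frame can be sketched as follows. The helper names split_fields and weave_fields are hypothetical, rows are numbered from zero, and the code operates on a single 8-bit plane. Which field is captured first in time depends on the video source and isn't modeled here.

```python
def split_fields(frame: bytes, width: int, height: int):
    """Separate one plane of a frame into its even-row and odd-row fields."""
    if len(frame) != width * height:
        raise ValueError("frame size does not match width x height")
    rows = [frame[r * width:(r + 1) * width] for r in range(height)]
    even = b"".join(rows[0::2])  # rows 0, 2, 4, ...
    odd = b"".join(rows[1::2])   # rows 1, 3, 5, ...
    return even, odd


def weave_fields(even: bytes, odd: bytes, width: int) -> bytes:
    """Merge two fields back into a frame by alternating their rows."""
    if len(even) != len(odd) or len(even) % width:
        raise ValueError("fields must be equal-sized whole rows")
    frame = bytearray()
    for start in range(0, len(even), width):
        frame += even[start:start + width]
        frame += odd[start:start + width]
    return bytes(frame)
```

Splitting a frame and weaving the fields back together reproduces the original frame exactly, which is why a woven frame carries two different moments in time whenever the scene moves between captures.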