As handheld digital assistants and Internet appliances are asked to handle ever more complex tasks, designers face critical tradeoffs among cost, performance, power, and size. Additionally, displays on such devices are typically limited to simple 2D graphics and alphanumeric data, although some now have color and limited video display capability. All the desired functionality can be achieved with off-the-shelf microprocessors, memories, and peripheral support chips. But such circuits don't solve the other tradeoff issues. Standard circuits usually lack the high level of integration and the low power consumption needed to make a handheld device practical.
PDAs and Internet appliances are also doing more than allowing users to access data or address books. Some of the largest growth areas now include visual communications and entertainment. For example, the recently released Kodak MC3 combines the features of a digital still camera, a digital camcorder, and an MP3 player, qualifying it as a portable multimedia device. Likewise, the Cybiko handheld device combines a game system similar to the Nintendo Game Boy, a PDA data organizer, and two-way paging, making it a wireless "intertainment" (Internet-based entertainment) system.
For such devices, higher-performance graphics and greater video support must be embedded to handle graphics and video data streams. At the same time, the higher-performance engines can't consume more power. On the contrary, they must consume even less power than previous alternatives. For low operating power levels, designers can incorporate gated clocking to turn off blocks not in use for a particular operation, and software-controlled scalable clocks to permit on-the-fly adjustments to system operating speed.
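The effect of gated and scalable clocks can be sketched with the usual first-order model of dynamic power, in which each clocked block dissipates roughly C x V^2 x f. The block names, capacitances, supply voltage, and clock rates below are hypothetical values chosen only for illustration, and the model ignores leakage and switching-activity factors.

```python
# Hypothetical per-block switched capacitance (farads) and supply voltage.
# None of these figures come from NeoMagic.
BLOCKS = {
    "cpu": {"cap_f": 2.0e-9, "volts": 1.8},
    "graphics": {"cap_f": 3.0e-9, "volts": 1.8},
    "video": {"cap_f": 1.5e-9, "volts": 1.8},
}


def dynamic_power(blocks, clock_hz, active):
    """First-order dynamic power in watts: sum of C * V^2 * f over ungated blocks."""
    return sum(
        b["cap_f"] * b["volts"] ** 2 * clock_hz
        for name, b in blocks.items()
        if name in active  # a gated block receives no clock, so it adds nothing
    )


full = dynamic_power(BLOCKS, 100e6, {"cpu", "graphics", "video"})
gated = dynamic_power(BLOCKS, 100e6, {"cpu", "graphics"})  # video block gated off
scaled = dynamic_power(BLOCKS, 50e6, {"cpu", "graphics"})  # clock also halved
print(f"all on: {full:.3f} W, video gated: {gated:.3f} W, half speed: {scaled:.3f} W")
```

In this simplified model, gating removes a block's contribution entirely, while scaling the clock reduces the power of every remaining block in proportion to frequency.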
NeoMagic's designers have noticed this trend and the accompanying demanding system requirements. By leveraging their expertise in combining DRAM and graphics/video logic with analog circuitry, they developed the Mobile Internet Magic (MiMagic) family of single-chip solutions for handheld Internet appliances. Initially, these system-on-a-chip (SoC) solutions will come in two versions: the NMS7040 for high-performance 2D graphics, and the NMS7041 for eye-popping 3D graphics. Both will include a 128-bit bit-block-transfer (bitBLT) graphics engine and 4 Mbytes of embedded DRAM.
The company also plans to expand the product family with versions that pack larger amounts of DRAM and/or a highly parallel compute block, the associative processor array (APA). The compute array will permit the parallel processing of thousands of pixels, greatly accelerating imaging and data computations. The APA block comprises thousands of compute elements, each containing storage and processing functions. The processing elements take advantage of content-addressable memory to access like data very quickly. This accelerates time-critical operations such as table lookups, address matching, and pattern matching, all of which are key operations in networking and multimedia applications.
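The idea behind a content-addressable lookup can be illustrated in software. The sketch below is not a model of the APA itself: the function name, the 8-bit entries, and the mask parameter are assumptions made for illustration. A real content-addressable memory compares the key against every stored entry at once, whereas this loop visits the entries one at a time.

```python
def cam_match(entries, key, mask=0xFF):
    """Return the index of every entry equal to key on the bits selected by mask.

    Hardware performs all of these comparisons simultaneously; the list
    comprehension here only simulates that behavior sequentially.
    """
    return [i for i, entry in enumerate(entries) if (entry ^ key) & mask == 0]


table = [0x12, 0x34, 0x12, 0x1F]

# Exact match on all 8 bits: find every position holding 0x12.
print(cam_match(table, 0x12))

# Masked match on the upper four bits only: find every entry of the form 0x1_.
print(cam_match(table, 0x10, mask=0xF0))
```

Because a lookup returns positions by content rather than by address, operations such as table lookup and pattern matching reduce to a single compare step in hardware.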
Such a computational array promises to greatly accelerate the playback of media streams and handle various image-processing functions with ease. For example, when clocked at 66 MHz, the APA could perform 21.6 G 8-bit additions/s, or 2.6 G 3-by-3 low-pass filter operations/s, or 135.2 million 8-by-8 discrete cosine transforms/s. Such performance levels are four to ten times higher than those of some of the best standard DSP chips available.
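To put the quoted rates in per-clock terms, each figure can be divided by the 66-MHz clock. The short sketch below does only that arithmetic. The per-cycle numbers it prints are derived from the quoted rates, not published by NeoMagic, and the helper name is an assumption.

```python
CLOCK_HZ = 66e6  # clock rate quoted for the APA

# Throughput figures quoted above, in operations per second.
QUOTED_RATES = {
    "8-bit additions": 21.6e9,
    "3-by-3 low-pass filter operations": 2.6e9,
    "8-by-8 discrete cosine transforms": 135.2e6,
}


def ops_per_cycle(ops_per_second, clock_hz):
    """Average number of operations completed per clock cycle."""
    return ops_per_second / clock_hz


for name, rate in QUOTED_RATES.items():
    print(f"{name}: about {ops_per_cycle(rate, CLOCK_HZ):.1f} per clock")
```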
The First Solutions
Of course, providing a system solution entails a lot more than just combining logic and memory on a chip. Designers at NeoMagic first examined typical system architectures to determine the essential functions and performance requirements that a single-chip solution must deliver.
In a typical Internet appliance, that means starting with a moderate- to high-performance CPU, a high-performance graphics/media engine with integrated graphics memory, an LCD screen controller (very often with touch-screen/pen-input capability), an interface to external SDRAM, flash, SRAM, or ROM (up to 512 Mbytes), and support for audio and video I/O. The chip should also include interface support for USB, Bluetooth, game controls, infrared, and other I/O devices.
To control these features, NeoMagic built a memory-mapped architecture centered on a 32-bit MIPS 4Kc RISC CPU core. The core contains 16-kbyte data and instruction caches, plus a single-cycle 32- by 16-bit multiplier-accumulator. Next, the designers integrated the high-performance 2D or 3D graphics engine and 4 Mbytes of embedded DRAM to support the graphics and application software (Fig. 1). Without the multiwatt power consumption, these highly integrated PDA engines deliver graphics performance comparable to that of today's mainstream desktop computers (Fig. 2).
This combination of high performance and low power comes from NeoMagic's ability to cointegrate the DRAM and the graphics logic. Transferring data over a 256-bit wide bus that links the embedded DRAM to the graphics engine achieves extremely high performance. The bus permits peak data transfer rates of 3.2 Gbytes/s without the typical high power consumption encountered when implementing the graphics subsystem with discrete SDRAMs and a graphics controller chip.
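The peak-rate figure can be checked with simple arithmetic. The sketch below converts the 256-bit bus width to bytes and divides the quoted 3.2 Gbytes/s by it. The resulting transfer rate is implied by the two quoted numbers, not stated by NeoMagic, and the calculation assumes one full-width transfer per cycle with 1 Gbyte taken as 10^9 bytes.

```python
BUS_WIDTH_BITS = 256        # embedded-DRAM bus width quoted above
PEAK_BYTES_PER_S = 3.2e9    # quoted peak rate, taking 1 Gbyte as 1e9 bytes


def bytes_per_transfer(width_bits):
    """Bytes moved by one full-width bus transfer."""
    return width_bits // 8


def implied_transfer_rate(peak_bytes_per_s, width_bits):
    """Full-width transfers per second needed to reach the peak rate."""
    return peak_bytes_per_s / bytes_per_transfer(width_bits)


rate = implied_transfer_rate(PEAK_BYTES_PER_S, BUS_WIDTH_BITS)
print(f"{bytes_per_transfer(BUS_WIDTH_BITS)} bytes per transfer, "
      f"{rate / 1e6:.0f} million transfers/s")
```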
Additionally, a DMA controller helps keep data transfer speeds at maximum by handling transfers from external memory devices to the embedded DRAM. The controller manages 3D texture transfers from external memory to the internal 3D texture FIFO and moves 3D vertex data from the embedded DRAM to the 3D vertex FIFO.
The 2D graphics engine supports 128-bit bitBLT operations, color expansion, x-y coordinate addressing, rectangle clipping, patterning, and integrated raster operations. Also, the engine contains a hardware cursor and hardware icon memory. When running at its peak throughput, the engine delivers 800 million 16-bit pixels/s.
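For readers unfamiliar with the operations listed, a bitBLT with rectangle clipping and a raster operation can be sketched in a few lines of software. This is a generic illustration, not NeoMagic's hardware design: the function name, the clip-rectangle convention, and the small set of raster operations are all assumptions. The engine performs the same kind of work on 128 bits at a time.

```python
import operator

# A few two-operand raster operations: (source, destination) -> result.
RASTER_OPS = {
    "copy": lambda src, dst: src,
    "and": operator.and_,
    "or": operator.or_,
    "xor": operator.xor,
}


def bitblt(dst, src, dst_x, dst_y, clip, rop="copy"):
    """Combine src into dst at (dst_x, dst_y), limited to the clip rectangle.

    dst and src are lists of rows of integer pixel values.
    clip is (x0, y0, x1, y1) in dst coordinates, with x1 and y1 exclusive.
    """
    op = RASTER_OPS[rop]
    cx0, cy0, cx1, cy1 = clip
    # Intersect the clip rectangle with the destination bounds.
    cx0, cy0 = max(cx0, 0), max(cy0, 0)
    cx1, cy1 = min(cx1, len(dst[0])), min(cy1, len(dst))
    for sy, row in enumerate(src):
        y = dst_y + sy
        if not cy0 <= y < cy1:
            continue  # this source row falls outside the clip rectangle
        for sx, pixel in enumerate(row):
            x = dst_x + sx
            if cx0 <= x < cx1:
                dst[y][x] = op(pixel, dst[y][x])
    return dst


frame = [[0] * 4 for _ in range(3)]      # 4-by-3 frame of 16-bit pixels
sprite = [[0xFFFF, 0xFFFF], [0xFFFF, 0xFFFF]]
bitblt(frame, sprite, 1, 1, clip=(0, 0, 2, 3), rop="xor")
for row in frame:
    print(row)
```

In the usage lines, the clip rectangle covers only the two leftmost columns, so the right-hand column of the sprite is discarded and only two pixels of the frame change.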