[Leapfrog: First Look]
Architecture Maps DSP Flow To Parallel Processing Platform
This divide-and-conquer parallel processing approach whips up a Storm.
STREAMLINED PROGRAMMING Developers program the Storm-1 lanes using C. The original work was done using C++, but it was discarded in favor of C, which provided a more elegant and efficient solution because it matched the way many stream processing applications were designed.
One set of C functions, called kernel functions, runs on the lanes. These functions are used as necessary and process data in parallel in each lane regardless of how many lanes are involved. Limits are based on the physical number of lanes and the data loaded into the lanes.
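As a rough illustration of this model, the following plain C sketch simulates lanes with ordinary arrays. It is not the Storm-1 kernel syntax. The names `scale_kernel`, `run_scale`, `MAX_LANES`, and `LANE_MEM` are hypothetical, and the round-robin striping of data across lanes is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LANES 16   /* assumed physical lane count (16-lane part) */
#define LANE_MEM  64   /* hypothetical per-lane buffer size, in elements */

/* Hypothetical kernel: it sees only one lane's local data. */
static void scale_kernel(int *local, size_t count, int gain)
{
    for (size_t i = 0; i < count; i++)
        local[i] *= gain;
}

/* Stripe n elements across the lanes, run the kernel on every lane that
   received data, and copy the results back in the original order.
   Returns the number of lanes that did any work. */
size_t run_scale(int *stream, size_t n, size_t lanes, int gain)
{
    assert(lanes >= 1 && lanes <= MAX_LANES);
    assert(n <= lanes * LANE_MEM);

    int    mem[MAX_LANES][LANE_MEM];
    size_t fill[MAX_LANES] = {0};

    /* Load: element i goes to lane i % lanes. */
    for (size_t i = 0; i < n; i++) {
        size_t lane = i % lanes;
        mem[lane][fill[lane]++] = stream[i];
    }

    /* Execute: on parallel hardware these iterations would run side by side. */
    size_t active = 0;
    for (size_t lane = 0; lane < lanes; lane++) {
        if (fill[lane] == 0)
            continue;                 /* no data loaded, so the lane idles */
        scale_kernel(mem[lane], fill[lane], gain);
        active++;
    }

    /* Store: gather the results back into stream order. */
    size_t taken[MAX_LANES] = {0};
    for (size_t i = 0; i < n; i++) {
        size_t lane = i % lanes;
        stream[i] = mem[lane][taken[lane]++];
    }
    return active;
}
```

In this sketch, calling `run_scale` on a five-element buffer with 16 lanes reports five active lanes. The work is bounded by the data loaded as well as by the lane count, and the kernel itself never refers to either.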
If only one lane is needed, only one will operate. The others can idle, conserving power. The eight-lane version consumes about half the power of the 16-lane version when all lanes are operational. Running fewer lanes at a higher speed is more efficient than running more lanes at a slower speed.
Kernel functions operate only on local lane data. They're used after the stream data has been moved into the lane's memory. A single kernel function is applied to all lanes at a time, although its execution can be conditional on a per-lane basis. Libraries of kernel functions are available for common transformation and processing requirements.
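Per-lane conditional execution can be pictured with a small simulation in plain C. This is a sketch under stated assumptions, not the vendor's API. `lane_t`, `attenuate_kernel`, `apply_attenuate`, `LANES`, and `DEPTH` are hypothetical names, and the sequential loop stands in for lanes that would run in parallel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LANES 8   /* assumed lane count for this sketch */
#define DEPTH 4   /* hypothetical number of elements held per lane */

typedef struct {
    int data[DEPTH];   /* lane-local memory, loaded before the kernel runs */
} lane_t;

/* Hypothetical kernel: halve every sample, but only in a lane whose
   local peak exceeds `limit`. The decision uses local data only. */
static bool attenuate_kernel(lane_t *lane, int limit)
{
    int peak = 0;
    for (size_t i = 0; i < DEPTH; i++)
        if (lane->data[i] > peak)
            peak = lane->data[i];

    if (peak <= limit)
        return false;      /* condition false: this lane skips the body */

    for (size_t i = 0; i < DEPTH; i++)
        lane->data[i] /= 2;
    return true;
}

/* Apply the one kernel to every lane. Returns a bitmask with bit l set
   when lane l executed the conditional body. */
unsigned apply_attenuate(lane_t lanes[LANES], int limit)
{
    unsigned mask = 0;
    for (size_t l = 0; l < LANES; l++)   /* parallel on real hardware */
        if (attenuate_kernel(&lanes[l], limit))
            mask |= 1u << l;
    return mask;
}
```

Every lane runs the same function, but each one decides from its own data whether to do the work. The returned mask is only a convenience for inspecting which lanes were affected.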
Kernel functions don't depend on the number of lanes involved, so the architecture can be scaled up and down. This may lead to additional chips in the family or architectures that use multiple chips. In this case, the code to handle the lanes will be replicated but remain the same from chip to chip. Communication between lanes in different chips will be significantly more expensive, but this won't affect many applications.
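The lane-count independence described above can be sketched in plain C. The example assumes an element-wise kernel with no communication between lanes. `square_kernel` and `run_square` are hypothetical names, and contiguous slices of one array stand in for each lane's local memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel: squares one lane's slice of the data.
   It is never told how many lanes exist. */
static void square_kernel(long *local, size_t count)
{
    for (size_t i = 0; i < count; i++)
        local[i] *= local[i];
}

/* Split n elements into contiguous per-lane slices and run the kernel
   on each slice. Only `lanes` changes between differently sized parts. */
void run_square(long *stream, size_t n, size_t lanes)
{
    assert(lanes > 0);
    size_t chunk = (n + lanes - 1) / lanes;   /* ceiling division */

    for (size_t lane = 0; lane < lanes; lane++) {   /* parallel on hardware */
        size_t start = lane * chunk;
        if (start >= n)
            break;                                  /* remaining lanes idle */
        size_t count = (n - start < chunk) ? n - start : chunk;
        square_kernel(stream + start, count);
    }
}
```

Running `run_square` on the same input with 8 lanes and then with 16 lanes yields identical output in this sketch. Only the dispatch code knows the lane count, which is the property that would let the same kernel move between chips of different sizes.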
The RapiDev Development Environment supports Storm-1. It includes the SPC compiler for Linux and Windows hosts and the cycle-accurate Target Code Simulator (TCS), which includes MIPSsim for the control processors. The Eclipse IDE ties everything together, including the simulator and VLIW profiler support.
Image processing, DSP, and general math libraries are included. The MIPS processors run Linux and can be programmed using any conventional set of programming tools. Libraries are provided for managing and load-balancing the memory, streams, and lanes.
Available individually, the SP16-G160 costs $99, and the SP8-G80 costs $59. A PCI board is available with the 16-lane version. The board has a Gigabit Ethernet interface, analog audio in/out, 512 Mbytes of SDRAM, and 32 Mbytes of flash. It can operate in standalone mode or be controlled by a host processor.
The Storm-1 architecture is just one of many parallel processing architectures. Others, such as IBM's Cell processor or even symmetric multiprocessing (SMP) systems, will remain important in their niches, using different parallel programming tools and techniques.