STREAMLINED PROGRAMMING
Developers program the Storm-1 lanes
using C. The original work was done using
C++, but it was discarded in favor of C,
which provided a more elegant and efficient solution because it matched the way
many stream-processing applications
were designed.
One set of C functions, called kernel
functions, runs on the lanes. These functions are invoked as needed and process
data in parallel in each lane, regardless of
how many lanes are involved. The limits
are set by the physical number of lanes
and the data loaded into the lanes.
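As a rough illustration of this lane-count independence, the plain C sketch below applies one kernel to a stream split across any number of lanes. It does not use the SPC compiler's actual kernel syntax. The names scale_kernel and run_on_lanes are hypothetical, and the sequential loop merely stands in for lanes that would run concurrently on the hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel: scales one lane's local data in place.
   It sees only its own slice and never the lane count. */
static void scale_kernel(int *lane_data, size_t count, int gain)
{
    for (size_t i = 0; i < count; ++i)
        lane_data[i] *= gain;
}

/* Hypothetical driver: splits a stream into strips, one per lane,
   and applies the same kernel to every strip. The loop is sequential
   here; it only models lanes that would operate in parallel. */
static void run_on_lanes(int *stream, size_t length, size_t num_lanes, int gain)
{
    assert(num_lanes > 0);

    /* Ceiling division: elements assigned to each lane. */
    size_t per_lane = (length + num_lanes - 1) / num_lanes;

    for (size_t lane = 0; lane < num_lanes; ++lane) {
        size_t start = lane * per_lane;
        if (start >= length)
            break; /* remaining lanes have no data and stay idle */

        size_t left = length - start;
        size_t count = (left < per_lane) ? left : per_lane;
        scale_kernel(stream + start, count, gain);
    }
}
```

Calling run_on_lanes with 8 or with 16 lanes yields the same result for the same stream. Only the way the data is divided changes, and the kernel itself is untouched.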
If only one lane is needed, only one will
operate. The others can idle, conserving
power. The eight-lane version consumes
about half the power of the 16-lane version when all lanes are operational. Running fewer lanes at a higher speed is more
efficient than running more lanes at a
slower speed.
Kernel functions operate only on local
lane data. They run after the stream
data has been moved into the lane's memory. A single kernel function is applied to
all lanes at any one time, though kernel-function
execution can be conditional on a per-lane
basis. Libraries of kernel functions are
available for common transformation and
processing requirements.
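The idea of one kernel applied to every lane, with execution made conditional per lane, can be sketched in ordinary C as follows. This is only a model under stated assumptions: the lane count and local-memory size are illustrative rather than Storm-1 values, and negate_kernel and apply_predicated are hypothetical names, not part of any Storm-1 library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative sizes only, not Storm-1 values. */
enum { NUM_LANES = 4, LANE_WORDS = 3 };

/* Hypothetical kernel body: touches only one lane's local memory. */
static void negate_kernel(int local[LANE_WORDS])
{
    for (size_t i = 0; i < LANE_WORDS; ++i)
        local[i] = -local[i];
}

/* Applies the one kernel to all lanes in a single step. A lane whose
   predicate is false keeps its data unchanged, which models per-lane
   conditional execution. */
static void apply_predicated(int lanes[NUM_LANES][LANE_WORDS],
                             const bool enabled[NUM_LANES])
{
    assert(lanes != NULL && enabled != NULL);

    for (size_t lane = 0; lane < NUM_LANES; ++lane) {
        if (enabled[lane])
            negate_kernel(lanes[lane]);
    }
}
```

Every lane is offered the same kernel, and the predicate array decides which lanes actually change their local data.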
Kernel functions don't depend on the
number of lanes involved, so the architecture can be scaled up and down. This may
lead to additional chips in the family or
architectures that use multiple chips. In
the multichip case, the code that handles the lanes
will be replicated but remain the same
from chip to chip. Communication
between lanes in different chips will be significantly more expensive, but this
won't affect many applications.
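To see which lane-to-lane exchanges would pay the higher inter-chip cost in such a hypothetical multichip system, one could number the lanes globally and map each number to a chip and a local lane. The sketch below assumes a simple contiguous numbering. The lane_addr type and both function names are invented for illustration and describe no announced product.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical address of a lane in a multichip system. */
typedef struct {
    size_t chip; /* which chip the lane lives on */
    size_t lane; /* lane index within that chip */
} lane_addr;

/* Maps a global lane number to its chip and local lane,
   assuming lanes are numbered contiguously chip by chip. */
static lane_addr locate_lane(size_t global_lane, size_t lanes_per_chip)
{
    assert(lanes_per_chip > 0);

    lane_addr addr = { global_lane / lanes_per_chip,
                       global_lane % lanes_per_chip };
    return addr;
}

/* True when an exchange between two lanes would leave the chip
   and therefore incur the more expensive communication. */
static bool crosses_chips(size_t from_lane, size_t to_lane,
                          size_t lanes_per_chip)
{
    return locate_lane(from_lane, lanes_per_chip).chip
        != locate_lane(to_lane, lanes_per_chip).chip;
}
```

With 16 lanes per chip, an exchange between global lanes 3 and 4 stays on one chip, while an exchange between lanes 15 and 16 crosses a chip boundary. Applications whose kernels rarely exchange data between lanes would seldom hit the expensive case.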
The RapiDev Development Environment supports Storm-1. It includes the
SPC compiler for Linux and Windows
hosts and the cycle-accurate Target Code
Simulator (TCS), which includes MIPSsim
for the control processors. The Eclipse IDE
ties everything together, including the simulator and VLIW profiler support.
Image processing, DSP, and general
math libraries are included. The MIPS
processors run Linux and can be programmed using any conventional set of
programming tools. Libraries are provided
for managing and load-balancing the
memory, streams, and lanes.
Available individually, the SP16-G160
costs $99, and the SP8-G80 costs $59. A
PCI board carrying the 16-lane version is also available. The board has a Gigabit Ethernet
interface, analog audio in/out, 512
Mbytes of SDRAM, and 32 Mbytes of
flash. It can operate in standalone mode
or be controlled by a host processor.
The Storm-1 architecture is just one of
many parallel architectures. Others, such as IBM's Cell
processor or even symmetrical multiprocessing (SMP) systems, will remain important in their niches, using different parallel-programming
tools and techniques.
Stream Processors
www.streamprocessors.com
Storm-1
Versions: eight-lane SP8-G80 and 16-lane SP16-G160
Speed: 500 MHz
Memory: 128-bit DDR2
Stream I/O pins: 72 or 108 programmable pins, 165 MHz
Peripherals: 1-Gbit Ethernet, serial, 32-bit, 66-MHz PCI
Package: 31- by 31-mm 896-pin plastic ball-grid array (PBGA)