The interconnect should minimize the funneling effect of going from
the neighborhood connections to the inter-board connections. This
implies choosing a high bandwidth per connection, and using very high-speed
FIFOs to decouple the two bus speeds. A DMA engine at the interface
to the interconnect keeps a fast processor from stalling while an
interconnect transfer completes.
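As a rough illustration of how deep such a decoupling FIFO must be, the sketch below applies the usual back-of-the-envelope rule: while a burst is written at the fast rate, the slow side drains part of it, and the FIFO has to hold the remainder. The 667- and 320-Mbyte/s rates are the figures quoted later in this article. The function name and the 64-word burst length are assumptions made only for illustration, and the model assumes the slow side drains continuously during the burst.

```python
import math

def min_fifo_depth(burst_words, write_rate, read_rate):
    """Words a FIFO must hold to absorb one burst (hypothetical helper).

    write_rate and read_rate may be in any unit, as long as both use the same one.
    """
    if read_rate >= write_rate:
        # The reader keeps up with the writer, so no backlog accumulates.
        return 0
    # Fraction of the burst that is still undrained when the writer finishes.
    backlog_fraction = 1 - read_rate / write_rate
    return math.ceil(burst_words * backlog_fraction)

# Assumed 64-word burst from the fast local bus into the slower interconnect.
depth = min_fifo_depth(burst_words=64, write_rate=667, read_rate=320)
print(depth)
```

A deeper FIFO, or a DMA engine that paces the transfer, is needed if bursts arrive back to back, because this estimate covers a single isolated burst.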
Real-World Example
How does all this come together in the real world? Let's examine
the high-performance memory architecture for DSP applications employed
in the Excalibur PowerPC daughtercard from SKY Computers. On the daughtercard,
each of four PowerPC processors is connected to its own local SDRAM
by an 83.3-MHz interconnect (Fig. 2, again). The memory
controllers also are connected to each other at the same 83.3-MHz
rate, so any processor can access any memory on the daughtercard at
the same raw bandwidth.
PowerPC processors were chosen primarily for the sustained throughput
of their external interface, including the raw bandwidth and the pipelining
capabilities. Different PowerPC processors offer different pipelining
capabilities and different maximum CPU core frequencies. The Excalibur
card can be built with different processors to take advantage of those
combinations as they change over time.
The current top performer is the 333-MHz PowerPC 604e. It allows
three memory accesses to be pipelined, enabling single-cycle throughput
for subsequent loads from the same cache line. The 604e also has a
feature called "streaming," which allows subsequent cache line loads
to occur without a gap. The result is a sustained memory access pattern
for data reads of "... 1111 1111 1111 ...", that is, one data transfer on
every bus clock. The sustained performance in this case therefore equals
the theoretical peak performance.
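The link between an access pattern and sustained bandwidth can be sketched in a few lines. Each digit in the pattern is taken as the number of bus clocks spent on one data transfer. The 8-byte (64-bit) data bus width is an assumption not stated in this section, the function name is hypothetical, and the "3111" pattern is an invented contrast case, not a measured one.

```python
def sustained_mbytes_per_s(pattern, bus_mhz=83.3, bus_bytes=8):
    """Sustained bandwidth for a repeating burst pattern such as "1111".

    pattern   -- one digit per data transfer: bus clocks that transfer takes
    bus_mhz   -- bus clock in MHz (83.3 MHz is the rate given in the article)
    bus_bytes -- bytes moved per transfer (8 is an assumed 64-bit data bus)
    """
    clocks_per_beat = [int(ch) for ch in pattern if ch.isdigit()]
    total_clocks = sum(clocks_per_beat)
    total_bytes = len(clocks_per_beat) * bus_bytes
    # bytes per clock times millions of clocks per second gives Mbytes/s
    return total_bytes / total_clocks * bus_mhz

peak = sustained_mbytes_per_s("1111")    # no gaps: about 666 Mbytes/s
gapped = sustained_mbytes_per_s("3111")  # two extra clocks on the first beat
print(round(peak), round(gapped))
```

With no idle clocks the result matches the raw bus rate, in line with the roughly 667-Mbyte/s figure quoted in the article. Two wait clocks per four-beat burst would cut the sustained rate to two-thirds of peak.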
The memory controllers on Excalibur also can access the interface
to the SKYchannel interconnect. The ANSI-standard SKYchannel Packet
Bus (ANSI/VITA 10-1995) provides a 320-Mbyte/s connection to all the
other daughtercards in the system. Multichassis systems as large as
4096 units are possible using SKYchannel (Fig. 3). Achieving
the highest performance for multiprocessor DSP applications thus requires
a global optimization of processors, memory technology, controller
design, and multiprocessor interconnect.
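To put the funneling effect in numbers, the sketch below compares aggregate on-card memory bandwidth with the off-card link. The 667-Mbyte/s per-processor rate and the 320-Mbyte/s SKYchannel rate are the article's figures. The worst-case premise that all four processors stream off-card at once and share the link evenly is an assumption, as are the function and variable names.

```python
def funnel_ratio(nodes, local_mbytes_s, link_mbytes_s):
    """Ratio of aggregate local bandwidth to the shared link bandwidth."""
    return nodes * local_mbytes_s / link_mbytes_s

PROCESSORS = 4     # processors per daughtercard (from the article)
LOCAL_RATE = 667   # Mbytes/s sustained to SDRAM (from the article)
LINK_RATE = 320    # Mbytes/s SKYchannel connection (from the article)

ratio = funnel_ratio(PROCESSORS, LOCAL_RATE, LINK_RATE)
per_processor_share = LINK_RATE / PROCESSORS  # assumed even split of the link
print(f"funnel ratio {ratio:.2f}:1, {per_processor_share:.0f} Mbytes/s each")
```

Under these assumptions the on-card memory system can source data roughly eight times faster than the link can carry it, which is why keeping working data in local SDRAM and moving it off-card by DMA matters.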
When SDRAMs are combined with processors that support pipelining,
they can sustain memory accesses at 667 Mbytes/s while also offering
hundreds of megabytes of storage capacity. This combination
of speed and capacity is critical for DSP applications with large
data sets.