Optimize Memory Subsystem For Top Performance

A Better Understanding Of Memory Accesses Allows DSP Memory Subsystems To Be Better Matched To The DSP Chips.

Date Posted: May 25, 1998 12:00 AM

The interconnect should minimize the funneling effect of going from the neighborhood connections to the inter-board connections. This implies choosing a high bandwidth per connection, and using very high-speed FIFOs to decouple the two bus speeds. A DMA engine at the interface to the interconnect will prevent a fast processor from waiting for the interconnect transfer to complete.
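The decoupling role of the FIFO can be illustrated with a toy cycle-by-cycle model. This is only a sketch under stated assumptions, not a description of any actual hardware: the function name `stall_cycles`, the one-word-per-cycle producer, and the fixed drain period are all hypothetical choices made for illustration.

```python
from collections import deque

def stall_cycles(burst_words: int, fifo_depth: int, drain_period: int) -> int:
    """Count cycles a fast producer stalls while pushing a burst into a FIFO.

    Toy model (assumed, not from the article): the producer tries to write
    one word per cycle, and the slower interconnect side removes one word
    every `drain_period` cycles.
    """
    fifo = deque()
    written = 0
    stalls = 0
    cycle = 0
    while written < burst_words:
        # Slow side: the interconnect drains one word periodically.
        if cycle % drain_period == 0 and fifo:
            fifo.popleft()
        # Fast side: write if there is room, otherwise stall this cycle.
        if len(fifo) < fifo_depth:
            fifo.append(written)
            written += 1
        else:
            stalls += 1
        cycle += 1
    return stalls

# Hypothetical figures: a 32-word burst, interconnect at half the local speed.
print("depth 32:", stall_cycles(32, fifo_depth=32, drain_period=2))
print("depth 1: ", stall_cycles(32, fifo_depth=1, drain_period=2))
```

In this model, a FIFO deep enough to hold the whole burst lets the fast side finish without stalling, while a one-word FIFO forces it down to the slower bus speed. A DMA engine plays the part of the drain side, so the processor is free once its words are in the FIFO.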

Real-World Example
How does all this come together in the real world? Let's examine the high-performance memory architecture for DSP applications employed in the Excalibur PowerPC daughtercard from SKY Computers. On the daughtercard, each of four PowerPC processors is connected to its own local SDRAM by an 83.3-MHz interconnect (Fig. 2, again). The memory controllers also are connected to each other at the same 83.3-MHz rate, so any processor can access any memory on the daughtercard at the same raw bandwidth.

PowerPC processors were chosen primarily for the sustained throughput of their external interface, which depends on both raw bandwidth and pipelining capability. Different PowerPC processors offer different pipelining capabilities and different maximum CPU core frequencies. The Excalibur card can be implemented with different processors to take advantage of those combinations as they change over time.

The current top performer is the 333-MHz PowerPC 604e. It allows three memory accesses to be pipelined, enabling single-cycle throughput for subsequent loads from the same cache line. The 604e also has a feature called "streaming," which allows subsequent cache line loads to occur without a gap. The result is a sustained memory access pattern for data reads of "... 1111 1111 1111 ...". In this case, therefore, the sustained performance is the same as the theoretical peak performance.
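The link between an access pattern and sustained performance can be checked with a few lines of arithmetic. The sketch below assumes each digit in a pattern is the number of bus cycles spent on one data beat. The helper name `sustained_fraction` and the gapped comparison pattern "3111" are hypothetical, introduced only for contrast.

```python
def sustained_fraction(pattern: str) -> float:
    """Fraction of peak bandwidth achieved by an access pattern.

    Each digit is taken as the bus cycles used by one data beat (an assumed
    reading of the notation); spaces and dots are ignored.
    """
    cycles = [int(ch) for ch in pattern if ch.isdigit()]
    if not cycles:
        raise ValueError("pattern contains no cycle counts")
    # Peak performance is one beat per cycle, so efficiency is beats / cycles.
    return len(cycles) / sum(cycles)

streaming = sustained_fraction("... 1111 1111 1111 ...")  # pattern from the text
gapped = sustained_fraction("3111 3111 3111")             # hypothetical pattern

print(f"streaming: {streaming:.0%} of peak")
print(f"gapped:    {gapped:.0%} of peak")
```

With every beat taking a single cycle, the fraction is 1.0, which is why sustained and peak performance coincide for the streaming case.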

The memory controllers on Excalibur also can access the interface to the SKYchannel interconnect. The ANSI-standard SKYchannel Packet Bus (ANSI/VITA 10-1995) provides a 320-Mbyte/s connection to all the other daughtercards in the system. Multichassis systems as large as 4096 units are possible using SKYchannel (Fig. 3). Achieving the highest performance for multiprocessor DSP applications thus requires a global optimization of processors, memory technology, controller design, and multiprocessor interconnect.

When SDRAMs are combined with processors that support pipelining, they can provide sustained memory accesses at 667 Mbytes/s, while providing hundreds of megabytes of storage capacity. This combination of speed and capacity is critical for DSP applications with large data sets.
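As a rough check on these figures, the following sketch derives the local bandwidth from the 83.3-MHz clock and compares the daughtercard's aggregate local bandwidth with the 320-Mbyte/s SKYchannel link. The 64-bit data path is an assumption not stated in the text, and the function and variable names are illustrative.

```python
def peak_bandwidth_mbytes(bus_mhz: float, bus_width_bits: int) -> float:
    """Peak rate in Mbytes/s when one bus-width word moves on every clock."""
    return bus_mhz * bus_width_bits / 8

# 83.3-MHz processor-to-SDRAM interconnect (from the text).
# The 64-bit data path is an assumption, not stated in the text.
local = peak_bandwidth_mbytes(83.3, 64)

PROCESSORS = 4            # processors per daughtercard (from the text)
SKYCHANNEL_MBYTES = 320   # off-board SKYchannel bandwidth (from the text)

# Funneling effect: aggregate local bandwidth versus the off-board link,
# assuming all four processors stream from local memory at once.
funnel_ratio = PROCESSORS * local / SKYCHANNEL_MBYTES

print(f"local: {local:.1f} Mbytes/s per processor")
print(f"aggregate local / off-board: {funnel_ratio:.1f}x")
```

Under these assumptions the local figure comes out at about 666 Mbytes/s, consistent with the quoted 667 Mbytes/s if 83.3 MHz is a rounded 83.33 MHz. The ratio of roughly eight to one shows why FIFOs and a DMA engine at the interconnect interface matter.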
