Implementing The Controller
To implement the memory controller, the design can be divided into several sections. Start with the clock-generation portion (Fig. 7). About a dozen lines of relatively straightforward VHDL describe the clock logic that phase-locks the internal FPGA clock to the system clock and generates a 2X system clock. Similarly, the QDR SRAM interface needs well under a dozen lines of code to implement the DDR interface (Fig. 8a). The system interface is a straightforward setup of the register read, write, and section operations (Fig. 8b).
When synthesized, the resulting logic diagram of the complete controller shows the memory interface with its three 18-bit buses and the host interface with its dual 36-bit data buses and 18-bit address buses (Fig. 9). The FPGA operates internally at 200 MHz. Externally, the buses need only operate at 100 MHz, because the DDR interface transfers data on both the leading and trailing edges. The 36-bit read data path from the host is internally split into two 18-bit sections and latched by separate registers. These registers are clocked at 200 MHz, which allows data to be sent or received on both edges of the external clock.
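The 36-bit to 18-bit split described above can be modeled in a few lines of Python. This is only a behavioral sketch, not the VHDL of Fig. 8a: the function names are hypothetical, and the choice to treat the low half as the first transfer is an assumption, since the text doesn't say which half goes out on which edge.

```python
MASK18 = (1 << 18) - 1  # 18 one-bits, the width of each QDR data bus


def split_word(word36):
    """Split a 36-bit host word into two 18-bit halves.

    Assumption: the low half is returned first; the article does not
    state which half is transferred on which clock edge.
    """
    if not 0 <= word36 < (1 << 36):
        raise ValueError("word must fit in 36 bits")
    low = word36 & MASK18
    high = (word36 >> 18) & MASK18
    return low, high


def join_word(low, high):
    """Reassemble the 36-bit word from its two 18-bit halves."""
    return ((high & MASK18) << 18) | (low & MASK18)


word = (0x2AAAA << 18) | 0x15555
low, high = split_word(word)
print(hex(low), hex(high))
```

Each half corresponds to one 18-bit transfer, so a full 36-bit host word moves in a single 100-MHz clock cycle.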
The four on-chip DLLs available on the Spartan-II FPGA family can deskew either the internal global clock network or clocks fed off-chip to other system components. The two DLLs shown permit the controller to achieve zero clock skew between the FPGA's on-chip clock and the QDR SRAM clock.
When working with double- and quadruple-data-rate memory devices, the availability of a double-frequency clock that's phase-locked to the system clock is important. Used in conjunction with the on-chip global clock-distribution network, high-speed synchronous I/O resources, and Select-I/O flexible signaling standards, it lets the FPGA-to-QDR SRAM interface achieve a data throughput of 7200 Mbits/s.
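The 7200-Mbit/s figure can be checked with simple arithmetic. The sketch below assumes the figure counts both 18-bit data ports (read and write) running concurrently, each transferring on both edges of the 100-MHz clock. That breakdown is an inference from the bus widths and clock rate given in the text, and the function name is hypothetical.

```python
def qdr_throughput_mbits(clock_mhz, bus_width_bits, edges_per_clock=2, ports=2):
    """Aggregate throughput in Mbits/s.

    Assumption: `ports` counts the separate read and write data buses,
    both active at once, each moving `bus_width_bits` per clock edge.
    """
    return clock_mhz * edges_per_clock * bus_width_bits * ports


print(qdr_throughput_mbits(100, 18))
```

Under these assumptions, each port contributes 3600 Mbits/s, and the two ports together give the quoted total.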
The most difficult task in this design is meeting the timing requirements of the QDR_2. All QDR_2 signals are registered in the I/O buffers and use HSTL buffers. For the write cycle, all signals must meet the memory's setup-and-hold-time requirements. That means dealing with the sum of the propagation delay from the Spartan FPGA (clock to output), the board-wiring delay, and the QDR memory setup time. Those delays must total less than the cycle time of the write operation:
FPGA Tco (2.5 ns) + board Tpd (0.6 ns) + QDR SRAM Tsu (0.8 ns) = 3.9 ns < write cycle time
The clock-to-out and QDR setup-time values are 2.5 and 0.8 ns, respectively. Consequently, there's a good margin for board delay. The QDR memory has a hold-time requirement of 0.5 ns.
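As a rough check of the write-path budget, the sum can be compared against an assumed cycle time. The delay values come from the text. The 5-ns budget (one period of the 200-MHz internal clock) is an assumption, because the article doesn't state the write cycle time explicitly, and the function name is hypothetical. Values are kept in picoseconds so the arithmetic stays exact.

```python
# Delays from the text, in picoseconds.
FPGA_TCO_PS = 2500   # Spartan-II clock-to-output
BOARD_TPD_PS = 600   # board-wiring delay
QDR_TSU_PS = 800     # QDR SRAM setup time

# Assumption: the budget is one 200-MHz period (5 ns); not stated in the text.
WRITE_BUDGET_PS = 5000


def write_path_margin_ps(tco=FPGA_TCO_PS, tpd=BOARD_TPD_PS,
                         tsu=QDR_TSU_PS, budget=WRITE_BUDGET_PS):
    """Return the slack left after Tco + Tpd + Tsu is taken from the budget."""
    return budget - (tco + tpd + tsu)


print(write_path_margin_ps())
```

A positive result means the write path closes timing under the assumed budget; a slower board route can be tried by passing a larger `tpd`.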
During the read cycle, data must meet the setup-and-hold time of the FPGA:
QDR SRAM Tco (2.5 ns) + board Tpd (0.6 ns) + Spartan-II Tsu (1.55 ns) = 4.65 ns
The setup-time requirement for the Spartan-II is 1.55 ns. Combined with the QDR SRAM's clock-to-out time of 2.5 ns, this leaves enough margin for operation at 100 MHz.
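The read-path numbers can also be turned around to ask how much board delay the design tolerates. The sketch below again assumes a 5-ns budget (one period of the 200-MHz internal clock), which the article doesn't state; the function name is hypothetical, and values are in picoseconds.

```python
# Delays from the text, in picoseconds.
QDR_TCO_PS = 2500    # QDR SRAM clock-to-output
FPGA_TSU_PS = 1550   # Spartan-II input setup time
BOARD_TPD_PS = 600   # board-wiring delay used in the article

# Assumption: the budget is one 200-MHz period (5 ns); not stated in the text.
READ_BUDGET_PS = 5000


def max_board_delay_ps(tco=QDR_TCO_PS, tsu=FPGA_TSU_PS, budget=READ_BUDGET_PS):
    """Largest board delay that still satisfies Tco + Tpd + Tsu <= budget."""
    return budget - tco - tsu


limit = max_board_delay_ps()
print(limit, limit - BOARD_TPD_PS)
```

Under this assumption, the read path is the tighter of the two directions, so board routing between the SRAM and the FPGA deserves the closer look during layout.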
Implementing the controller in the FPGA requires two DLLs, two global clock buffers, and 119 I/O buffers. The design can be verified with a back-annotated simulation at 100 MHz. A faster Spartan-II FPGA can improve the interface performance even further.