PROCESS AND CIRCUIT TECHNOLOGY
The semiconductor industry constantly battles the evolving challenges of small process dimensions through huge investments in equipment, process technologies, design tools, and circuit techniques. In particular, the challenge of increasing leakage power with small process geometries is felt across the industry. Thus, many well-known technologies at the 65-nm process node (and prior) are used to maintain or increase performance while managing leakage power:
- Copper routing
- Low-k dielectric
- Multi-threshold transistors
- Variable gate-length transistors
- Triple gate oxide
- Super-thin gate oxide
- Strained silicon
LOWEST POWER, HIGHEST PERFORMANCE
To attain high efficiency and performance, Stratix III FPGAs leverage an adaptive-logic-module (ALM) logic architecture and a MultiTrack interconnect fabric. This combination allows more logic to be packed with less routing.
ALM technology, which is said to implement 80% more logic functions than other architectures, includes an eight-input fracturable lookup table (LUT), two 2-bit adders, and two registers.
MultiTrack interconnect can be characterized by the number of "hops" required to get from one logic array block (LAB) to another. Each additional hop adds capacitance, so the fewer the hops, the less high-speed logic is required to meet performance. By providing one-hop connectivity between LABs, MultiTrack interconnect yields the lowest possible power (Fig. 7).
Hierarchical clocking is used in the Stratix III FPGAs to support up to 360 unique clocks. The propagation of every clock network can be controlled down to a LAB level. Logic with common clocks is grouped into LABs. Clocks are only propagated where the logic uses that clock. All other clock signals are shut down to minimize power consumption.
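The clock-gating idea described above can be illustrated with a toy model. This is only a sketch: the register names, clock names, and the two helper functions are hypothetical, and it does not represent how the device or its design software actually assigns clocks.

```python
def group_by_clock(registers):
    """Group registers into LAB-like buckets keyed by the clock they use.

    `registers` is a list of (register_name, clock_name) pairs (hypothetical).
    """
    labs = {}
    for name, clock in registers:
        labs.setdefault(clock, []).append(name)
    return labs


def clock_enables(labs, all_clocks):
    """For each bucket, enable only the clock its logic uses; gate all others."""
    return {
        lab_clock: {clock: (clock == lab_clock) for clock in all_clocks}
        for lab_clock in labs
    }


registers = [("r0", "clk_a"), ("r1", "clk_b"), ("r2", "clk_a")]
all_clocks = ["clk_a", "clk_b", "clk_c"]

labs = group_by_clock(registers)
enables = clock_enables(labs, all_clocks)

for lab_clock, table in enables.items():
    active = [clock for clock, on in table.items() if on]
    print(f"LAB bucket for {lab_clock}: propagates {active}, gates the rest")
```

In this model, clk_c reaches no bucket at all because no logic uses it, which mirrors the statement that unused clock signals are shut down.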
MEMORY INTERFACE POWER SAVINGS
Double-data-rate (DDR) memory interfaces are one of the most common I/O interfaces in designs today, and they can be fairly power-hungry. To combat those power issues, designers can turn to dynamic on-chip termination and DDR3.
When reading from and writing to external memory, it's vital to have impedance-matched termination, both series and parallel. If there's a 50-Ω transmission line, a write to memory needs an output buffer with a matched series impedance of 50 Ω. When receiving data from the memory, a 50-Ω parallel termination resistor pulled to a termination voltage is desired. This applies not only to DDR-type interfaces, but also to RLDRAM and QDR SRAM.
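Why matching matters can be shown with the standard transmission-line reflection coefficient, (Z_load - Z0) / (Z_load + Z0). The formula is textbook material rather than something stated in this article, and the function name and example impedances below are illustrative assumptions.

```python
def reflection_coefficient(z_load, z0=50.0):
    """Fraction of the incident voltage wave reflected at the termination.

    z_load: termination impedance in ohms (finite, positive).
    z0:     characteristic impedance of the line in ohms.
    """
    return (z_load - z0) / (z_load + z0)


# A matched 50-ohm termination reflects nothing; mismatches reflect part
# of the signal back down the line.
for z_load in (25.0, 50.0, 150.0):
    gamma = reflection_coefficient(z_load)
    print(f"Z_load = {z_load:5.1f} ohm -> reflection coefficient = {gamma:+.3f}")
```

Only the matched case gives a coefficient of zero, which is why both the series driver impedance and the parallel termination are set to the line impedance.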
With dynamic on-chip termination, the parallel termination resistor can be switched on or off (open circuit), depending on whether a read or a write is being executed. During a write, the FPGA output driver impedance must be matched to the transmission line. A parallel resistor to VTT, however, would waste energy and reduce signal swing, so it is turned off during writes (Fig. 8).
During a read, the parallel resistor is on to terminate the transmission line to reduce reflections that degrade signal integrity and the ability to reliably read data.
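The switching rule can be summarized in a few lines. This is a behavioral sketch only: the `BusState` enum and `parallel_termination_on` function are hypothetical names, not part of any vendor tool or IP interface.

```python
from enum import Enum


class BusState(Enum):
    WRITE = "write"  # FPGA drives the bus
    READ = "read"    # memory drives the bus
    IDLE = "idle"    # nobody drives the bus


def parallel_termination_on(state):
    """Parallel termination to VTT is needed only while the FPGA receives data."""
    return state is BusState.READ


for state in BusState:
    setting = "on" if parallel_termination_on(state) else "off (open circuit)"
    print(f"{state.value:5s}: parallel termination {setting}")
```

The resistor draws static current only in the read state, so the power saved scales with the fraction of time the bus spends writing or idle.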
The significant benefits of dynamic on-chip termination are realized whenever the FPGA is writing to the bus or the bus is idle. First, power is greatly reduced: 1.6 W of static power can be saved on a 72-bit DDR2 bus. In addition, a pure series line termination is achieved when writing. Finally, the need for a large number of board termination resistors is removed, saving board cost and complexity.
DDR3 provides 30% lower power than DDR2 because it runs at a lower voltage: 1.5 V versus 1.8 V. For example, a system with a 72-bit, 200-MHz (400-Mbit/s) memory interface with on-chip termination would dissipate 3.9 W for only one memory interface. Using dynamic on-chip termination (wherein the parallel termination resistor is turned off when idle or when performing a write) can save 1.6 W, bringing the interface down to 2.3 W. If both DDR3 and dynamic on-chip termination are used, power drops to 1.6 W, saving a total of 2.3 W. These savings are on a per-interface basis (i.e., four memory interfaces in an FPGA would save 9.2 W).
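The figures quoted above can be checked with a short calculation. The constants come from the text; applying the 30% DDR3 reduction to the power remaining after the dynamic-termination saving is an assumption, chosen because it reproduces the article's numbers. The function names are illustrative.

```python
BASELINE_W = 3.9            # one interface, on-chip termination always on
DYNAMIC_OCT_SAVING_W = 1.6  # saved by switching parallel termination off
DDR3_REDUCTION = 0.30       # DDR3 versus DDR2

def interface_power(dynamic_oct, ddr3):
    """Estimated power of one memory interface in watts."""
    power = BASELINE_W
    if dynamic_oct:
        power -= DYNAMIC_OCT_SAVING_W
    if ddr3:
        # Assumption: the 30% reduction applies to the remaining power.
        power *= 1.0 - DDR3_REDUCTION
    return power

def savings(dynamic_oct, ddr3, interfaces=1):
    """Total watts saved relative to the baseline across all interfaces."""
    return interfaces * (BASELINE_W - interface_power(dynamic_oct, ddr3))

print(f"Dynamic termination only: {interface_power(True, False):.1f} W")
print(f"Dynamic termination + DDR3: {interface_power(True, True):.1f} W")
print(f"Savings, one interface: {savings(True, True):.1f} W")
print(f"Savings, four interfaces: {savings(True, True, interfaces=4):.1f} W")
```

Under this assumption the combined case works out to about 1.61 W per interface, which matches the quoted 1.6 W after rounding.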
The move to very small process nodes (65 nm and below) delivers the expected Moore's Law benefits of increased density and performance. However, the boost in performance results in huge increases in power consumption, introducing the risk of consuming unacceptable amounts of power.
If power-reduction strategies aren't used, static power consumption will increase significantly. Also, without a specific power optimization effort, dynamic power consumption rises due to the increased logic capacity and higher switching frequencies.
Overcoming these power challenges with an enabling and innovative architecture, combined with advances in process technology and circuit techniques, provides an efficient and scalable solution for today's increasingly complex FPGA-based designs.