PROCESS AND CIRCUIT TECHNOLOGY
The semiconductor industry constantly battles the evolving challenges of small process dimensions
through huge investments in equipment,
process technologies, design tools, and
circuit techniques. In particular, the challenge of increasing leakage power with
small process geometries is felt across
the industry. Thus, many well-known
technologies at the 65-nm process node
(and prior) are used to maintain or
increase performance while managing
leakage power:
- Copper routing
- Low-k dielectric
- Multi-threshold transistors
- Variable gate-length transistors
- Triple gate oxide
- Super-thin gate oxide
- Strained silicon
LOWEST POWER, HIGHEST PERFORMANCE
To attain high efficiency and performance, Stratix III FPGAs
leverage an adaptive-logic-module
(ALM) logic architecture and a MultiTrack interconnect fabric. This combination allows more logic to be packed with
less routing.
ALM technology, which is said to implement 80% more logic functions than other architectures, includes an eight-input
fracturable lookup table (LUT), two 2-bit
adders, and two registers.
MultiTrack interconnect provides one-hop interconnectivity between different
LABs. Connectivity can be measured by the number of "hops" required to get
from one LAB to another. Each added hop increases capacitance, so the fewer
the hops, the less high-speed logic is required to meet performance. With
one-hop interconnectivity, MultiTrack yields the lowest possible power (Fig. 7).
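The link between hop count and power can be sketched with the standard switching-power relation, P = activity x C x V^2 x f. This is only an illustration: the function names are hypothetical, the assumption that every hop adds the same capacitance is a simplification, and the numeric values are placeholders rather than Stratix III data.

```python
def dynamic_power_w(cap_farads, vdd_volts, freq_hz, activity=1.0):
    """Switching power from the standard relation P = activity * C * V^2 * f."""
    return activity * cap_farads * vdd_volts ** 2 * freq_hz


def route_power_w(hops, cap_per_hop_farads, vdd_volts, freq_hz, activity=1.0):
    """Power of one routed net, assuming every hop adds the same capacitance."""
    total_cap = hops * cap_per_hop_farads
    return dynamic_power_w(total_cap, vdd_volts, freq_hz, activity)


# Hypothetical values for illustration only (not Stratix III data).
CAP_PER_HOP_F = 100e-15  # assumed 100 fF of routing capacitance per hop
VDD_V = 1.1              # assumed core voltage
FREQ_HZ = 300e6          # assumed toggle frequency

one_hop = route_power_w(1, CAP_PER_HOP_F, VDD_V, FREQ_HZ)
three_hops = route_power_w(3, CAP_PER_HOP_F, VDD_V, FREQ_HZ)
print(f"1 hop: {one_hop * 1e6:.1f} uW, 3 hops: {three_hops * 1e6:.1f} uW")
```

Under these assumptions, power grows linearly with the number of hops, which is why a one-hop route is the cheapest case.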
Hierarchical clocking is used in Stratix III FPGAs to support up to 360
unique clocks. The propagation of every clock network can be controlled down
to the LAB level. Logic with common clocks is grouped into LABs, and each
clock is propagated only to the LABs whose logic uses it. All other clock
signals are shut down to minimize power consumption.
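The per-LAB clock control described above can be pictured as a simple enable table. The sketch below is a conceptual model only, not a description of the device's clock-control hardware or of any design-tool API; the LAB names, clock names, and both functions are hypothetical.

```python
from typing import Dict, Set


def clock_enables(lab_clocks: Dict[str, Set[str]],
                  all_clocks: Set[str]) -> Dict[str, Dict[str, bool]]:
    """For each LAB, enable only the clocks that its logic actually uses."""
    return {
        lab: {clk: (clk in used) for clk in sorted(all_clocks)}
        for lab, used in lab_clocks.items()
    }


def gated_count(enables: Dict[str, Dict[str, bool]]) -> int:
    """Number of (LAB, clock) pairs where the clock is shut down."""
    return sum(
        1 for per_lab in enables.values() for on in per_lab.values() if not on
    )


# Hypothetical design: three LABs and three clock networks.
lab_clocks = {
    "LAB0": {"clk_a"},
    "LAB1": {"clk_a", "clk_b"},
    "LAB2": set(),  # unused LAB: no clock is delivered at all
}
all_clocks = {"clk_a", "clk_b", "clk_c"}

enables = clock_enables(lab_clocks, all_clocks)
print(f"{gated_count(enables)} of {len(lab_clocks) * len(all_clocks)} "
      "LAB/clock pairs are gated off")
```

In this toy case only three of the nine possible LAB/clock pairs toggle, which is the effect the hierarchical scheme is after.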
MEMORY INTERFACE POWER SAVINGS
Double-data-rate (DDR) memory interfaces are one of the most
common I/O interfaces in designs today,
and they can be fairly power-hungry. To
combat those power issues, designers
can turn to dynamic on-chip termination
and DDR3.
When reading and writing to external memory, it's vital to have an
impedance-matched buffer for both series and parallel termination. If there's
a 50-Ω transmission line when writing to memory, a matched buffer with a
series impedance of 50 Ω is needed. When receiving data from the memory, a
50-Ω parallel termination resistor pulled to a termination voltage is
desired. This applies not only to DDR-type interfaces, but also to
RLDRAM and QDR SRAM.
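Why matching matters can be illustrated with the standard transmission-line reflection coefficient, (Z_load - Z_line) / (Z_load + Z_line). The sketch below is a generic calculation, not part of any vendor tool; the function name is hypothetical, and the 75-Ω and open-circuit cases are added only for contrast with the 50-Ω case from the text.

```python
import math


def reflection_coefficient(z_load_ohm: float, z_line_ohm: float = 50.0) -> float:
    """Fraction of the incident voltage wave reflected at a termination."""
    if math.isinf(z_load_ohm):
        return 1.0  # open circuit: the whole wave is reflected
    return (z_load_ohm - z_line_ohm) / (z_load_ohm + z_line_ohm)


matched = reflection_coefficient(50.0)          # 50-ohm termination on a 50-ohm line
mismatched = reflection_coefficient(75.0)       # illustrative mismatch
open_end = reflection_coefficient(math.inf)     # no termination at all

print(f"matched: {matched:.2f}, 75 ohm: {mismatched:.2f}, open: {open_end:.2f}")
```

A matched termination reflects nothing, while any mismatch sends part of the signal back down the line.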
With dynamic on-chip termination, FPGA designers can switch the parallel
termination resistor on or off (open circuit), depending on whether a
read or write is being executed. During a write, the FPGA output driver
impedance must be matched to the transmission line. However, a parallel
resistor to VTT wastes energy and reduces signal swing during a write. To
avoid this, the resistor can be turned off (Fig. 8).
During a read, the parallel resistor is on to terminate the transmission
line, reducing reflections that degrade signal integrity and the ability
to reliably read data.
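The read/write behavior can be summarized as a small state table. This is a behavioral sketch only, not the device's control logic or a real configuration API: the type and function names are hypothetical, and treating the series driver as off while the bus is idle is an assumption.

```python
from enum import Enum
from typing import NamedTuple


class BusOp(Enum):
    READ = "read"
    WRITE = "write"
    IDLE = "idle"


class Termination(NamedTuple):
    series_driver_on: bool    # matched output driver, used when writing
    parallel_to_vtt_on: bool  # parallel termination resistor to VTT


def termination_for(op: BusOp) -> Termination:
    """Pick the termination state for the current bus operation."""
    return Termination(
        series_driver_on=(op is BusOp.WRITE),   # assumed off when idle
        parallel_to_vtt_on=(op is BusOp.READ),  # on only while receiving data
    )


for op in BusOp:
    print(op.value, termination_for(op))
```

The parallel resistor draws current only during reads, which is where the static power saving comes from.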
The significant benefits of dynamic on-chip termination are realized
whenever the bus is idle or the FPGA is performing a write. First, power
is greatly reduced: 1.6 W of static power can be saved on a 72-bit DDR2
bus. In addition, a pure series line termination is achieved when writing.
Finally, the need for a large number of board termination resistors is
removed, saving board cost and complexity.
DDR3 provides 30% lower power than DDR2 because it runs at a lower
voltage: 1.5 V versus 1.8 V. For example, a system with a 72-pin, 200-MHz
(400-Mbit/s) memory interface with on-chip termination would dissipate
3.9 W for only one memory interface. Using dynamic on-chip termination
(wherein the parallel termination resistor is turned off when idle or when
performing a write) can save 1.6 W, bringing that figure down to 2.3 W. If
both DDR3 and dynamic on-chip termination are used, power drops to 1.6 W,
saving a total of 2.3 W. These savings are on a per-interface basis (i.e.,
four memory interfaces in an FPGA would save 9.2 W).
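The arithmetic behind these figures can be checked in a few lines. The 3.9 W, 1.6 W, 1.5 V, and 1.8 V values come from the text; the assumption that power scales with the square of the supply voltage is ours, since the article does not say how its numbers were derived, though it lands close to the quoted 30% and 1.6 W.

```python
BASELINE_W = 3.9          # one interface, DDR2, on-chip termination (from the text)
DYNAMIC_OCT_SAVING_W = 1.6  # saving from dynamic on-chip termination (from the text)
DDR2_V, DDR3_V = 1.8, 1.5   # supply voltages (from the text)
INTERFACES = 4

# Step 1: dynamic on-chip termination alone.
after_oct_w = BASELINE_W - DYNAMIC_OCT_SAVING_W

# Step 2: move to DDR3. Assumption: power scales with voltage squared.
voltage_ratio = (DDR3_V / DDR2_V) ** 2
after_oct_and_ddr3_w = after_oct_w * voltage_ratio

total_saving_w = BASELINE_W - after_oct_and_ddr3_w
all_interfaces_saving_w = INTERFACES * total_saving_w

print(f"DDR3 reduction from voltage alone: {(1 - voltage_ratio) * 100:.0f}%")
print(f"After dynamic termination: {after_oct_w:.1f} W")
print(f"After dynamic termination and DDR3: {after_oct_and_ddr3_w:.1f} W")
print(f"Saving per interface: {total_saving_w:.1f} W")
print(f"Saving for {INTERFACES} interfaces: {all_interfaces_saving_w:.1f} W")
```

Changing `INTERFACES` shows how the per-interface saving scales with the number of memory interfaces in a design.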
The move to very small process nodes (65 nm and below) delivers the
expected Moore's Law benefits of increased density and performance. However,
the boost in performance comes with large increases in power consumption,
which can reach unacceptable levels.
If power-reduction strategies aren't used, static power consumption will increase significantly. Also, without a specific power optimization effort, dynamic power consumption rises due to
the increased logic capacity and higher switching frequencies.
Overcoming these power challenges takes an enabling and innovative architecture combined with advances in process technology and circuit
techniques. Together, they provide an efficient and scalable solution
for today's increasingly complex FPGA-based designs.