To get higher performance than the original Cyclone family, Cyclone II designers increased logic-array-block (LAB) grouping size from 10 LEs to 16 per block. This helped shrink chip area and boost performance, because larger functions can be configured in the LAB.
The just-released SRAM-based EC and ECP families from Lattice Semiconductor also aim to replace ASICs. Available without dedicated DSP support, the EC series devices are basically a subset of the ECP series, which includes from four to 10 dedicated DSP blocks. Each block can implement up to eight 9-bit, four 18-bit, or one 36-bit multiplier (for full family details, see "FPGAs Bring Custom-ASICs Economy To System Design," Electronic Design, Aug. 9, p. 40).
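As a rough illustration of what those DSP-block figures mean at the device level, the Python sketch below multiplies the block count by the multipliers available per block. It assumes every block on a device is configured for the same operand width, which the text does not state; the function and table names are hypothetical.

```python
# Multipliers available in one DSP block at each operand width (from the text).
MULTIPLIERS_PER_BLOCK = {9: 8, 18: 4, 36: 1}


def total_multipliers(num_blocks: int, width_bits: int) -> int:
    """Multiplier count for a device, assuming all blocks share one width."""
    if width_bits not in MULTIPLIERS_PER_BLOCK:
        raise ValueError(f"unsupported multiplier width: {width_bits}")
    return num_blocks * MULTIPLIERS_PER_BLOCK[width_bits]


if __name__ == "__main__":
    # The ECP series spans four to 10 DSP blocks.
    for blocks in (4, 10):
        for width in (9, 18, 36):
            count = total_multipliers(blocks, width)
            print(f"{blocks} blocks, {width}-bit: {count} multipliers")
```

Under that assumption, the smallest and largest devices would span 32 to 80 9-bit multipliers, or four to 10 36-bit multipliers.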
Lattice also breathed some new life into a mature FPGA family, namely the ORCA. (It acquired ORCA from the company now known as Agere.) The recently released ORCA ORLI10G is an FPGA packing from 333k to 643k system gates and 111 kbits of RAM. Its high-speed dedicated I/O ports are OIF-standard-compliant and can handle from 10 to 12.5 Gbits/s using a 16-bit low-voltage differential-signaling interface operating at 850 Mbits/s. There are also four 2.5-Gbit/s interfaces, each with a separate clock to synchronize the transfer of data to the FPGA logic.
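The interface numbers can be sanity-checked with simple arithmetic. The sketch below assumes the 850-Mbit/s figure is a per-pair rate across all 16 differential pairs, which is an interpretation of the text rather than a datasheet value; the function name is hypothetical.

```python
def aggregate_gbits_per_s(lanes: int, mbits_per_lane: float) -> float:
    """Raw aggregate bandwidth in Gbits/s for a group of identical lanes."""
    return lanes * mbits_per_lane / 1000.0


if __name__ == "__main__":
    # 16-bit LVDS interface, assumed to run each pair at 850 Mbits/s.
    lvds_raw = aggregate_gbits_per_s(16, 850)
    # Four serial interfaces at 2.5 Gbits/s each.
    serial_raw = aggregate_gbits_per_s(4, 2500)
    print(f"LVDS raw bandwidth:   {lvds_raw:.1f} Gbits/s")
    print(f"Serial raw bandwidth: {serial_raw:.1f} Gbits/s")
```

On that reading, the parallel interface offers 13.6 Gbits/s of raw bandwidth, enough headroom for the quoted 10- to 12.5-Gbit/s range, while the four serial links together carry 10 Gbits/s.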
INSTANT GRATIFICATION
For those applications that can't tolerate delays on power-up, on-chip configuration storage lets a chip start functioning in microseconds. This is in contrast with the tens or hundreds of milliseconds required by an SRAM-based FPGA to load its data from an external flash memory or host system. The same instant-on behavior holds true for all flash-based programmable devices, such as Actel's ProASIC and ProASIC Plus families, Lattice's ispXPGA family, QuickLogic's Eclipse and other families, and Altera's MAX II series of flash-based CPLDs.
A bit more conservative on the gate count than the Altera and Xilinx SRAM-based families, the Actel Axcelerator family is manufactured on a 150-nm antifuse process. Chips in the family feature a gate capacity that ranges from 125k to 2 million system gates, with a typical actual usable number of about 82k to 1 million. However, the logic performance is right near the head of the class, with internal operating speeds that can exceed 500 MHz and system speeds reaching 350 MHz.
The antifuse technology used by Actel (and QuickLogic) makes the FPGAs one-time programmable. So once they're configured, the logic can't be altered. Although this rules out system updates after the devices are fielded, it does provide other benefits. For instance, the nonvolatile configuration pattern is stored on-chip. Thus, no off-chip flash memory device is needed to store the configuration pattern that's loaded during power-up. And, because the data is on-chip, it's safe from the prying eyes of anyone attempting to reverse-engineer the configuration.
The Axcelerator family, though, uses a different large-grain cell approach than that used by either Altera or Xilinx. Rather than use the basic SRAM-based lookup table, designers at Actel developed a block they call a "SuperCluster." It contains multiple combinatorial logic and register modules and transmit and receive routing buffers (Fig. 3). The basic architecture is an enhancement of the company's SX-A sea-of-modules architecture. Its logic fabric covers the chip within the pad ring. Virtually no chip area is lost to interconnect elements or routing since the antifuses lie between the metal layers above the silicon.
The SuperCluster includes two types of logic modules: a register cell (R-cell) and a combinatorial cell (C-cell). The C-cell, which can implement more than 4000 combinatorial functions of up to five inputs, includes carry logic to more efficiently implement arithmetic functions. The R-cell packs a flip-flop with asynchronous preset, active-low enable control signals, and programmable clock polarity. The clock source for the cell can be chosen from hardwired clocks, routed clocks, or internal logic.
Two C-cells, a single R-cell, two transmit buffers, and two receive routing buffers form a cluster. Two clusters form a SuperCluster. One additional independent buffer provides extra buffering on high-fanout nets. The AX architecture is fully fracturable, which means that if a particular signal path uses one or more of the logic modules in the SuperCluster, other signal paths can still use the other logic modules.
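The cluster hierarchy just described can be tallied with a small data model. This is only a bookkeeping sketch in Python: the class and field names are hypothetical, and it assumes the one additional buffer is counted once per SuperCluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """One cluster, with the resource counts given in the text."""
    c_cells: int = 2
    r_cells: int = 1
    tx_buffers: int = 2
    rx_buffers: int = 2


@dataclass(frozen=True)
class SuperCluster:
    """Two clusters plus one independent buffer (assumed per SuperCluster)."""
    clusters: tuple = (Cluster(), Cluster())
    extra_buffers: int = 1

    def totals(self) -> dict:
        """Sum each resource type across the clusters."""
        return {
            "c_cells": sum(c.c_cells for c in self.clusters),
            "r_cells": sum(c.r_cells for c in self.clusters),
            "tx_buffers": sum(c.tx_buffers for c in self.clusters),
            "rx_buffers": sum(c.rx_buffers for c in self.clusters),
            "extra_buffers": self.extra_buffers,
        }


if __name__ == "__main__":
    for name, count in SuperCluster().totals().items():
        print(f"{name}: {count}")
```

Each SuperCluster therefore holds four C-cells and two R-cells, and any of them can serve a different signal path because the architecture is fracturable.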
Although Actel's flash-based ProASIC FPGA family doesn't pack quite as many equivalent ASIC gates, its smaller, finer-granularity basic LE puts the family in the high-gate-count race as well.
Flash memory on the same chip as the SRAM-based configurable logic holds the key to instant-on operation for both the ispXPGA family from Lattice and the MAX II family of CPLDs from Altera. Each family basically consists of SRAM-based FPGAs. But rather than use an off-chip flash memory to hold the configuration data, designers incorporated the memory on-chip. What that translates into is a very fast power-on configuration capability.
The ispXPGA family includes four devices with complexities ranging from 139k to 1.25 Mgates, 92 kbits to 414 kbits of dedicated RAM, 30k to 246k bits of distributed memory, and 160 to 496 I/O pads. The configurable logic is grouped in blocks called programmable function units. Each unit contains four quad-input lookup tables to support wide and narrow functions, dual flip-flops to allow for extensive pipelining, and dedicated logic for adders, multipliers, multiplexers, and counters.
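For readers less familiar with lookup tables, the following Python sketch models a generic four-input LUT as a 16-entry truth table. It illustrates the general technique only and is not a model of Lattice's actual programmable function unit; the names `Lut4` and `program_lut4` are hypothetical.

```python
class Lut4:
    """Generic 4-input lookup table: 16 configuration bits, one output."""

    def __init__(self, truth_table: int):
        if not 0 <= truth_table < (1 << 16):
            raise ValueError("truth table must fit in 16 bits")
        self.truth_table = truth_table

    def evaluate(self, a: int, b: int, c: int, d: int) -> int:
        # The four inputs form an address into the stored truth table.
        index = (a & 1) | ((b & 1) << 1) | ((c & 1) << 2) | ((d & 1) << 3)
        return (self.truth_table >> index) & 1


def program_lut4(func) -> Lut4:
    """Build the configuration bits by evaluating func on all 16 inputs."""
    table = 0
    for index in range(16):
        bits = [(index >> i) & 1 for i in range(4)]
        if func(*bits):
            table |= 1 << index
    return Lut4(table)


if __name__ == "__main__":
    # Configure the LUT as a 4-input XOR (parity) function.
    xor4 = program_lut4(lambda a, b, c, d: a ^ b ^ c ^ d)
    print(hex(xor4.truth_table))
    print(xor4.evaluate(1, 0, 1, 1))
```

Any function of four inputs can be loaded the same way, which is why a handful of such tables plus dedicated arithmetic logic can cover both wide and narrow functions.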
Although internally based on an FPGA architecture, the MAX II family is classified by Altera as a CPLD, with a logic capacity ranging from 240 to 2210 LEs. Like the Lattice devices, the on-chip flash memory holds the configuration pattern for "instant-on" configuration. However, designers added a separate 8-kbit user flash memory. It can be used to hold additional system parameters, which effectively eliminates a small off-chip memory employed by many systems to store setup parameters.
The MAX II devices are relatively low power, consuming only about 2 mA during standby. But the lowest-power FPGAs to date are QuickLogic's new Eclipse II devices, which draw as little as 17 µA of standby current. Chip complexities range from about 47 kgates to 248 kgates and 9 kbits to 46 kbits of embedded memory. The largest device, the QL8325, also features 12 special embedded computational units that pack an 8-bit multiplier, a 16-bit adder, a 17-bit register, several multiplexers, and a 3:4 decoder. As a result, up to 12 8-bit MAC functions can execute per cycle for a total of about 1.2 billion MACs/s when clocked at 100 MHz.
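The peak MAC rate is simply the number of computational units times the clock frequency. A minimal Python sketch of that arithmetic follows; it assumes every unit completes one MAC on every cycle, and the function name is hypothetical.

```python
def peak_macs_per_second(units: int, clock_hz: float) -> float:
    """Upper bound on MAC throughput: one MAC per unit per clock cycle."""
    return units * clock_hz


if __name__ == "__main__":
    # 12 embedded computational units clocked at 100 MHz.
    rate = peak_macs_per_second(12, 100e6)
    print(f"{rate / 1e9:.1f} billion MACs/s")
```

Sustained throughput in a real design would depend on how well the data path keeps all 12 units busy.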
Of course, nearly all FPGA vendors offer previous-generation families. Based on your system performance needs, such devices may also be a good, cost-effective solution.