Embedded signal processors have traditionally been built from clusters of general-purpose floating-point processors. Field-programmable gate arrays (FPGAs) have usually sat at the edges of the cluster, performing signal conditioning, while the heavy processing work was reserved for the general-purpose processors, typically PowerPCs. Current technology makes it possible to do substantial signal processing work in clusters of tightly connected FPGAs instead, and this approach has significant advantages over general-purpose processors.
A general-purpose processor’s biggest limitation is the imbalance between I/O and processing. There is only one path to and from main memory, and the processor can perform complex calculations faster than it can fetch operands or store results. As a result, many algorithms wind up I/O bound, limited by memory access speed rather than by calculation time.
Advanced processors with on-chip caches can do better, but only if the next operand is already in the cache. It often isn’t, so the operation stalls while it is fetched. These processors work best on problems that have a high ratio of processing steps to I/O points, such as long fast Fourier transforms (FFTs). General-purpose processors aren’t a good solution for short FFTs, or finite impulse response (FIR) filters, since each input point is used in just a few calculations. The processor spends most of its time waiting for the memory interface.
In an FPGA, in contrast, an application can have many simple signal processing streams running in parallel. An FPGA can process many signals in the time a general-purpose processor would take to process one. A typical FPGA can have hundreds of I/O pins, as well as thousands of logic blocks that can be used to implement FIR filters. The ratio of processing power to I/O is much better balanced for signal processing.
Most general-purpose processors do arithmetic operations fastest on 32-bit floating-point values, even if the application doesn’t require that precision. FPGAs can be more closely tailored to the application, for example by using only the word width the algorithm needs, which usually saves gates and memory cells and ultimately power, weight, and system size.
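As a rough illustration of what tailoring the arithmetic can mean, the Python sketch below models a multiply-accumulate step in fixed-point integers instead of 32-bit floating point. The function names, the bit widths, and the choice of a shared fractional width are assumptions made for this example; they do not come from any particular FPGA toolchain.

```python
import math

def quantize(value, frac_bits):
    """Round a real value to a signed integer with frac_bits fractional bits."""
    return round(value * (1 << frac_bits))

def accumulator_bits(sample_bits, coeff_bits, num_taps):
    """Conservative accumulator width for summing num_taps signed products
    of sample_bits x coeff_bits operands without overflow."""
    return sample_bits + coeff_bits + math.ceil(math.log2(num_taps))

def fixed_point_dot(samples, coeffs, frac_bits):
    """Dot product computed with integer multiplies and adds only."""
    s_q = [quantize(s, frac_bits) for s in samples]
    c_q = [quantize(c, frac_bits) for c in coeffs]
    acc = sum(s * c for s, c in zip(s_q, c_q))
    # Each product carries 2 * frac_bits fractional bits; scale back to a real value.
    return acc / float(1 << (2 * frac_bits))

# Hypothetical sizing: 12-bit samples, 16-bit coefficients, 32 taps.
print(accumulator_bits(12, 16, 32))
print(fixed_point_dot([0.5, -0.25], [0.5, 0.5], 15))
```

In hardware, every bit of width removed from the samples, coefficients, or accumulator is logic that never has to be built, which is where the savings in gates and power come from.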
FPGAs perform best in applications that take large amounts of input data and process it in a chain of operations that are the same for all the input points. They don’t suit “if-then-else” processing where the operations change depending on input values or intermediate results. But most signal processing algorithms don’t change with input values, making them ideal for implementation in an FPGA. FPGAs are very good at functions such as FIR filters or FFTs, which involve large numbers of relatively simple multiply and add operations.
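To make the "same chain of operations for every input point" idea concrete, here is a minimal Python sketch of a direct-form FIR filter. It is a software model only, not FPGA code, and the function name and tap values are illustrative assumptions. The point is that the inner loop is a fixed sequence of multiplies and adds with no branch that depends on the data.

```python
def fir_filter(samples, taps):
    """Direct-form FIR: y[n] = sum over k of taps[k] * x[n-k], with x[n] = 0 for n < 0."""
    delay_line = [0.0] * len(taps)
    output = []
    for x in samples:
        # Shift the new sample in; the oldest sample falls off the end.
        delay_line = [x] + delay_line[:-1]
        # Identical multiply-accumulate chain for every input point.
        acc = 0.0
        for h, d in zip(taps, delay_line):
            acc += h * d
        output.append(acc)
    return output

# Feeding in a unit impulse returns the tap values (the impulse response).
print(fir_filter([1.0, 0.0, 0.0, 0.0], [0.5, 0.25, 0.25]))
```

In an FPGA, each multiply-add in the inner loop can be given its own hardware, so all taps are evaluated at once instead of one after another.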
In the past, it was cumbersome to connect FPGAs together in multichip systems or in stars, meshes, or rings. Current technologies make this much easier. Several large FPGAs will now fit on a 6U VME board, and new interconnect technologies provide high-speed, low-overhead connections between chips and between boards. A cluster of FPGAs could be used to implement multichannel digital downconverters (DDCs), multichannel demodulators, or one- or two-dimensional correlators. Given the high-speed interconnect technologies now available, it’s possible to build multichip systems to do two-dimensional FFTs (2dFFTs) and synthetic aperture radar (SAR) processing.
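One way to see why interconnect speed matters for a two-dimensional FFT is the standard row-column decomposition, sketched below in Python. The mapping onto hardware is an assumption made for illustration: if the rows were spread across several FPGAs, the transpose (often called a corner turn) is the step in which data would have to move between chips. A naive DFT stands in for a real FFT core to keep the example short, and the function names are hypothetical.

```python
import cmath

def dft(x):
    """Naive O(n^2) discrete Fourier transform of a one-dimensional sequence."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]

def transpose(matrix):
    """Corner turn: rows become columns."""
    return [list(col) for col in zip(*matrix)]

def dft2d(image):
    """Two-dimensional DFT by row transforms, a corner turn, then column transforms."""
    rows_done = [dft(row) for row in image]    # each row is independent of the others
    turned = transpose(rows_done)              # every output row needs data from every input row
    cols_done = [dft(col) for col in turned]   # independent again
    return transpose(cols_done)                # restore the original orientation

# A single nonzero sample at the origin transforms to a flat spectrum.
spectrum = dft2d([[1.0, 0.0], [0.0, 0.0]])
print([[abs(v) for v in row] for row in spectrum])
```

The two transform stages parallelize trivially; the corner turn between them is an all-to-all exchange, so its cost is set by the links between processing elements.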