Understanding FPGA Processor Interconnects

FPGAs are wonderful tools. They consist of a collection of logic cells called lookup tables (LUTs) surrounded by an interconnect fabric. The LUTs and fabric are programmable, providing a flexible system that can implement almost any digital algorithm. However, FPGAs offer tradeoffs compared to ASICs.

ASICs are more compact and power efficient than FPGAs. FPGA designers have made major improvements in both areas, but a LUT with its configuration storage is simply larger than the logic it will be programmed to implement. Likewise, the interconnect fabric consists of more gates than would be necessary with a direct connection between logic elements.
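To make the size argument concrete, here is a minimal software sketch of a four-input LUT. It assumes the LUT is simply a 16-entry truth table indexed by its inputs; the names `make_lut`, `and4`, and `xor4` are illustrative, and real LUT sizes vary by device family.

```python
def make_lut(truth_bits, num_inputs=4):
    """Return a function that models a LUT configured with truth_bits.

    Bit i of truth_bits is the output produced when the inputs, read as a
    binary number (the first input is the least significant bit), equal i.
    """
    size = 1 << num_inputs
    if not 0 <= truth_bits < (1 << size):
        raise ValueError("configuration does not fit the LUT")

    def lut(*inputs):
        if len(inputs) != num_inputs:
            raise ValueError("wrong number of inputs")
        index = 0
        for position, bit in enumerate(inputs):
            index |= (bit & 1) << position
        return (truth_bits >> index) & 1

    return lut


# The same LUT structure with two different configurations.
and4 = make_lut(0x8000)  # only index 15 (all inputs high) yields 1
xor4 = make_lut(0x6996)  # 1 whenever an odd number of inputs are high

print(and4(1, 1, 1, 1), and4(1, 0, 1, 1))
print(xor4(1, 0, 0, 0), xor4(1, 1, 0, 0))
```

Either function needs all 16 configuration bits plus the selection logic, even though a fixed four-input AND gate in an ASIC needs only a handful of transistors.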

On the other hand, ASICs are fixed. They can be programmable but at the software level. For example, microcontrollers and microprocessors are essentially standardized ASICs. FPGAs are programmable at the logic level, providing faster, parallel implementations that a processor cannot match. Replacing an ASIC in the field is usually impractical. Reprogramming is common for processor-based ASICs, but there are performance and functional limitations that FPGAs easily exceed.

Programming is the other challenge. Programming or designing logic configurations for an FPGA is significantly different from writing software for a processor. Many high-level FPGA and ASIC design languages like SystemC are based on the C programming language. But there’s a definite mindset for developing FPGA designs using these tools that’s not the same for writing applications for a processor.

Hard Or Soft Cores For FPGAs

The flexibility and ease of developing software for processors are well established, so it’s not surprising that processors are part of the intellectual property (IP) mix for FPGAs. These processors can be soft cores implemented using LUTs or hard-core ASIC-style cores.

Soft cores have the advantage of being completely soft in terms of implementation. The only requirement is a sufficient number of FPGA resources to implement the core. The core can be any architecture with arbitrary bus widths and other features that are fixed on an ASIC-style processor.

FPGA vendors typically provide their own soft-core processors optimized for their FPGA hardware. These processors often feature small tweaks compared to their predecessors and provide better performance, lower system requirements, or other advantages.

Soft-core processors from FPGA vendors include Xilinx’s MicroBlaze and Altera’s Nios II. These processors have 32-bit architectures, but there are also 8- and 16-bit processor cores like the venerable 8051.

Designers using soft cores often can configure features such as caching and memory protection by simply selecting items from a menu or changing numeric settings such as the size of a cache. The Arm Cortex-M1 is a 32-bit soft-core design that has been ported to all major FPGA platforms (see “FPGAs Pushing MCUs As The Platform Of Choice”).

Soft-core processors offer a number of advantages over conventional microcontrollers. They can implement more powerful peripherals and are soft in terms of hardware and software. Flash-based FPGAs can even start up as quickly as most microcontrollers in the same class. RAM-based FPGAs take a little longer to start up.

Hard-core processors have the advantage when it comes to performance and power over soft-core designs. Xilinx incorporated PowerPC cores in earlier, high-end Virtex-4 FPGAs. The latest FPGAs with hard cores use the Arm architecture.

Xilinx’s Zynq is built around a pair of Arm Cortex-A9 processor cores (Fig. 1). Altera has also chosen the Cortex-A9 for inclusion in its FPGA line. Tablets and smart phones were based on this platform but are moving to quad-core designs like Nvidia’s Tegra 3, which is found in Google’s Nexus 7 tablet (see “iFixit Tears Down The Google Nexus 7 Tablet”).

1. Xilinx’s Zynq-7000 EPP includes a dual Cortex-A9 along with a complete set of peripherals. It uses AXI to connect to soft FPGA peripherals.

Still, products like the Zynq aren’t chasing these markets, and the hard-core FPGAs have other advantages. In particular, the chips aren’t just hard-core processors. Most of the chips are normally conventional FPGA fabrics.

Fixed peripherals that are conventionally found on microcontrollers and microprocessors normally surround the hard-core processors. These peripherals include interfaces like serial ports, parallel ports, and Ethernet ports. As part of an FPGA, additional interfaces are typically an IP block away, but more on that later.

The complement of hard peripherals varies depending upon the chip and target application. Microsemi’s SmartFusion mixed-signal FPGA includes a substantial programmable analog block (Fig. 2). This block is programmable from the point of view of a state machine or simple processor rather than an FPGA fabric or another full-blown processor core.

2. Microsemi’s SmartFusion FPGA is based on an Arm Cortex-M3 hard core that uses AMBA AHB.

SmartFusion is a natural complement to the hard Cortex-M3 core, which is the basis for a wide range of microcontrollers. It is a lower-end platform compared to the Cortex-A9, but both platforms complement their target application areas.

Ideal for control applications, the SmartFusion FPGA fabric often is used to provide custom interface support. The Cortex-A9s tend to be used in high-end applications where the FPGA is doing the heavy lifting and the cores provide a platform for communication software.

Implementing a Web server and TCP/IP stack is elementary in software. It’s possible to do all that in FPGA logic (not counting a soft-core processor), but that is more than a major task. A TCP/IP off-load engine (TOE) is a partial example, and designing one is a significant chore undertaken by very few in the high-end network control space. TOEs are fast, but that speed usually isn’t necessary in these instances, where it is simply a matter of providing software access to network services.

The FPGA fabric often is better utilized implementing features such as hardware video compression to reduce the amount of data sent across the network. The question is whether these hard-core FPGAs are microcontrollers with an FPGA fabric or an FPGA with an embedded hard core. The Virtex-4 was definitely an FPGA with an embedded hard core because the cores were very limited in their peripheral complement.

The Zynq and SmartFusion are at the other extreme with a full complement of peripherals. In fact, they can run independently of the FPGA fabric, though that tends to be impractical in terms of applications because the processors have direct access to peripherals implemented in the FPGA fabric.

Of course, it’s possible to move even further in terms of isolating the FPGA fabric. Intel’s E600C multichip carrier integrates an Intel Atom core with an Altera FPGA (Fig. 3). The two are linked via PCI Express, a standard interface implemented on both platforms. The Atom has a x1 PCI Express host interface, and the FPGA has a x1 PCI Express client interface, which provides a link to the FPGA fabric. Actually, there are multiple x1 PCI Express links between the two nodes.

3. Intel’s E600C Atom processor is a multichip package that includes an Altera FPGA connected to the Atom via PCI Express links.

In the E600C, the software uses PCI Express to access devices implemented on the FPGA fabric, but it’s only a simple interface if a single device is on the other end of the connection. Dealing with multiple logical devices requires a good bit of customization and depends upon what IP is utilized on the FPGA.

The E600C is the exception among hard-core FPGA solutions, where Arm cores rule. Yet the E600C does have the advantage when it comes to interfacing IP to the hard cores.

AMBA: One Specification To Rule Them All

Arm cores utilize the Advanced Microcontroller Bus Architecture (AMBA). Arm doesn’t make chips. It licenses its design, and the cores are found in a wide range of custom ASICs and standard microcontroller and microprocessor products. These designs incorporate AMBA and its underlying protocols.

The latest AMBA 4.0 specification defines five interfaces:

Advanced eXtensible Interface (AXI)
Advanced High-performance Bus (AHB)
Advanced System Bus (ASB)
Advanced Peripheral Bus (APB)
Advanced Trace Bus (ATB)

AXI targets system designs with high clock frequencies. It has separate address/control and data phases and supports unaligned data transfers using byte strobes. Hosts can issue multiple outstanding addresses for more efficient bus utilization, and burst-based transactions only need to supply the start address. The AXI architecture allows additional register stages to be inserted so designers can achieve timing closure.
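The start-address-only burst and the byte-lane handling can be sketched in software. The model below follows the incrementing-burst address equations as commonly summarized from the AXI specification. It is a simplified illustration: the function name and parameters are assumptions, and checks such as power-of-two sizes and the 4-kbyte boundary rule are omitted.

```python
def incr_burst_beats(start_addr, beats, bytes_per_beat, bus_bytes):
    """Return (address, strobe_mask) for each beat of an incrementing burst.

    Only start_addr is supplied by the host; every later address is derived.
    Bit k of strobe_mask marks byte lane k of the data bus as valid.
    """
    if bytes_per_beat > bus_bytes:
        raise ValueError("a beat cannot be wider than the data bus")
    aligned = (start_addr // bytes_per_beat) * bytes_per_beat
    result = []
    for n in range(beats):
        # The first beat uses the (possibly unaligned) start address;
        # later beats step from the aligned address.
        addr = start_addr if n == 0 else aligned + n * bytes_per_beat
        lower = addr % bus_bytes
        if n == 0:
            upper = (aligned + bytes_per_beat - 1
                     - (start_addr // bus_bytes) * bus_bytes)
        else:
            upper = lower + bytes_per_beat - 1
        strobes = ((1 << (upper + 1)) - 1) & ~((1 << lower) - 1)
        result.append((addr, strobes))
    return result


# Unaligned four-beat burst of 32-bit transfers on a 32-bit bus.
for addr, strobes in incr_burst_beats(0x1001, 4, 4, 4):
    print(f"0x{addr:04X} strobes={strobes:04b}")
```

In this example the first beat enables only the upper three byte lanes, which is how an unaligned start is expressed without extra address cycles.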

AXI is a host/client interface that can be extended using a switch or fabric (Fig. 4). The AXI interconnect can be implemented in a number of ways with varying levels of performance and complexity. Interconnects can support one or more AXI masters. Obviously, a single master interconnect will be easier and less complex to implement.

4. Advanced eXtensible Interface (AXI) has a host-client interface that can be extended using an AXI interconnect that’s typically a switch.
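At its simplest, a single-host interconnect is an address decoder that steers each transaction to one client. The sketch below assumes a hypothetical address map; the region names, base addresses, and sizes are invented for illustration and do not describe any particular device.

```python
from typing import NamedTuple


class Region(NamedTuple):
    name: str
    base: int
    size: int


# Hypothetical address map for a single-host interconnect.
ADDRESS_MAP = [
    Region("ddr_controller", 0x0000_0000, 0x4000_0000),
    Region("video_scaler", 0x4000_0000, 0x0001_0000),
    Region("uart0", 0x4001_0000, 0x0000_1000),
]


def route(address, address_map=ADDRESS_MAP):
    """Return (client name, offset within client), or None if nothing decodes."""
    for region in address_map:
        if region.base <= address < region.base + region.size:
            return region.name, address - region.base
    return None


print(route(0x4001_0004))
print(route(0x5000_0000))
```

An address that matches no region returns `None` here; a real interconnect would typically answer such an access with a decode error response. Supporting several hosts adds arbitration on top of this decoding, which is where most of the extra complexity comes from.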

AXI-Lite provides a lightweight version of AXI for devices that do not need the full AXI functionality, so simpler interfaces can be utilized. The AXI Coherency Extensions (ACE) add cache coherency support, and there is an ACE-Lite variant as well. Not all AXI devices need this support, but having it defined in the standard provides a clear design path when it is needed. AXI and ACE are the latest additions to the AMBA specification.

AXI and ACE target multicore MPCore implementations such as Arm’s ARM11 MPCore, Cortex-A5 MPCore, Cortex-A7 MPCore, Cortex-A9 MPCore, and Cortex-A15 MPCore. The Cortex-A7 MPCore and Cortex-A15 MPCore are the basis for Arm’s big.LITTLE combination, where the Cortex-A7 provides a low-power platform and the Cortex-A15 runs when high performance is required (see “Little Core Shares Big Core Architecture”).

AHB was originally introduced with AMBA 2, and the AHB-Lite version was added in AMBA 3. It is found in many Arm platforms including SmartFusion’s Cortex-M3 implementation. AHB’s single-edge clock protocol supports multiple bus masters, split transactions, burst transfers, and single-cycle bus master handover. The interface also has a non-tristate, pipelined architecture. The design handles bus widths up to 128 bits. The AHB-Lite subset targets a single master that is typical with microcontrollers.

APB connects to AHB and provides a low-bandwidth register interface for peripherals like serial ports. It is similar to AHB but less complex, requiring fewer transistors to implement. It doesn’t support transactions such as bursts and targets narrower interfaces.

FPGA Software Development Tools

AXI isn’t the only alternative for an interface specification. The Open Core Protocol (OCP) is a configurable interface specification from the Open Core Protocol International Partnership (OCP-IP). Altera’s Avalon memory-mapped communication fabric was part of Altera’s Nios II soft-core implementation. These interfaces have been used in various FPGA and ASIC designs, although the Nios II itself would only be used in Altera FPGAs.

The move toward Arm cores in FPGAs is one reason AXI is becoming more important to FPGA developers. The AXI architecture is the same one that would be utilized in ASIC designs, so AXI-related application programming interfaces (APIs) tend to be suitable for both FPGA and ASIC use.

FPGA design tools account for this trend. Altera’s Quartus II FPGA design tools support the Avalon and AMBA AXI interfaces. Microsemi’s Libero supports Microsemi’s FPGA and programmable device lines. Only the SmartFusion line has a hard core, and the Cortex-M3 uses AHB. Xilinx has two FPGA design tools, the ISE and Vivado (see “FPGA Design Suite Generates Global Minimum Layout”). Both support AXI, but Vivado takes AXI support to a new level.

AXI becomes more important as modular construction of an FPGA design comes into play. All the FPGA tools, including Libero, can handle AXI, but at a minimum this is only at the logic level, where the designer must knit the various components together. This is the same kind of chore that must be completed with custom IP.

The alternative is to use building blocks or component approaches that conform to a standard. The methodology isn’t unique, and component-based construction is common. The challenge for designers and component makers, though, is compatibility.

Component IP addresses a particular function, and the controlling interface is specific to the component. The IP can be molded into any other interface with additional logic, but a separate component essentially is needed for each kind of interface.

Designers would like to target an interface that provides the best payback. Minimizing the number of targets to one is preferable from a support standpoint since each target needs to be configured and verified. More targets mean more work.

This is where AXI plays a role. It provides a standard interface for this modular approach. Tools like Xilinx’s Vivado allow AXI-based components to be added to a design, and all the interconnect logic is added automatically. Designers may need to fill in a few configuration choices, but the bulk of the interconnect work is done.

The approach is ideal for designers who don’t focus on FPGA hardware such as software developers who want the flexibility and power of an FPGA but will be doing the bulk of their work programming an application that will access the components included in the mix.

Development tools can take this component approach further by tying it into the software development side. For example, each component usually has a matching C header file since most embedded FPGA software applications are programmed in C and C++. The development tools can collect the header files from the various components in a design, so the developers have all of them available for application development. Likewise, any configuration options can be embedded in these files as well.
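The header-collection step can be pictured with a small generator script. This is only a sketch of the idea, not how any vendor tool works: the component list, base addresses, register names, and macro naming scheme are all hypothetical.

```python
def component_header(name, base, registers):
    """Return C #define lines for one component instance."""
    prefix = name.upper()
    lines = [f"#define {prefix}_BASE 0x{base:08X}u"]
    for reg_name, offset in registers.items():
        lines.append(
            f"#define {prefix}_{reg_name.upper()}_OFFSET 0x{offset:02X}u"
        )
    return lines


def system_header(components, guard="SYSTEM_MAP_H"):
    """Combine the per-component defines into a single C header."""
    lines = [f"#ifndef {guard}", f"#define {guard}", ""]
    for component in components:
        lines.extend(component_header(**component))
        lines.append("")
    lines.append(f"#endif /* {guard} */")
    return "\n".join(lines)


# Hypothetical design: two components picked from a menu.
DESIGN = [
    {"name": "uart0", "base": 0x40010000,
     "registers": {"data": 0x00, "status": 0x04}},
    {"name": "timer0", "base": 0x40020000,
     "registers": {"load": 0x00, "control": 0x08}},
]

print(system_header(DESIGN))
```

Because the base addresses come from the same description used to build the interconnect, the application code and the hardware stay in step when a component is added or moved.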

Part of the challenge of using AXI is that the interface is very extensible, but not all devices require the full implementation. For example, most peripherals don’t need ACE support. Sideband signals tend to be different for IP for video or DSP applications. Any tool that automatically utilizes these components needs to account for these differences.

System Verification

Menu selection of components for an AXI-based hard or soft core is a good starting point and often sufficient for some developers, but others need to take the job a step or many steps further, possibly including more detailed verification of the design. This is possible because there is a standard bus functional model (BFM) for AXI.

Xilinx has incorporated an AXI BFM into its Vivado tool suite. The BFM enables the testing and verification of AXI components, so developers can be certain the components will work properly when they are selected from a menu to provide peripheral support for hard or soft cores.
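A real BFM drives bus signals inside an HDL simulator, but the underlying idea can be shown at the transaction level. The sketch below is not the Xilinx or Cadence BFM interface. It is a hypothetical Python model in which a host-side class issues reads and writes against a simple register block and records any mismatch. Only the response encodings (OKAY = 0b00, SLVERR = 0b10) are taken from AXI.

```python
OKAY, SLVERR = 0b00, 0b10  # AXI response encodings


class RegisterBlock:
    """Hypothetical component under test: four 32-bit registers."""

    def __init__(self):
        self.regs = [0] * 4

    def _valid(self, addr):
        return addr % 4 == 0 and 0 <= addr // 4 < len(self.regs)

    def write(self, addr, data):
        if not self._valid(addr):
            return SLVERR
        self.regs[addr // 4] = data & 0xFFFFFFFF
        return OKAY

    def read(self, addr):
        if not self._valid(addr):
            return SLVERR, 0
        return OKAY, self.regs[addr // 4]


class HostModel:
    """Plays the processor's role: issues transactions and logs failures."""

    def __init__(self, target):
        self.target = target
        self.failures = []

    def write_and_check(self, addr, data):
        if self.target.write(addr, data) != OKAY:
            self.failures.append(f"write to 0x{addr:X} rejected")
            return
        resp, value = self.target.read(addr)
        if resp != OKAY or value != data:
            self.failures.append(f"read-back mismatch at 0x{addr:X}")

    def expect_error(self, addr):
        resp, _ = self.target.read(addr)
        if resp != SLVERR:
            self.failures.append(f"no error response at 0x{addr:X}")


host = HostModel(RegisterBlock())
host.write_and_check(0x0, 0xDEADBEEF)
host.write_and_check(0xC, 0x12345678)
host.expect_error(0x40)  # outside the register block
print(host.failures)
```

The value of a standard model is that the same stimulus can be reused against every component that claims to implement the interface.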

Verification tends to be a bigger issue for ASIC designers because they often use more custom IP and because their design will be fixed in silicon. Companies like Cadence and Synopsys have extensive design verification suites. Xilinx worked with Cadence to incorporate a subset for its AXI BFM (Fig. 5). They’re the same tools that Cadence provides but incorporated into the Vivado FPGA tool suite.

5. Xilinx includes a subset of Cadence’s design verification support for its AXI bus functional model (BFM) that allows designers to test their AXI interface.

Cadence targets FPGAs as well as ASICs, but Vivado is dedicated to Xilinx FPGAs. Vivado supports the IEEE P1735 IP encryption standard that allows developers to deliver a secured and verified design in the form of executable cores. The Cadence tools address areas such as testing. Xilinx’s support is usually sufficient for cost-sensitive FPGA applications, although some designers may want to employ some or all of the components within Cadence’s verification stack to improve their components.

The modular approach to FPGA design is easier for designs based on hard and soft cores because the processors dictate fixed access and control strategy. Standard interfaces like AXI embody these strategies and make it easier to design components and utilize them in an application.

AXI isn’t the only interface being used in FPGA component-based design, but it is significant because of the hard-core and soft-core Arm-based options available to designers.

Many soft-core solutions compete with the Arm-based alternatives, and they often have their own complement of peripherals. The 8051 has a set of 8051 peripherals that are typically found in microcontrollers, and they have been replicated for FPGA use. Existing software and libraries then can be used to support the custom FPGA-based designs. Freescale’s ColdFire V1 has its own Freescale devices that can be part of a standard package (see “Climb On Board Next-Generation FPGAs”). Of course, like other FPGA solutions, the V1 can be linked to custom IP as well.

Most FPGA designs now incorporate one or more hard-core or soft-core processors. They all need peripheral support and access to the FPGA fabric. The job of creating these soft platforms on FPGAs is getting easier due to standard interface specifications.