InfiniBand is on its way to becoming a mainstream switch fabric. The key to implementing InfiniBand is its switch nodes, the individual nodes that make up the warp and woof of the interconnection fabric. The good news is that Red Switch is fielding an eight-port switch node that delivers 160-Gbit/s interconnection bandwidth. This provides a single-chip solution integrating the switch-node logic with the link-port serializer-deserializer (SERDES) and the physical-layer (PHY) interface (Fig. 1).
An InfiniBand fabric consists of end nodes (the servers, peripherals, and subsystems on the edge of the fabric) and the switch nodes that make up the fabric itself. These switch nodes provide the routing paths through the fabric, connecting end nodes in dynamic point-to-point links. They are the core of the fabric. Each switch node is a concurrent switching mechanism that can support multiple connections at once. Together, the nodes provide a dynamic link between end nodes, supporting flexible high-bandwidth transactions.
The InfiniBand switch nodes implement the InfiniBand link and transport routing protocols. These nodes support high-speed serial LVDS connections at 2.5 Gbits/s (full-duplex 1.25 GHz in each direction). The connections can be bundled in 1X, 4X, 8X, and 12X configurations. The HDMP-2840 chip delivers the first 32-channel switch node that supports 1X and 4X bundled connections. The 4X bundle combines four 2.5-Gbit/s channels into one 10.0-Gbit/s InfiniBand port. The chip supports the subnet, performance, and baseboard management agent (SMA, PMA, and BMA) protocols, and enables SNMP management.
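The bundling arithmetic above can be checked with a short sketch. The function names are hypothetical, and reading the 160-Gbit/s headline figure as eight 4X ports counted in both directions is an assumption, not something the text states outright.

```python
# Per-channel signalling rate quoted in the text, in Gbit/s.
LANE_RATE_GBITS = 2.5


def port_rate_gbits(lanes: int) -> float:
    """Raw one-direction rate of a port that bundles `lanes` serial channels."""
    return lanes * LANE_RATE_GBITS


def aggregate_gbits(ports: int, lanes_per_port: int) -> float:
    """Aggregate rate, counting both directions of every full-duplex port.

    Assumption: the headline bandwidth counts send and receive separately.
    """
    return ports * port_rate_gbits(lanes_per_port) * 2


if __name__ == "__main__":
    # Bundle widths listed in the text.
    for lanes in (1, 4, 8, 12):
        print(f"{lanes}X port: {port_rate_gbits(lanes)} Gbit/s per direction")
    # Eight 4X ports, as on the chip described here.
    print(f"8 x 4X ports: {aggregate_gbits(8, 4)} Gbit/s aggregate")
```

Under that assumption, eight 4X ports at 10 Gbits/s each way account for the 160-Gbit/s figure.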
The chip is a full InfiniBand switch node implemented in hardware state machines with supporting register sets. It incorporates the SERDES and PHY interfaces (Agilent IP) for single-chip deployment. The chip's logic runs with an internal 250-MHz clock, providing about 20 logic levels between clock edges. This very sophisticated hardware design implements the InfiniBand procedural definitions of the link protocol in register-transfer-level-defined logic and state machines. The four major blocks are:
Link-I/O ports: There are 32 I/O channels. The hardware is configured into eight 4X-link ports. Each of these 4X-port sets can be configured into a 1X-link or one 4X-link bundle. Each 4X set has its own 20-kbyte input buffer and a 5-kbyte output FIFO. Basic link-level protocol processing is handled in state-machine logic at the Link-I/O port.
Arbiter: The central control for the switch node. It acts like a bus arbiter, granting access to a requesting input node to transfer packets through the dynamic crossbar to an output node. It also handles packet routing and virtual-lane (VL) assignments.
Crossbar switch: The dynamic crossbar that connects all ports to one another, providing multiple concurrent connections. It also connects the ports to a management port and a test port.
Management port: It handles InfiniBand device management functions. Also, it includes a separate bus port to an optional external CPU, plus an I2C serial port. It provides access for a local management controller and supports subnet, performance, and baseboard management packets.
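The interplay of the link ports, arbiter, and crossbar can be illustrated with a toy software model. This is only a sketch: the class name, the round-robin grant policy, and the one-packet-per-cycle transfer are assumptions made for illustration. Virtual lanes, routing tables, and the management and test ports are not modeled, and the chip itself does all of this in hardware state machines.

```python
from collections import deque

NUM_PORTS = 8  # eight 4X link ports, as described in the text


class ToySwitch:
    """Illustrative model only: input FIFOs, an arbiter, and a crossbar."""

    def __init__(self, num_ports: int = NUM_PORTS):
        self.num_ports = num_ports
        # One FIFO per input port; each entry is (output_port, payload).
        self.inputs = [deque() for _ in range(num_ports)]
        # Rotating start position so no input is starved (assumed policy).
        self.next_first = 0

    def enqueue(self, in_port: int, out_port: int, payload) -> None:
        """Queue a packet at an input port, bound for an output port."""
        self.inputs[in_port].append((out_port, payload))

    def arbitrate(self) -> dict:
        """Grant each output port to at most one requesting input port."""
        grants = {}  # output_port -> input_port
        for offset in range(self.num_ports):
            in_port = (self.next_first + offset) % self.num_ports
            if not self.inputs[in_port]:
                continue
            out_port = self.inputs[in_port][0][0]  # head-of-line request
            if out_port not in grants:
                grants[out_port] = in_port
        self.next_first = (self.next_first + 1) % self.num_ports
        return grants

    def cycle(self) -> dict:
        """Move every granted packet through the crossbar in one step."""
        delivered = {}
        for out_port, in_port in self.arbitrate().items():
            _, payload = self.inputs[in_port].popleft()
            delivered[out_port] = payload
        return delivered


if __name__ == "__main__":
    sw = ToySwitch()
    sw.enqueue(0, 5, "A")
    sw.enqueue(1, 5, "B")  # contends with port 0 for output 5
    sw.enqueue(2, 7, "C")  # independent path, moves concurrently
    print(sw.cycle())
    print(sw.cycle())
```

In the usage example, packets A and C cross in the same cycle because they target different outputs, while B waits one cycle for output 5. That is the sense in which the crossbar provides multiple concurrent connections.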
Switch-Node Applications
The chip can be deployed as InfiniBand switch nodes that make up the fabric core. Each switch node has a throughput delay of roughly 100 ns with a 95% throughput efficiency. So the fundamental limit, aside from Local ID (LID) addressing, is the amount of latency that a system can tolerate. Systems of 16,383 nodes can be implemented. Each switch node links to other switch nodes, or to end nodes. The link protocol is the same for both links.
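To make the latency limit concrete, the sketch below adds up the per-switch delay along a path. It assumes the roughly 100-ns figure applies equally at every hop and ignores cable propagation, serialization, and end-node delays. The function names and the example budget are hypothetical.

```python
SWITCH_DELAY_NS = 100  # approximate per-switch delay quoted in the text


def fabric_latency_ns(switch_hops: int) -> int:
    """Switching delay accumulated across `switch_hops` switch nodes."""
    return switch_hops * SWITCH_DELAY_NS


def max_switch_hops(budget_ns: int) -> int:
    """Largest number of switch nodes a path can cross within a budget."""
    return budget_ns // SWITCH_DELAY_NS


if __name__ == "__main__":
    print(fabric_latency_ns(5))   # switching delay of a five-switch path, in ns
    print(max_switch_hops(1000))  # hops that fit in a 1-microsecond budget
```

A system that can tolerate a given fabric latency can therefore be sized by how many switch levels a worst-case path is allowed to cross.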
Another application is using the switch for a high-performance peripheral system backplane (Fig. 2). This approach is particularly effective for storage systems made up of multiple storage subsystems, like RAID, JBOD, and others.
The 4X ports can move data at 10 Gbits/s, or 1.25 Gbytes/s, between the switch and any storage peripheral, a bandwidth greater than that of any standard bus backplane. The switch can also support concurrent transactions between peripherals, along with an I/O connection for intrasystem access to the peripherals.
An InfiniBand-based peripheral system would integrate a host CPU with its memory, and an HDMP-2840 switch-node chip with up to three storage peripheral subsystems and an I/O subsystem (four 10-Gbit/s links). This system supports high-bandwidth data transfers from any of the peripheral subsystems to an I/O subsystem at 10 Gbits/s.
Moreover, this I/O subsystem can support multiple I/O connections, such as SCSI, Ethernet, and Fibre Channel, to external systems and networks. The CPU links to the switch node and peripherals via the node's CPU interface. Using multiple nodes, designers can build a multilayer switch backplane supporting more peripherals.
Or, designers can use these switch nodes to implement line-card backplanes. A single switch node supports up to eight line-card links, each with a peak bandwidth of 10 Gbits/s (1.25 Gbytes/s raw, or about 1 Gbyte/s of data after 8b/10b line coding). More complex backplane switch fabrics can be built using multiple switch levels.
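The text does not say how multiple switch levels would be wired. As one illustration, the sketch below sizes a two-level folded-Clos (fat-tree) arrangement, a common way to scale fixed-radix switches. That topology, the non-blocking half-up, half-down port split, and the function name are all assumptions, not part of the product description.

```python
def two_level_fabric(radix: int = 8) -> dict:
    """Size a non-blocking two-level folded-Clos fabric of `radix`-port switches.

    Assumed layout: each leaf switch uses half its ports for line cards
    and half for uplinks, with one uplink to every spine switch.
    """
    if radix < 2 or radix % 2:
        raise ValueError("radix must be a positive even number")
    down = radix // 2   # leaf ports facing line cards
    up = radix // 2     # leaf ports facing spine switches
    spines = up         # one spine switch per leaf uplink
    leaves = radix      # each spine port reaches a distinct leaf
    return {
        "leaf_switches": leaves,
        "spine_switches": spines,
        "line_card_links": leaves * down,
    }


if __name__ == "__main__":
    print(two_level_fabric(8))
```

With eight-port switch nodes, this arrangement would use 12 chips to serve 32 line-card links, compared with eight links for a single node.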
For serial switch backplanes, InfiniBand's advantage lies in building on an existing standard, rather than on a proprietary switch backplane. Better yet, this switch-fabric technology can be deployed at a higher level to link the line-interface box to the system or to other systems.