Fabric Acceptance


Ray Alderman

January 12, 2006


Looking into the future, it's clear which application segments will adopt which fabrics more readily.

Telecom equipment makers buy commodity technologies from commodity board makers. This is where PCI Express and ASI will see acceptance in low-level edge equipment. As 10-Gigabit Ethernet develops, it will become a commodity, and many other telecom applications will adopt it.

RapidIO will find acceptance in high-level deterministic critical applications across many application segments. Fire-control systems in military, deterministic applications in industrial controls, and telecom billing systems are most prevalent. The only critical application in telecom is the billing system: Your calls can be dropped, and you can't get any "bars" on your cell phone. But be assured that you won't get an extra minute on your call plan, and there's no way you'll make a free long-distance phone call.

InfiniBand, which will be heavily accepted in clustered Linux servers, is an excellent technology for "streaming I/O" interfaces. Military radar and sonar are perfect examples of critical streaming I/O applications. Clustered Linux servers aren't deterministic or used in critical applications, but the message-passing mechanisms in InfiniBand make it a very clean and efficient method of hooking up large multiprocessing systems.

As the fabric technologies mature, each one will move across the end-market segments, depending on the requirements of each application. But each fabric will enjoy adoption in specific applications where the other fabrics fall short.

The Problems With Fabrics
Ever since two people hooked up a couple of tomato cans with a string and talked to each other, we've been enchanted with serial connections in computers. Serial connections have some major advantages. They require a minimum of connector pins, they use less power than parallel buses, and they're simple to use and design. But serial has always lagged behind parallel interconnects in performance.

We used regular single-ended logic to implement serial interconnects many years ago. We moved to emitter-coupled logic to get more speed, and we adopted differential connections when the single-ended logic ran out of signal-to-noise margin.

We were told CMOS would never reach gigabit frequencies, so we continued to make parallel buses run faster (with things like incident-wave switching). We also made them wider (e.g., 16 bits, 32 bits, 64 bits) to increase their bandwidth.

But CMOS advanced to gigabit speeds. Low-voltage differential-signaling logic was developed and refined. Differential pairs made the serial links noise-resilient. And today, we have a host of high-speed serial differential technologies called fabrics. These "fabrics" promise to revolutionize computer architecture over the next few years. But a few problems, mostly in software, must be solved concurrently.

In the past, serial connections were primarily used as "I/O channels." A communications processor aggregated the I/O devices. That processor took the I/O traffic in and translated it to parallel data for the main CPU to handle. The same happened on the outgoing I/O traffic, but in reverse.

Creating I/O channels is pretty easy. PCI Express is all about changing the PCI-based parallel I/O to serial streams. Serial interconnects also can enable many different and efficient multiprocessing architectures, and that's where the trouble starts.

We're familiar with the linear models for buses like VME and PCI. You can calculate exactly what is going to happen in specific periods of time on every transaction with these linear-model buses. But fabrics are statistical models. What happens with one transaction depends on what else is occurring at the switches or on the nodes. The statistical nature of fabrics develops from the fact that all transactions on a fabric are split transactions, not continuous transactions like we find on buses.
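The contrast between a linear model and a statistical one can be sketched with a toy calculation. This is only an illustration under stated assumptions: the timing constants and the idea that delay grows with the number of packets already queued at a switch are hypothetical, not figures from any bus or fabric specification.

```python
import random

# All three constants are hypothetical values chosen for illustration.
BUS_CYCLE_NS = 100          # fixed time for one bus transaction
LINK_NS = 40                # uncontended trip through a fabric switch
PER_QUEUED_PACKET_NS = 25   # extra delay per packet already waiting


def bus_latency_ns() -> int:
    """Linear model: every transaction takes the same, known time."""
    return BUS_CYCLE_NS


def fabric_latency_ns(queued_ahead: int) -> int:
    """Statistical model: delay depends on other traffic at the switch."""
    return LINK_NS + queued_ahead * PER_QUEUED_PACKET_NS


def sample_fabric_latencies(count: int, max_queue: int, seed: int = 0) -> list[int]:
    """Draw latencies for transactions that each meet a random queue depth."""
    rng = random.Random(seed)
    return [fabric_latency_ns(rng.randint(0, max_queue)) for _ in range(count)]


if __name__ == "__main__":
    samples = sample_fabric_latencies(count=1000, max_queue=8)
    print("bus:", bus_latency_ns(), "ns every time")
    print("fabric: min", min(samples), "ns, max", max(samples), "ns")
```

The bus figure is a single number, while the fabric figure is only meaningful as a distribution, which is the point the paragraph above makes.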

Two topologies exist for fabric architectures: switched and mesh. In switched architectures, a central switch controls and routes traffic to and from the other nodes. This is where things get complicated. You can have single stars, double stars, Clos switches, and many other topologies. A mesh is an all-to-all topology. Every node in the network has dedicated channels to and from every other node in the architecture. This is very simple from a hardware and software standpoint, as there are few computer-science problems to deal with.
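A quick way to compare the two topologies is to count links. The sketch below assumes one bidirectional link per node-to-switch connection in a star and one bidirectional link per node pair in a mesh; the function names are hypothetical and the counts ignore redundancy and switch-to-switch links.

```python
def single_star_links(nodes: int) -> int:
    """Single star: each node has one link to the central switch."""
    return nodes


def double_star_links(nodes: int) -> int:
    """Double star: each node has one link to each of two switches."""
    return 2 * nodes


def full_mesh_links(nodes: int) -> int:
    """All-to-all mesh: one bidirectional link per unordered node pair."""
    return nodes * (nodes - 1) // 2


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, single_star_links(n), double_star_links(n), full_mesh_links(n))
```

The mesh needs no switch and no routing decisions, but its link count grows with the square of the node count, while the star topologies grow linearly.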

You can create three multiprocessing architectures with the high-speed serial fabrics. This is where we find the problems:

  • Tightly coupled/shared-everything (TCSE)
  • Snuggly coupled/shared-something (SCSS)
  • Loosely coupled/shared-nothing (LCSN)

When you look at the top fabric architectures in the market today (i.e., RapidIO, InfiniBand, PCI Express, and Ethernet), each fits into one of the above architectures, and each architecture has its own aberrant behavior.

TCSE
The only fabric that behaves like a tightly coupled/shared-everything architecture is Serial RapidIO. The RapidIO protocols have peer-to-peer mechanisms that tightly couple all devices: every processor in the network can directly access every I/O device in the network, regardless of where it resides. TCSE architectures are somewhat deterministic and reasonably predictable.

SCSS
In this architecture, the fabric offers no peer-to-peer mechanisms in its protocol stack. To access any of the I/O devices in the network, one processor communicates with the processor controlling that device locally (interprocessor communications). This is accomplished by sending "messages" to each other.

These messages go into a shared element, either memory or disk (hence, the shared "something"). The receiving processor reads the messages, accomplishes the requests, and sends back the answers. SCSS architectures are essentially message-passing systems with shared memory between the processors. This diminishes the determinism of the architecture. It's not even close to real time, unlike TCSE architectures.
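The request-and-answer flow described above can be sketched as a mailbox that other processors write into. This is a minimal model, not any fabric's actual protocol: the `Node` and `Request` classes, the device names, and the in-process queue standing in for shared memory are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    sender: "Node"   # processor that wants the data
    device: str      # name of a device owned by the receiving processor


class Node:
    """A processor that owns some local devices and exposes a mailbox."""

    def __init__(self, name, devices=None):
        self.name = name
        self.devices = dict(devices or {})  # local device name -> current value
        self.mailbox = deque()              # the shared "something"
        self.replies = []                   # answers received from other nodes

    def request_read(self, owner, device):
        """Ask the owning processor for a device value by posting a message."""
        owner.mailbox.append(Request(sender=self, device=device))

    def service_mailbox(self):
        """Read each message, do the local access, and send back the answer."""
        while self.mailbox:
            req = self.mailbox.popleft()
            value = self.devices.get(req.device)
            req.sender.replies.append((req.device, value))


if __name__ == "__main__":
    cpu_a = Node("cpu_a")
    cpu_b = Node("cpu_b", {"adc0": 17})
    cpu_a.request_read(cpu_b, "adc0")
    cpu_b.service_mailbox()   # nothing comes back until cpu_b gets around to it
    print(cpu_a.replies)
```

The requester gets nothing until the owner services its mailbox, so the response time depends on the owner's workload rather than on the link alone.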

This message-passing structure causes some interesting computer-science problems for the software folks. InfiniBand and PCI Express (Advanced Switching) both behave like SCSS architectures. Remember that InfiniBand has a remote DMA (RDMA) mechanism that sends "messages" to the other node's memory, and you can see the "shared-something" aspects of this architecture.

All transactions on an SCSS architecture are split transactions. When one processor sends data or a request to another processor, you have no guaranteed-delivery mechanism to ensure that the data ever arrives. If you write data to another CPU's memory, that processor must send back an "ACK," alerting the sender that it arrived.

If you're reading some memory location in another processor's space, the "ACK" is the transaction that returns the data. So in this type of architecture, some application code must reconcile all outstanding transactions, just as you reconcile your checkbook at the end of the month. Buses accomplish this with their handshake lines on writes.

On reads, you accomplish the transaction in one bus session, or there can only be one split transaction outstanding at any time. This is how PCI bridges handle the problem.
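The checkbook-style reconciliation of split transactions can be sketched as a ledger of requests that are still waiting for an ACK. This is a minimal illustration: the `TransactionLedger` class, its timeout rule, and the abstract time units are hypothetical and not taken from any fabric specification.

```python
import itertools


class TransactionLedger:
    """Tracks split transactions until the matching ACK arrives."""

    def __init__(self, timeout):
        self.timeout = timeout
        self._next_id = itertools.count(1)
        self._pending = {}  # transaction id -> (description, time issued)

    def issue(self, description, now):
        """Record an outgoing read or write and return its transaction id."""
        tid = next(self._next_id)
        self._pending[tid] = (description, now)
        return tid

    def acknowledge(self, tid):
        """Clear a transaction when its ACK arrives; False if it was unknown."""
        return self._pending.pop(tid, None) is not None

    def overdue(self, now):
        """List transactions whose ACK has not arrived within the timeout."""
        return sorted(
            tid
            for tid, (_, issued) in self._pending.items()
            if now - issued > self.timeout
        )


if __name__ == "__main__":
    ledger = TransactionLedger(timeout=5)
    write_id = ledger.issue("write block to cpu_b", now=0)
    read_id = ledger.issue("read status from cpu_b", now=1)
    ledger.acknowledge(write_id)
    print(ledger.overdue(now=7))  # only the unacknowledged read remains
```

Restricting the ledger to a single pending entry would correspond to the one-outstanding-split-transaction rule mentioned above for reads across PCI bridges.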
