Match Your Architecture To Your Application



System vendors face a number of architecture options as they research next-generation packet-processing technologies to meet future scalability and integration challenges. Two architectures are common: the generic multicore architecture and the special-purpose dataflow architecture.

Each architecture has its strengths. And, as is so often the case, each system vendor’s design decision boils down to the platform’s intended tasks. Essentially, it’s all about mapping architecture to application.

PACKET PROCESSING BACKGROUND

Packet processing, which is data intensive, calls for optimized hardware. In the early days, prior to broadband Internet, general-purpose processors were used for both control session processing and packet processing of the user traffic.

Sharing central processing unit (CPU) resources between the data and control planes proved difficult to scale, however, as bandwidth requirements grew. For switches and routers, data-plane packet processing was offloaded to custom fixed-function ASICs or programmable network processor units (NPUs). The general-purpose CPU was then freed up and dedicated to control-plane tasks.

Several NPU players have tried to optimize general-purpose processors for layer 2-4 packet processing and offer a multicore architecture with integrated network hardware (e.g., physical layer, media access controller, and table memory) as well as hardware engines for specific tasks (e.g., hashing). At the turn of the 21st century, companies like MMC, C-Port, and the Intel IXP division developed these types of devices.

While there were differences among them, they all shared the same basic architecture. By stripping out complexity, designers could simplify the processor cores enough to integrate tens of them into a single device, meeting a higher demand for parallelism.

With very few exceptions, these NPU ventures have failed commercially. Ultimately, they couldn’t efficiently meet the processing and memory access requirements of networking applications running faster than 10 Gbits/s.

Now, as we approach 2010, we see a new generation of multicore players addressing the network processing market. While CMOS technology, memory bandwidth, and clock cycle performance have evolved, the new devices still depend on the same basic architecture. Can these new players therefore expect greater success?

This will depend on which type of application they address. Today’s networking nodes not only process packets at layer 2-4; they must also process traffic at higher layers to support services and add security. Let’s explore the differences and why certain architectures are better than others for a given application.

WIRESPEED PACKET PROCESSING

Layer 2-4 packet processing differs from other network applications (Table 1). First, wirespeed processing for all packet sizes is a key objective. Modern routers and switches are designed with a broad set of network features that service providers expect to be available in parallel without performance degradation.

Second, the data plane views packets as independent entities, allowing for a high degree of parallel processing. For a 100-Gbit/s application, the NPU needs to process roughly 150 million minimum-size packets every second to guarantee wirespeed performance. At that rate, a 10-µs delay through the processor corresponds to the concurrent processing of 1500 packets.
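The figures in this paragraph can be checked with a few lines of arithmetic. The sketch below is illustrative only. It assumes minimum-size 64-byte Ethernet frames plus the standard 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter, and inter-frame gap), an assumption the article does not state explicitly. The function and constant names are made up for this example.

```python
# Assumption: Ethernet framing. Every frame on the wire carries an extra
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
WIRE_OVERHEAD_BYTES = 20
MIN_FRAME_BYTES = 64  # smallest legal Ethernet frame


def packets_per_second(link_bps: float, frame_bytes: int = MIN_FRAME_BYTES) -> float:
    """Worst-case packet rate for a link running at link_bps bits per second."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


def packets_in_flight(packet_rate: float, latency_s: float) -> float:
    """Packets being processed concurrently (arrival rate times latency)."""
    return packet_rate * latency_s


rate = packets_per_second(100e9)            # 100-Gbit/s link
in_flight = packets_in_flight(rate, 10e-6)  # 10-microsecond processing delay

print(f"packet rate: {rate / 1e6:.1f} Mpackets/s")
print(f"packets in flight: {in_flight:.0f}")
```

Under these assumptions the rate comes out just under 149 million packets per second, which the article rounds to 150 million. Multiplying the rounded rate by the 10-µs delay gives the quoted 1500 packets in flight.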

Third, data-plane programs require large I/O memory access bandwidth for forwarding table lookups, updates of statistics, and other processes. In high-speed platforms, packet inter-arrival times are very short, putting hard requirements on memory latencies. For small packets, the memory bandwidth to perform these tasks is several times the link bandwidth.
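To illustrate why small packets stress memory, the sketch below estimates the packet inter-arrival time and the ratio of memory traffic to link traffic. The per-packet access count and access size are hypothetical values chosen only for illustration; they are not figures from the article. The 20-byte Ethernet wire overhead is likewise an assumption.

```python
# Assumption: Ethernet framing with 20 bytes of per-frame wire overhead.
WIRE_OVERHEAD_BYTES = 20


def inter_arrival_ns(link_bps: float, frame_bytes: int = 64) -> float:
    """Time between back-to-back frames on a saturated link, in nanoseconds."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return bits_on_wire / link_bps * 1e9


def memory_to_link_ratio(accesses_per_packet: int,
                         bytes_per_access: int,
                         frame_bytes: int = 64) -> float:
    """Memory bytes moved per packet divided by the bytes the packet occupies on the wire."""
    memory_bytes = accesses_per_packet * bytes_per_access
    return memory_bytes / (frame_bytes + WIRE_OVERHEAD_BYTES)


# Hypothetical workload: 8 table lookups or counter updates per packet,
# each moving 32 bytes. These numbers are illustrative assumptions.
ACCESSES = 8
ACCESS_BYTES = 32

gap = inter_arrival_ns(100e9)
small = memory_to_link_ratio(ACCESSES, ACCESS_BYTES, frame_bytes=64)
large = memory_to_link_ratio(ACCESSES, ACCESS_BYTES, frame_bytes=1500)

print(f"inter-arrival time at 100 Gbit/s: {gap:.2f} ns")
print(f"memory/link ratio, 64-byte frames: {small:.2f}")
print(f"memory/link ratio, 1500-byte frames: {large:.2f}")
```

With these assumed values, minimum-size frames arrive about every 6.72 ns at 100 Gbits/s. The memory traffic is roughly three times the link traffic for 64-byte frames and well below the link traffic for 1500-byte frames, because the per-packet work stays constant while the packet shrinks.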

Finally, today’s networks consume a significant amount of power. For both operational cost and environmental reasons, service providers seek the highest performance per watt. Given the special characteristics of packet processing, architectural efficiency should be measured as performance per watt at wirespeed.

SERVICE AND SECURITY PROCESSING ATTRIBUTES

Service and security processing are markets adjacent to packet processing. These applications have different characteristics from packet processing at layer 2-4. Consequently, different hardware design optimizations can be made.

These applications terminate and process host-to-host protocols in a client-server manner or operate on re-assembled packet payload data in intermediate network nodes (e.g., firewalls, load balancers, and intrusion detection and prevention systems). These products must be able to operate across packet borders, as they typically need to carry out a larger number of operations on a broader set of data, resulting in a lower degree of data parallelism. On the other hand, these applications are less classification-intensive, requiring less I/O memory bandwidth relative to processed data.

COMPARING ARCHITECTURES

An NPU promises to provide the performance of a custom ASIC with the programmability of a general-purpose processor. However, comparing processor performance is difficult because the figures quoted are often theoretical maximums with little real-world relevance. Moreover, performance depends on how efficiently the available processing capabilities can be used, as well as on how well the I/O memory can be utilized in relation to the processing capacity.
