Consider An ATCA-Based Design For Carrier-Grade OSs

Dec. 18, 2003
The Advanced Telecom Computing Architecture offers designers coveted features such as high speed, scalability, open standards, and robust system management.

DESIGN VIEW is the summary of the complete DESIGN SOLUTION contributed article, which begins on Page 2.

Managing the design and development of a carrier-grade operating system (CGOS) is a major undertaking at any time. However, it's even more difficult in these economically challenging times, when lean teams race to meet looming deadlines under restrictive budgets that have been stretched thin. Add to this mix the plethora of emerging switched-fabric architecture options.

Where does one begin? Commercial off-the-shelf (COTS) standards-based hardware and software is one objective, because it's cost-effective and helps meet time-to-market. However, currently available COTS products don't offer a viable solution due to various performance-limiting drawbacks, such as insufficient board space, narrow board spacing, limited backplane throughput, and lack of scalability.

The advent of the Advanced Telecom Computing Architecture (ATCA) offers compelling reasons to select it as the platform for a CGOS: high-speed scalability to 2.5 Tbits/s, high availability, open standards, robust system-management features, and cost-effectiveness.

Based on PICMG 3.0, the ATCA shelf has up to 14 slots in a standard 19-in. rack, or up to 16 slots in a 23-in. or ETSI rack. Other features include front boards with a form factor of 8U by 280 mm, 1.2-in. (6HP) board spacing with a 0.1-in. board offset, a high-speed (5-Gbit/s) connector, and cooling for up to 200 W per slot. ATCA PICMG subspecifications exist for Ethernet, InfiniBand, StarFabric, and PCI Express.

This article focuses on ATCA's shelf mechanicals, including the backplane. Adopting a step-by-step design methodology and using a comprehensive specification like PICMG 3.0 for ATCA can ultimately be a springboard to the successful development of a CGOS.

HIGHLIGHTS:
Selecting A Thermal-Management Solution
Cooling is a major concern with an ATCA shelf design. Dissipating 200 W per slot for a total of 2800 W in a 19-in. rack-mount shelf is no small feat when employing air-moving devices like fans and blowers. Thus, the following decisions should be made before arriving at a thermal solution:
  • Board and shelf dissipation
  • Impedance to airflow offered by the boards
  • Air-filter requirement
  • Maximum chassis height
  • Redundancy characteristics in cooling
Topology Considerations
The key topologies of the ATCA specification are Dual-Star, Dual Dual-Star, and Mesh. Ultimately, the topology can have a great impact on overall system cost because the cards, backplane, etc., will be affected.
Shelf Management
Shelf managers developed for the PICMG 3.0 specification use the Intelligent Platform Management Interface (IPMI). Incorporating a shelf manager that individually controls the fans in the chassis will help maximize efficiency. For redundancy, the shelf manager can be designed as dual units in a hot-swap redundant mode. An interface board can be used for direct shelf-manager plugging.

Full article begins on Page 2

Managing the design and development of a carrier-grade operating system (CGOS) is a major undertaking at any time. But it’s even more difficult in these economically challenging times, when lean teams race to meet looming deadlines under restrictive budgets that have been stretched thin. Add to this mix the plethora of emerging switched-fabric architecture options pulling design engineers in different directions, and one begins to comprehend the complexities inherent in the task.

Where does one begin? At the very least, the objective should be to develop a carrier-grade platform using commercial off-the-shelf (COTS), standards-based hardware and software. The designer’s goal here is to implement an interoperable cost-effective solution for next-generation networks, while meeting time-to-market needs.

Unfortunately, currently available COTS products don’t offer a viable solution due to a host of performance-limiting drawbacks. Examples include insufficient board space to package the requisite functionality; narrow board spacing (pitch); limited backplane throughput; demanding levels of signal integrity and electromagnetic compatibility (EMC); inadequate system management modules (both hardware and software); and lack of scalability in capacity, reliability, and performance. The advent of the Advanced Telecom Computing Architecture (ATCA) offers compelling reasons to select it as the platform of choice for such a CGOS. ATCA offers the following:

  • High-speed scalability to 2.5 Tbits/s
  • High availability: reliability, availability, and serviceability (RAS) functionality by virtue of redundancy, failover, fault prediction, and prevention
  • Open standards
  • Interoperable third-party products contributing to a dynamic ecosystem
  • Robust system-management features
  • Scalability and cost effectiveness

Based on the PICMG 3.0 specification, which was ratified on Dec. 30, 2002, the following salient features pertain to an ATCA shelf:

  • Up to 14 slots in a standard 19-in. rack, or up to 16 slots in a 23-in. or ETSI rack
  • Front boards with a form factor of 8U by 280 mm
  • RTMs with a form factor of 8U by 70 mm
  • 1.2-in. (6HP) board spacing with a 0.1-in. board offset
  • High-speed (5-Gbit/s) connector
  • Cooling for up to 200 W per slot
  • Simplified sheet-metal construction
  • −48-V (central office) input power
  • Meets Network Equipment-Building System (NEBS) criteria
  • Mandatory Intelligent Platform Management Interface (IPMI)-based system management

There are currently four sub-specifications for ATCA: PICMG 3.1 for Ethernet, PICMG 3.2 for InfiniBand, PICMG 3.3 for StarFabric, and PICMG 3.4 for PCI Express. In addition, a new sub-specification for RapidIO over ATCA, PICMG 3.5, is in the works.

This article will focus on the shelf mechanicals, including the backplane. Once the decision is made to pursue an ATCA-based architecture for the CGOS, then the following parameters must be identified:

  • Number of fabric slots and node slots needed
  • Total power to be dissipated
  • For development or deployment
  • Redundancy of field-replaceable units (FRUs)
  • Fabric topology

The chassis height, board orientation, and thermal scheme are all predicated on the number of fabric and node slots. A smaller number of slots makes it possible to package them in a smaller chassis by orienting the boards horizontally and ensuring a side-to-side airflow. This maximizes the number of shelves that a cabinet can accommodate and lends scalability to the solution while keeping costs low. Whenever vertical orientation is chosen, the height of the chassis will also be influenced by the boards’ total power dissipation. If they dissipate close to 200 W per slot, then a taller chassis with intake and exhaust plenums should be considered. If the requirement calls for redundancy of FRUs, such as the fan trays, this will further affect the height and the thermal scheme.

A development ATCA chassis can be a valuable tool during the prototype and testing stages, when the switch cards and other devices have to be thoroughly tested and debugged. For such applications, a chassis with an integrated −48-V dc power supply is very convenient for the R&D engineer. The supply can be located on a tabletop with access to ac power. Figure 1 shows an example chassis.

The chassis must feature rugged construction. It should meet the shake and roll of a Zone 4 earthquake, while simultaneously providing the mechanical framework with tolerances to house the ATCA boards. It’s advisable to refer to PICMG 3.0 specifications for all of the do’s and don’ts inherent in designing an ATCA-based platform.

Selecting An Optimum Thermal-Management Solution

Cooling is a major concern in the design of an ATCA shelf, and justifiably so. Dissipating 200 W per slot for a total of 2800 W in a 19-in. rack-mount shelf is no small feat when employing air-moving devices like fans and blowers. Added to this is the NEBS-grade functionality that requires sufficient cooling at a high ambient temperature of 50°C. The analytical step-by-step approach briefly explained here is one of the tried and tested methods of identifying and adopting an optimum thermal solution. The following decisions should be made before arriving at a thermal solution:
  • Board and shelf power dissipation
  • Impedance to airflow offered by the boards
  • Air-filter requirement
  • Maximum chassis height possible
  • Redundancy requirements in cooling

Next, make a preliminary calculation of the volumetric airflow, in cubic feet per minute (cfm), needed to dissipate the total power. Base it on the rule of thumb of a 10° air-temperature rise.
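The preliminary airflow estimate can be sketched as a simple heat balance on the air stream. This is a minimal illustration, not the authors' own calculation: it assumes sea-level air properties and takes the 10° rise to be 10°C, and the function name `required_cfm` is hypothetical.

```python
# Sketch of a first-pass airflow estimate from a heat balance on the air stream.
# Assumptions (not from the article): sea-level air properties, 10 degC rise.

AIR_DENSITY_KG_M3 = 1.2        # approximate density of air near sea level
AIR_CP_J_PER_KG_K = 1005.0     # specific heat of air at constant pressure
CFM_PER_M3_PER_S = 2118.88     # unit conversion, m^3/s to ft^3/min


def required_cfm(power_w: float, delta_t_c: float) -> float:
    """Airflow (cfm) needed to remove power_w with an air temperature rise of delta_t_c."""
    if delta_t_c <= 0:
        raise ValueError("temperature rise must be positive")
    flow_m3_s = power_w / (AIR_DENSITY_KG_M3 * AIR_CP_J_PER_KG_K * delta_t_c)
    return flow_m3_s * CFM_PER_M3_PER_S


if __name__ == "__main__":
    # 14 slots at 200 W each, as in the 19-in. shelf discussed above.
    print(f"{required_cfm(14 * 200, 10.0):.0f} cfm")
```

Under these assumptions, a 2800-W shelf needs roughly 490 cfm of delivered airflow. This is a theoretical figure that does not yet account for the pressure losses in the system.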

Note that it’s very important to understand the system impedance. This impedance is a function of the combined pressure losses due to the air-intake vents, air filter, number of turns in the air-flow path, and the board topology. Densely populated chassis like ATCA can be expected to exhibit a higher static-pressure buildup, which can restrict airflow by as much as 60%.

It’s advisable to select the air filter carefully, because it’s usually a mandatory requirement for most central-office equipment. Any air-filter media selected should at a minimum be Bellcore-compliant in flammability rating. Because the theoretical cfm needed is determined at this point, the air-flow velocity in linear feet per minute (lfm) can be calculated based on the surface area in square feet. Then the filter media is selected by studying the initial resistance versus the face velocity data/curves. Lastly, other features such as frames, which affect the total open area for airflow, and handles are identified. Besides filtering unwanted dust and airborne contaminants, air filters facilitate more-laminar airflow by their inherent air-straightening capabilities.
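The face-velocity step described above is a division of airflow by the open filter area. The sketch below assumes hypothetical filter dimensions and a hypothetical open-area fraction to represent the frame; none of these numbers come from the article.

```python
def face_velocity_lfm(airflow_cfm: float, width_in: float, height_in: float,
                      open_area_fraction: float = 1.0) -> float:
    """Air velocity (linear ft/min) through a filter of the given size.

    open_area_fraction models the share of the face left open by the frame
    (an illustrative parameter, not a value from any specification).
    """
    if not 0.0 < open_area_fraction <= 1.0:
        raise ValueError("open_area_fraction must be in (0, 1]")
    area_ft2 = (width_in * height_in) / 144.0 * open_area_fraction
    if area_ft2 <= 0:
        raise ValueError("filter area must be positive")
    return airflow_cfm / area_ft2


if __name__ == "__main__":
    # Hypothetical 17-in. x 11-in. filter with 85% open area, 490 cfm required.
    print(f"{face_velocity_lfm(490.0, 17.0, 11.0, 0.85):.0f} lfm")
```

The resulting velocity is the value to look up on a filter vendor's initial-resistance curve to read off the pressure drop the filter adds to the system.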

Following air-filter selection should be a study of the performance curves (flow versus static pressure) for air movers that supply significant airflow at higher static pressure. Armed with a good idea of the total system impedance and overall airflow (cfm), the system operating point can be estimated. This is a very critical factor that will influence the fan selection, and, therefore, the cooling solution.
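One way to estimate the operating point is to find where the fan's performance curve crosses the system-impedance curve. The sketch below assumes an idealized linear fan curve and a quadratic system curve, both with made-up coefficients; real fan curves come from vendor data sheets and are generally not linear.

```python
from typing import Callable, Tuple


def operating_point(fan_curve: Callable[[float], float],
                    system_curve: Callable[[float], float],
                    q_max: float, tol: float = 1e-6) -> Tuple[float, float]:
    """Bisect for the airflow where fan pressure equals system pressure loss.

    Assumes the fan curve starts above the system curve at zero flow and
    ends below it at q_max, so exactly one crossing lies in between.
    """
    lo, hi = 0.0, q_max
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if fan_curve(mid) > system_curve(mid):
            lo = mid   # fan still has pressure to spare: flow can be higher
        else:
            hi = mid
        if hi - lo < tol:
            break
    q = (lo + hi) / 2.0
    return q, system_curve(q)


if __name__ == "__main__":
    # Hypothetical numbers: 1.0 in. H2O shutoff pressure, 100 cfm free-air flow,
    # and a system loss of k * Q^2 with k = 1e-4 in. H2O per cfm^2.
    fan = lambda q: 1.0 * (1.0 - q / 100.0)
    system = lambda q: 1e-4 * q * q
    q, p = operating_point(fan, system, q_max=100.0)
    print(f"operating point: {q:.1f} cfm at {p:.2f} in. H2O")
```

With these illustrative curves the fan delivers well below its free-air rating, which is why the text recommends air movers that hold their flow at higher static pressure.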

Various 48-V dc fans available today can move as much as 100 cfm of air under moderately high static pressures. Ball-bearing fans are often the best choice because they are quieter, have a longer life at elevated temperatures, and are very cost-effective and widely available. Blowers deliver a higher airflow under high static pressures, but they’re much louder and quite expensive. They also tend to be larger and need more packaging space. Hence, they should only be selected after careful consideration of all these factors.

Regardless of the type of air-moving device chosen, it should support monitoring and control features like tach output and pulse-width-modulation (PWM) input. This popular feature lets the fan speed be controlled as a function of the shelf temperature. The shelf should be equipped with multiple temperature sensors that monitor different areas and provide feedback to the shelf manager. Depending on the temperature data, the shelf manager will control the individual fans, speeding them up or slowing them down to obtain optimum cooling where needed.
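The temperature-to-fan-speed behavior described above can be sketched as a clamped linear ramp per cooling zone. The thresholds, duty-cycle limits, and zone names here are all hypothetical, and the code does not use any real shelf-manager or IPMI interface.

```python
from typing import Dict


def fan_duty_percent(temp_c: float, t_min_c: float = 30.0, t_max_c: float = 55.0,
                     duty_min: float = 30.0, duty_max: float = 100.0) -> float:
    """Map a sensor temperature to a PWM duty cycle with a clamped linear ramp."""
    if temp_c <= t_min_c:
        return duty_min
    if temp_c >= t_max_c:
        return duty_max
    fraction = (temp_c - t_min_c) / (t_max_c - t_min_c)
    return duty_min + fraction * (duty_max - duty_min)


def zone_duties(zone_temps_c: Dict[str, float]) -> Dict[str, float]:
    """Compute an independent duty cycle for each zone's fan."""
    return {zone: fan_duty_percent(t) for zone, t in zone_temps_c.items()}


if __name__ == "__main__":
    # Hypothetical readings from three sensors, one per fan tray.
    print(zone_duties({"left": 28.0, "center": 42.5, "right": 57.0}))
```

Driving each fan from its own zone's sensor, rather than running every fan at the speed the hottest zone needs, is what lets the shelf manager cool only where needed.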

It’s recommended to validate the cooling solution by employing simulation tools based on computational fluid dynamics (CFD). Once the thermal design is validated, and the shelf is designed and built, thermal testing should be done to further validate the findings as part of qualification testing. Figure 2 shows the Elma 14-slot ATCA chassis instrumented with thermal load boards and temperature sensors as part of such qualification testing. The chassis is cooled by a total of three plug-in fan trays. Each fan tray has dual 190-cfm, 48-V dc fans located under the cards, for a total of 1140 cfm before accounting for pressure losses. Figure 3a and Figure 3b document the temperature and airflow rise in different slots as a function of time.

Testing was done with the load being steadily ramped up from 0 to 200 W per slot in steps of 50 W. At 200 W, it shows a temperature rise ranging from 14° to 28°, depending on the slot location. This data clearly demonstrates the need for more airflow in some slots, and can help refine the design by using air baffles to direct more air in those areas. Another option would be to set up the shelf manager to accelerate the cooling fans for that area.

Clearly, thermal management plays a major part in a successful ATCA platform design. Another area critical to the shelf design is EMC, which can be tackled by adhering to conventional design techniques. This subject matter is not analyzed here.

Topology Considerations

The key topologies of the ATCA specification are Dual-Star, Dual Dual-Star, and Mesh. Naturally, the topology can have a great impact on overall system cost because the cards, backplane, and so forth, will be affected. For example, a Mesh topology can demand significantly more layers than a Dual-Star topology. With more point-to-point links, more layers must be added to achieve the signal routing. Therefore, Mesh topologies are often implemented in smaller systems (frequently in a horizontal-chassis configuration) where one can achieve high performance in less space.
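A rough count of slot-to-slot fabric links shows why a Mesh backplane tends to need more routing layers. The sketch below is a simplification: it counts one link per connected slot pair and assumes the two hubs in a Dual-Star are also linked to each other. It says nothing about the number of differential pairs per link.

```python
def dual_star_links(slots: int) -> int:
    """Links when two hub slots each connect to every node slot, plus one hub-to-hub link."""
    if slots < 2:
        raise ValueError("a Dual-Star needs at least two slots for the hubs")
    nodes = slots - 2
    return 2 * nodes + 1


def full_mesh_links(slots: int) -> int:
    """Links when every slot connects directly to every other slot."""
    if slots < 1:
        raise ValueError("slot count must be positive")
    return slots * (slots - 1) // 2


if __name__ == "__main__":
    for n in (5, 8, 14):
        print(n, "slots:", dual_star_links(n), "dual-star links,",
              full_mesh_links(n), "mesh links")
```

The Dual-Star count grows linearly with slot count while the Mesh count grows quadratically, which is consistent with Mesh being favored in smaller systems.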

In the PICMG 3.0 specification, the pc-board design has recommended values to ensure proper functionality over FR4 material. For example, the maximum backplane thickness is 6.35 mm, with no more than a 533-mm trace length. Although the design values are stringent, there are creative ways to ensure an optimal design. An experienced designer can improve the performance and reliability of the backplane, often cutting down the number of layers used and reducing costs. For example, in a Dual-Star or Dual Dual-Star configuration, the physical position of the hub slots on the backplane isn’t restricted. The position of the hub slots is critical because it will determine the maximum trace length on the backplane.

Bustronic and TreNew, backplane subsidiary divisions of Elma, hypothesized that placing the hub slots in the center of the backplane could reduce the trace lengths and number of layers. To prove this, we performed a simulation on a 14-slot Dual-Star configuration. The results showed that the maximum trace length could be cut in half, to about 270 mm. In turn, signal quality improved considerably because the losses due to dielectric and skin effect were much smaller. Further, placing the hub slots in the middle made intelligent routing solutions possible, which reduced the number of layers from 18 to only 12. Besides reducing cost, the smaller number of layers leads to a pc-board thickness of 3.2 mm, minimizing the stub influence and improving the signal quality.
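The geometric intuition behind this result can be sketched by comparing the worst-case horizontal distance from a hub slot to any other slot. This is only an illustration, not the simulation described above: it ignores vertical routing, connector breakout, and layer changes, and the hub positions chosen are assumptions.

```python
from typing import Iterable

SLOT_PITCH_MM = 30.48  # 1.2-in. board spacing from the ATCA feature list


def max_span_mm(hub_slots: Iterable[int], total_slots: int) -> float:
    """Largest horizontal distance from any hub slot to any physical slot position."""
    hubs = list(hub_slots)
    if not hubs or any(h < 1 or h > total_slots for h in hubs):
        raise ValueError("hub slots must lie within 1..total_slots")
    worst_pitches = max(abs(h - s) for h in hubs for s in range(1, total_slots + 1))
    return worst_pitches * SLOT_PITCH_MM


if __name__ == "__main__":
    edge = max_span_mm((1, 2), 14)     # hubs at one end of the backplane
    center = max_span_mm((7, 8), 14)   # hubs in the middle
    print(f"edge hubs: {edge:.0f} mm, center hubs: {center:.0f} mm")
```

With hubs at one end, the longest run spans 13 slot pitches; with hubs in the center, it spans 7. That ratio of roughly one-half matches the reduction reported for the simulated backplane.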

Figure 4a shows a TDR profile for the worst-case stub. It represents a trace situated in the first signal layer under the connectors. The minimum differential impedance is only 85 Ω. One may also notice that the measured backplane impedance is about 102 Ω. This very small tolerance of only 2 Ω results from strict conditions worked out using reliable and quality pc-board manufacturers. Measurements were performed using passive and active cards with real drivers that operate at 3.125 Gbits/s. The traces on the cards are 5 mils wide and 115 mm long. Figure 4b shows the eye diagram for the longest trace on the backplane situated in the worst-case layer.

The eye opening is about 509 mV, a strong result considering that the driver used for measurement requires at least a 200-mV opening. In a live system, performance is expected to be even better because additional noise introduced by the measurement cables and SMA contacts would be eliminated. Using standard FR4 material wasn’t a problem. Employing various transceivers and SerDes devices, simulation and performance measurements showed that speeds of over 5 Gbits/s were reliable using FR4.

A good ATCA backplane design will consider issues such as hub slot placement, trace lengths, and intelligent routing strategy to optimize layer count and reduce costs. Further, these backplanes should be designed with plugging considerations in mind, so that shelf managers, fan trays, and headers can receive signals from other Intelligent Platform Management Interface (IPMI)-enabled devices.

Shelf Management

Shelf management is a critical element in carrier-grade computing platforms. Lately, shelf managers developed for the PICMG 3.0 specification use the IPMI. To maximize efficiency, one can incorporate a shelf manager that individually controls the fans in the chassis. Figure 1 also shows the shelf managers incorporated in a 5U ATCA unit.

For redundancy, another important issue in carrier-class systems, the shelf manager can be designed as dual units in a hot-swap redundant mode. It would support redundant operation with an automatic switchover, where one shelf manager will be active, while the other is a backup unit.
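The active/backup arrangement can be illustrated with a small state model. This is a conceptual sketch only: the class names and the `healthy` flag are hypothetical, and it does not represent the IPMI protocol or any real shelf-manager firmware, which would typically rely on heartbeats and hardware signals to detect a failure.

```python
from dataclasses import dataclass


@dataclass
class ShelfManager:
    """Hypothetical stand-in for one shelf-manager unit."""
    name: str
    healthy: bool = True


class RedundantPair:
    """Tracks which of two shelf managers is active and which is the backup."""

    def __init__(self, primary: ShelfManager, backup: ShelfManager) -> None:
        self.active = primary
        self.standby = backup

    def poll(self) -> ShelfManager:
        """Switch over automatically if the active unit fails and the backup is healthy."""
        if not self.active.healthy and self.standby.healthy:
            self.active, self.standby = self.standby, self.active
        return self.active


if __name__ == "__main__":
    a, b = ShelfManager("ShMM-A"), ShelfManager("ShMM-B")
    pair = RedundantPair(a, b)
    print(pair.poll().name)   # ShMM-A is active while healthy
    a.healthy = False         # simulate a failure or hot-swap removal
    print(pair.poll().name)   # control passes to ShMM-B
```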

Finally, an interface board can be used for direct shelf-manager plugging. The interface board can be located either inside or outside of the ATCA card cage, offering great flexibility in design.

Developing a chassis that meets specific customer needs can be challenging for any specification, let alone a new one like ATCA. However, adopting a step-by-step design methodology and using a comprehensive specification such as PICMG 3.0 for ATCA can make the effort a springboard for the successful development of a CGOS.
