Electronic Design

  


[Technology Report]
High-Availability RTOSs Deliver Five-Nines Reliability
To work, multiprocessor systems and hot-swap hardware require high-availability RTOSs.

William Wong  |   ED Online ID #3595  |   October 29, 2001


New real-time operating-system (RTOS) enhancements make 99.999% availability and real-time application requirements achievable. Transaction processing, process control, communications switching, and air-traffic control are just a few examples of applications where no downtime can be tolerated. Such companies as Monta Vista, OSE Systems, QNX Systems, Red Hat, Lynuxworks, and Wind River Systems have added high-availability services to the list of modules that can be incorporated into an RTOS.
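To put the "five nines" figure in perspective, the short calculation below converts an availability percentage into a yearly downtime budget. It is only an arithmetic sketch: the function name is made up for illustration, and it assumes a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years are ignored


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime implied by an availability percentage."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    for label, pct in [("three nines", 99.9),
                       ("four nines", 99.99),
                       ("five nines", 99.999)]:
        minutes = downtime_minutes_per_year(pct)
        print(f"{label} ({pct}%): {minutes:.2f} minutes of downtime per year")
```

At 99.999%, the budget works out to a little over five minutes per year, which leaves almost no room for a manual reboot or a board replacement that takes the system offline.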

The technology of high-availability systems isn't new. IBM, Sun, Microsoft, and others have done it for years. Custom embedded systems have often utilized high-availability techniques through customized software instead of standardized OS support.

High-availability hardware isn't new either, but this type of hardware, such as RAID disk and tape support, is showing up in more embedded and real-time systems. Standard CompactPCI systems, like those from Force Computers, provide hot-swap board support. Likewise, network interconnects, including Ethernet and InfiniBand, give developers a choice of implementation methods. Today, off-the-shelf hardware can provide high-availability support with an off-the-shelf RTOS.

High-availability hardware systems generally feature:

  • Hot-swapping capability. This is available in computer boards, such as CompactPCI boards, and in disk and tape drives.
  • Multiprocessor links. Popular buses like InfiniBand and CompactPCI as well as networks like Ethernet include this feature.
  • A RAID (redundant array of independent disks) architecture, as found in disk and tape drives.

It's important to recognize the roles redundant hardware and hot-swapping play in a high-availability system (see "Hot-Swapping Is Only Part Of The Hardware Story," p. 44). A number of hardware technologies are available to implement high-availability systems.

Software support for high-availability systems is cropping up in a number of places (Fig. 1). Now, even an application programming interface (API) exists for CompactPCI.

Checkpointing, transaction support, and application heartbeat support are just some of the features being used with real-time systems. But the APIs aren't always standardized across vendors because each OS implements heartbeat support in a different fashion.

Checkpointing is the ability to save enough information from a process to restart it if it fails. Heartbeat support is the means of detecting that a process has failed, typically by noticing that its periodic "I'm alive" signal has stopped arriving.
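The two ideas can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's API: the `HeartbeatMonitor` class, the `save_checkpoint` and `load_checkpoint` helpers, and the JSON file format are all assumptions made for the example.

```python
import json
import os
import tempfile
import time


class HeartbeatMonitor:
    """Reports a process as failed when its heartbeats stop arriving."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock          # injectable so the logic can be tested
        self._last_beat = {}

    def beat(self, name):
        """Record that the named process is still alive."""
        self._last_beat[name] = self._clock()

    def failed(self):
        """Return the names whose last heartbeat is older than the timeout."""
        now = self._clock()
        return sorted(name for name, seen in self._last_beat.items()
                      if now - seen > self.timeout_s)


def save_checkpoint(path, state):
    """Write state to a temporary file, then atomically swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)   # readers see the old or the new file, never half


def load_checkpoint(path, default):
    """Return the last saved state, or the default if none exists yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "worker.ckpt")
    state = load_checkpoint(path, {"next_item": 0})
    state["next_item"] += 1
    save_checkpoint(path, state)
    print("a restarted worker would resume from", load_checkpoint(path, None))
```

A supervisor would call `failed()` periodically and restart any process it names; the restarted process then calls `load_checkpoint` to pick up where the failed one left off.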

Modularity is still the key aspect of high availability in an RTOS. One example is a partitioning of high-availability services that closely matches the OS, in this case Wind River's new VxWorks Foundation HA, which builds on the company's VxWorks AE RTOS (Fig. 2).

Other examples include Lynuxworks' Lynx/HA and Monta Vista's High Availability Framework, which add high-availability support to Linux-compatible and Linux operating systems, respectively. These additions have a modular construction similar to VxWorks Foundation HA.

Hardware may steal the limelight in numerous circuit designs, but high-availability hardware won't work without the correct software. More importantly, high-availability applications need to operate regardless of the kind of hardware available in the system. In particular, applications must continue working with other applications in the system, even if one application fails due to errant coding, a lack of resources, or other software-related problems.

In some cases, software failover support can be provided transparently. That's how many message-based systems operate.

In general, a high-availability system should have the following software services:

  • Heartbeat support for each server and each application.
  • Event management capability for change notification.
  • Alarm management for error handling.
  • Transaction capability for checkpointing and rollback/restart.
  • Clustering for server management and application links.
  • Reliable storage support for RAIDs and for journaling file systems.
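As a rough illustration of how two of these services fit together, the sketch below wires event management (change notification) to alarm management (error handling). The class names, event names, and severity levels are hypothetical and do not come from any of the RTOS products discussed here.

```python
from collections import defaultdict


class EventManager:
    """Delivers change notifications to whoever subscribed to them."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, callback):
        self._subscribers[event_type].append(callback)

    def publish(self, event_type, **details):
        # Copy the list so a callback may subscribe others while we iterate.
        for callback in list(self._subscribers.get(event_type, [])):
            callback(event_type, details)


class AlarmManager:
    """Tracks which error conditions are currently outstanding."""

    SEVERITIES = ("minor", "major", "critical")

    def __init__(self):
        self.active = {}

    def raise_alarm(self, source, severity, message):
        if severity not in self.SEVERITIES:
            raise ValueError(f"unknown severity: {severity}")
        self.active[source] = (severity, message)

    def clear_alarm(self, source):
        self.active.pop(source, None)


def wire_heartbeat_alarms(events, alarms):
    """Turn heartbeat events into alarms that are raised and later cleared."""
    events.subscribe(
        "heartbeat_lost",
        lambda _type, d: alarms.raise_alarm(d["node"], "critical", "heartbeat lost"))
    events.subscribe(
        "heartbeat_restored",
        lambda _type, d: alarms.clear_alarm(d["node"]))


if __name__ == "__main__":
    events, alarms = EventManager(), AlarmManager()
    wire_heartbeat_alarms(events, alarms)
    events.publish("heartbeat_lost", node="board-3")
    print("active alarms:", alarms.active)
    events.publish("heartbeat_restored", node="board-3")
    print("active alarms:", alarms.active)
```

Keeping the two services separate means the heartbeat code only has to announce what it observed; the policy for what counts as an alarm lives elsewhere and can change without touching the detector.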

With QNX, applications communicate with each other using a messaging system that is part of the RTOS' core services. The QNX message system supports transparent message-based services independent of its new high-availability support. The QNX link manager can detect a failed application and redirect messages to an alternate application (Fig. 3).

The link manager can utilize alternate paths between applications and start up a new application if necessary. Changes are handled based on an application's description of a link. QNX uses messaging for all major services, and messages move transparently across node boundaries (Fig. 3, again). Of course, this redirection works equally well between applications on the same node.
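The redirection idea can be sketched independently of any particular RTOS. The code below is not the QNX API; `LinkManager`, `ServerDown`, and `make_server` are hypothetical names, and ordinary Python callables stand in for applications that receive messages.

```python
class ServerDown(Exception):
    """Raised by a server that can no longer handle messages."""


class LinkManager:
    """Sends each message to the first healthy server on a link."""

    def __init__(self, servers, spawn=None):
        self._servers = list(servers)   # preferred server first, alternates after
        self._spawn = spawn             # optional factory for a fresh server

    def send(self, message):
        while self._servers:
            server = self._servers[0]
            try:
                return server(message)
            except ServerDown:
                self._servers.pop(0)    # drop the failed server, try the next one
        if self._spawn is not None:
            replacement = self._spawn() # no alternates left: start a new server
            self._servers.append(replacement)
            return replacement(message)
        raise ServerDown("no server available for this link")


def make_server(name, fail=False):
    """Build a toy server; a failing one raises ServerDown on every message."""
    def handle(message):
        if fail:
            raise ServerDown(name)
        return f"{name} handled {message}"
    return handle


if __name__ == "__main__":
    link = LinkManager([make_server("primary", fail=True), make_server("standby")])
    print(link.send("ping"))   # the client never learns that primary failed
```

The client only ever calls `send`; which server answers, and whether a new one had to be started, stays hidden behind the link, which is the sense in which the failover is transparent.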

Some RTOSs add messaging capabilities as part of their high-availability services. For example, Lynuxworks Lynx/HA includes message-oriented middleware that uses unicast, broadcast, and multicast transmissions for notification of system events. Lynuxworks also includes CORBA-compatible quality-of-service options.

