Improve Your Card Power System's Reliability

In addition to choosing the proper dc-dc converters, pay careful attention to the power-management design, and make sure you thoroughly qualify your system.

Date Posted: March 17, 2005 12:00 AM
Author: David Cooper

SYSTEM RELIABILITY IMPROVEMENT
Most anecdotal power reliability problems customers see can be traced back to weaknesses in system-level reliability—the component application and system qualification—rather than the fundamental MTBF of the components themselves. For example:

  • The production version of the card draws more peak current than expected, causing the voltage to drop under extreme conditions.
  • The power system shuts down unexpectedly in the field (nuisance trips).
  • The card fails at the customer site, but when it is returned for repair, no faults are found (NFF).
  • Sequencing between rails depends on component tolerances and doesn't always meet the needs of the ICs.
  • Sequencing during shutdown wasn't considered during the design.
  • The power system cannot deliver full load at extremes of input voltage and temperature.
  • Power modules overheat due to restricted airflow when the card is installed in the equipment.

    Although these types of problems can occasionally occur even in a well-designed system, the likelihood can be reduced significantly through careful design and thorough qualification testing. The table takes a closer look at these specific problems and offers tips on how they can be avoided.

    Good power-system design is a complex, multifaceted subject that touches on the entire product and its environment. Don't underestimate the task's complexity. Although the initial focus is usually on efficient power conversion, the power-management functions are equally important in achieving good power-system performance.

    MTBF IMPROVEMENT
    Three fundamental methods can improve the MTBF of any system: use fewer components, make the components more reliable, and make the system function even if components fail. Each can play a part in improving power-system reliability, together with comprehensive qualification testing.

    FEWER COMPONENTS
    Often, component count can be reduced in the power-management system. A dedicated power-management IC can replace a large number of discrete components used for monitoring and control, such as comparators, op amps, optocouplers, and RC time delays. At the same time, a power-management IC can offer much better performance than a discrete solution, improving system reliability by accurately reporting marginal performance while avoiding nuisance trips.

    For example, the Potentia PS-2610 measures each output rail voltage every 40 µs using an 8-bit analog-to-digital converter. The PS-2610 employs digital filtering to allow for fast response to a real overvoltage (OV) condition while preventing false OV or undervoltage (UV) shutdown due to voltage spikes.
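One simple way to picture this kind of digital filtering is a consecutive-sample counter: a fault is declared only after the rail has stayed outside its limit for a set number of samples in a row. The sketch below is an illustration under that assumption and is not the PS-2610's actual algorithm. The class name, thresholds, and sample counts are hypothetical; only the 40-µs sampling interval and the 8-bit code range come from the text.

```python
from dataclasses import dataclass

SAMPLE_PERIOD_US = 40  # sampling interval quoted in the article


@dataclass
class RailMonitor:
    """Debounced over/undervoltage detector for one rail (illustrative only)."""
    ov_code: int     # ADC code at or above which a sample counts as overvoltage
    uv_code: int     # ADC code at or below which a sample counts as undervoltage
    ov_samples: int  # consecutive OV samples required before tripping
    uv_samples: int  # consecutive UV samples required before tripping
    _ov_run: int = 0
    _uv_run: int = 0

    def update(self, code: int) -> str:
        """Feed one 8-bit ADC sample; return 'ok', 'ov', or 'uv'."""
        if not 0 <= code <= 255:
            raise ValueError("expected an 8-bit ADC code")
        # A run counter resets as soon as one sample is back inside the limit,
        # so an isolated spike never accumulates into a trip.
        self._ov_run = self._ov_run + 1 if code >= self.ov_code else 0
        self._uv_run = self._uv_run + 1 if code <= self.uv_code else 0
        if self._ov_run >= self.ov_samples:
            return "ov"
        if self._uv_run >= self.uv_samples:
            return "uv"
        return "ok"


def samples_for_delay(delay_us: int) -> int:
    """Number of consecutive samples that spans at least delay_us."""
    return -(-delay_us // SAMPLE_PERIOD_US)  # ceiling division
```

With the typical delays suggested in the table (about 1 ms for OV and 50 ms for UV), `samples_for_delay(1000)` gives 25 samples and `samples_for_delay(50000)` gives 1250 samples. A short OV window keeps the response to a real overvoltage fast, while the longer UV window rides through load transients.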

    A typical point-of-load (POL) converter contains fewer internal components than an isolated brick, and its failure rate can be significantly lower. The manufacturer's quoted failure rate for a typical POL is about 200 FITs (failures per billion hours, equivalent to an MTBF of 5 million hours), whereas a typical brick is about 500 FITs (an MTBF of 2 million hours). On the other hand, a POL usually has lower output power than a brick, so you may need more of them to meet your total power requirement. Of course, reliability is only one of many factors when choosing power converters. But by considering reliability early in the design, you can make the best tradeoff for your application.
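The FIT and MTBF figures above are related by simple arithmetic, since one FIT is one failure per billion device-hours. The sketch below reproduces the quoted numbers and shows how the comparison shifts when several POLs are needed. It assumes constant, independent failure rates that add when every converter must work. The three-POL case is a hypothetical example, not a figure from the article.

```python
HOURS_PER_BILLION = 1e9  # one FIT = one failure per 10^9 device-hours


def fit_to_mtbf_hours(fit: float) -> float:
    """Convert a failure rate in FITs to MTBF in hours."""
    return HOURS_PER_BILLION / fit


def series_fit(*fits: float) -> float:
    """Combined failure rate of parts that must all work (rates add)."""
    return sum(fits)


POL_FIT = 200.0    # typical POL, as quoted in the article
BRICK_FIT = 500.0  # typical isolated brick, as quoted in the article

print(fit_to_mtbf_hours(POL_FIT))    # 5000000.0
print(fit_to_mtbf_hours(BRICK_FIT))  # 2000000.0

# Hypothetical case: three POLs are needed to supply the same load.
three_pols = series_fit(POL_FIT, POL_FIT, POL_FIT)
print(three_pols)                     # 600.0 FITs, roughly 1.67 million hours MTBF
print(three_pols > BRICK_FIT)         # True
```

In this hypothetical case the three POLs together have a higher combined failure rate than a single brick, which is why converter count matters as much as the per-unit figure.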

    MORE RELIABLE COMPONENTS
    Component reliability is influenced primarily by the qualification and quality-control processes used in manufacturing, as well as by the stresses applied in the application. Power-conversion reliability can be improved with a modular approach, using standard off-the-shelf dc-dc converters as components in your design. These units, which are built in high volume using an automated process with full quality control, offer excellent performance and reliability. You will avoid the need to calculate component stresses within the power converter, because the design is optimized during the manufacturer's in-house qualification.

    Similarly, plan your power-management design around a dedicated power-management IC rather than a general-purpose device, like a gate array or microcontroller (MCU). A power-management design using an MCU or gate array requires extensive testing under both normal operation and fault conditions to ensure that logical errors in programming don't cause incorrect behavior. By contrast, the dedicated power-management device's behavior is already fully tested and qualified by the manufacturer. Only the operating parameters (voltage levels, time delays) require programming.

    FAULT TOLERANCE
    To dramatically improve system reliability, design the system to be fault-tolerant. In the ideal case, an available backup instantly takes over for any component failure, leaving system performance unaffected. The term availability expresses the proportion of time for which the system performs as expected. The provision of backup components is called redundancy. In a practical system, there are limits to the degree of redundancy that can be achieved, and availability can never reach 100%. Through careful design, redundancy can provide almost complete protection against any single fault, and it can achieve 99.999% (five nines) availability or better.
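To put numbers on these terms, the sketch below uses the common steady-state model in which availability equals MTBF divided by the sum of MTBF and mean time to repair (MTTR). That model, the independence assumption for the redundant pair, and the example MTBF and MTTR values are assumptions made for illustration and do not come from the article.

```python
HOURS_PER_YEAR = 8760.0
MINUTES_PER_YEAR = HOURS_PER_YEAR * 60


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability of a single repairable unit."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def redundant_pair(a: float) -> float:
    """Availability of two independent units where either one suffices."""
    return 1.0 - (1.0 - a) ** 2


def downtime_minutes_per_year(a: float) -> float:
    """Expected downtime per year for a given availability."""
    return (1.0 - a) * MINUTES_PER_YEAR


# Five nines allows only about 5.26 minutes of downtime per year.
print(round(downtime_minutes_per_year(0.99999), 2))

# Hypothetical card: 200,000-hour MTBF, 4 hours to detect and replace.
single = availability(200_000.0, 4.0)
print(single, redundant_pair(single))
```

The pair calculation ignores common-mode failures and any failure of the switchover itself, so it is an upper bound. It also assumes a failed unit is noticed and replaced within the MTTR, which is only true if failures are reported promptly.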

    Most redundant systems achieve redundancy by duplicating entire cards. For example, two identical control-processor cards can be used in a shelf, either of which can take control if the other fails. The 48-V distribution system also is duplicated, with dual 48-V feeds to each card from independent circuit breakers. If any individual circuit breaker trips, the cards still receive uninterrupted power through the second feed. In most cases, it's not considered beneficial to duplicate the on-card power system itself, since any card failure (power or otherwise) means simply replacing the card.

    For effective redundancy, it's vital to report all component failures immediately to the operator so that maintenance can take place before the backup fails. In the power system, this implies not only comprehensive monitoring of all output-voltage rails, but also monitoring of fuses and power feeds to detect any loss of redundancy. Additional monitoring such as input-current measurement and thermal sensing can provide advance warning of overload conditions and further improve reliability.

    While today's power systems are more complex, high reliability is achievable. Minimizing component count can improve the failure rate and yield a high calculated MTBF. Also, with effective power management, you can implement features that improve overall equipment reliability. Remember that reliability is much more than just MTBF. Carry out thorough qualification testing of your power system to ensure it meets equipment requirements under all conditions.

    POWER-SYSTEM PROBLEMS AND SOLUTIONS
    Each power problem below is followed by its suggested solutions.

    Card draws more current than expected
  • Maintain detailed system power estimates and update frequently during development. Include the effect of software updates.
  • Build in enough margin to allow for power increases during development.

    Nuisance trips
  • Do not use unnecessarily short time delays for fault detection. In typical systems, about 1 ms is suitable for OV and 50 ms for UV.
  • Use adequate decoupling in fault-detection circuits.
  • Carry out thorough system transient testing on the product, including ESD, EFT, and lightning tests as applicable.

    Cards are returned NFF
  • Consider including a fault log as part of the power-management system to improve diagnosis.

    Sequencing depends on component tolerances
  • Do not use time-based sequencing. Instead, voltage interlocking between rails helps to guarantee correct behavior.

    Incorrect shutdown sequencing
  • Review IC specs to determine whether shutdown sequencing is required. If so, include it as part of the power-management system.

    Cannot deliver full power at extremes
  • Design for worst-case combination of voltage and temperature.
  • Remember that input current is highest at minimum input voltage, particularly for battery systems.
  • Test at extremes, and include margin testing of all rails.

    Overheating in system
  • Carefully characterize the airflow in your system, including variation at extreme conditions.
  • Design for worst-case load, with adequate derating.
  • Follow the power-module supplier's guidelines.
  • Provide alarms for overtemperature and fan failure.
  • Make sure system testing represents the real environment and covers extremes of temperature, load, and airflow.
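The voltage-interlocking tip in the table can be sketched as a loop that enables each rail only after the previous one has been measured above its "good" threshold, instead of waiting a fixed time. This is a minimal illustration, not a vendor implementation. The function name, the `enable` and `read_voltage` callbacks, the rail names, and the thresholds are all hypothetical, and the hardware is simulated so the example can run on its own.

```python
from typing import Callable, Sequence, Tuple

Rail = Tuple[str, float]  # (rail name, minimum "good" voltage in volts)


def sequence_up(
    rails: Sequence[Rail],
    enable: Callable[[str], None],
    read_voltage: Callable[[str], float],
    max_polls: int = 1000,
) -> Tuple[bool, str]:
    """Enable rails in order; the next rail starts only once this one is good.

    Returns (True, "") on success, or (False, rail_name) naming the rail
    that never reached its threshold so the caller can shut down and log it.
    """
    for name, v_good in rails:
        enable(name)
        for _ in range(max_polls):
            if read_voltage(name) >= v_good:
                break  # interlock satisfied, move on to the next rail
        else:
            return False, name  # rail never came up within the poll budget
    return True, ""


# Simulated hardware (assumption): an enabled rail rises 0.5 V per reading
# until it reaches its nominal value.
targets = {"3V3": 3.3, "1V8": 1.8, "1V2": 1.2}
state = {}


def enable(name: str) -> None:
    state[name] = 0.0


def read_voltage(name: str) -> float:
    state[name] = min(targets[name], state[name] + 0.5)
    return state[name]


order = [("3V3", 3.0), ("1V8", 1.65), ("1V2", 1.1)]
print(sequence_up(order, enable, read_voltage))  # (True, '')
```

Because the order depends on measured voltages, it holds regardless of how fast each converter ramps. If the IC specs call for shutdown sequencing, a similar routine could walk the rails in the required order on the way down, and the returned rail name is a natural entry for a fault log.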