Bridge the Gap Between Hardware and Software in Power-Supply Design and Reliability
This article is part of the Power Management Series: Power Sources Manufacturers Association
Greg Evans, CEO of WelComm Inc. and PSMA Marketing Committee member, contributed to this article.
For nearly two decades, digital power-supply designs have been deployed in high-volume applications ranging from data-center equipment to consumer-grade devices. What exactly is implied by a “digital implementation” of a power supply can be a major topic of debate in and of itself. It can range from a kind of digital wrapper that captures and shares information from a power subsystem (e.g., usage analytics that can drive decision-making fed back into system and/or power-supply control), to fully digital control, with hybrid solutions in between (Fig. 1).
Almost any stakeholder who has worked on the development, debugging, testing, qualification, and/or system integration of such digital power-supply solutions has encountered costly schedule slips attributable to any number of causes. The challenges of reliably and pragmatically deploying digital power-supply solutions were the major impetus for the formation of the Power Sources Manufacturers Association (PSMA) Reliability Committee in July 2017.
Undertaking the Industry Challenge
The PSMA Reliability Committee’s work, co-chaired by Tony O’Brien of Cisco Systems and Brian Zahnstecher of PowerRox, began as a special project focused on the capture of issues, best practices, and executional challenges associated with digital power solutions with the assistance of a third-party subject matter expert (Dr. Hamish Laird of ELMG Digital Power). The Committee’s objective was to produce high-quality documentation and assemble it in a meaningful, digestible format for consumption by an extremely diverse set of global, industrial and academic stakeholders.
The project culminated in the creation of the 166-page, nine-chapter “2019 PSMA Power Supply Software/Firmware Reliability Improvement Report” (Fig. 2). This article will delve into the process and the ultimate objectives of the study—the reliable implementation of digital power-supply control.
Moving the Needle
Most people can quickly internalize why high reliability matters in power-supply designs, as all electronics are relegated to useless paperweights without power. On the other hand, answers to key questions such as “Why is this initiative critically important to the industry?” and “Why now?” may not be quite as intuitive.
Much of the PSMA Reliability Committee membership (especially at project onset) came from the computing and data-center original equipment manufacturer (OEM) world. These were the markets that originally (and sometimes begrudgingly) started implementing fully digital power solutions in significant volumes with substantial industry stakeholders.
This was triggered by ever-increasing density and efficiency requirements as the hyperscale world was coming online. The cost of power started not only to dominate operating expenses (OPEX), but even to dwarf capital expenses (CAPEX) and thus dominate the total cost of ownership (TCO).
Advanced topologies such as the LLC resonant converter and non-isolated multiphase dc-dc solutions (e.g., processor voltage regulators and memory power) provided an excellent compromise among the typical size/weight/power (SWaP) key metrics. However, they were very challenging to implement with traditional analog control and therefore drove a major push toward digitally controlled power supplies, particularly for front-end power-supply applications (also known as the ac-dc supply, rectifier, silver box, or bulk power supply).
As the latest high-end data-center power supplies made a relatively quick transition to digital control, it seemed that power-supply-related bottlenecks in system development and release schedules had some relation to this “digification” of the power supply. The root causes varied greatly: disconnects between hardware (HW) and software (SW) teams, a need to understand new stakeholder roles in the power-supply development process, and an industry need to evolve team-building and program-management processes.
This led to everything from specification gaps to unforecasted cost and testing, all of which translate into negative financial and schedule consequences. As stated in the report’s introduction: "Software engineering best practices are being adopted for digital power electronics control for power supplies to ensure reliability does not suffer with the transition to digital control. Some of these best practices are relatively new, evolving, and unfamiliar in the power electronics world."
The 2019 PSMA Power Supply Software/Firmware Reliability Improvement Report
The report is broken into sections with a logical flow that’s analogous to an “Eight Disciplines Problem Solving (8D)” process methodology [1]. It starts by breaking down the typical issues, challenges, and points of failure. These are complemented by many real-world case studies and design experiences that make the concepts relatable and consumable for a highly diverse readership with both technical and non-technical backgrounds.
That foundational background information is then translated into best practices across multiple design and management domains. These are further broken down and characterized by where they have the largest impact on specific stakeholder areas, such as design engineering, functional/validation/qualification specification generation, program management, and even more business-oriented stakeholders. For instance, effort was put into linking technical and operational development project challenges with assessments of impact on financial risk and time-to-market, where appropriate.
Much emphasis was placed on identifying and breaking down the communication barriers between SW and HW engineering, which turned out to be the true root cause of many other issues. As a result, the ability to drive systematic, industry-level improvement in HW/SW design engineering depends heavily on addressing these operational and communication gaps.
The report unapologetically takes this on with a deep focus on program-management (PM) processes, even down to the level of strategies for team member selection. PM is a crucial stakeholder area and the enabler of a multidisciplinary team dynamic conducive to successfully deploying systems that take advantage of all that digital power solutions have to offer. The report addresses the cultural divide directly with statements like: "One key cultural difference for management to address is the perception that software development culture has little history of testing, while power electronics relies too heavily on testing. Management of the differences in team cultures is key to success in digital power electronics."
The closing sections seek to apply the learnings and express them in the context of common industry vernacular and metrics. Because the subject matter frequently overlaps (though it is examined in different contexts), the reader is assisted by many links and references to the appropriate sections within the report. A full report preview summary with its table of contents and header (including Introduction) can be found at https://www.psma.com/publications.
Ongoing Work of the PSMA Reliability Committee
The PSMA Reliability Committee is a vibrant and active group of about 30 participants. The committee recently sponsored the first-ever Reliability Industry Session at APEC 2020, and is already making plans for one at APEC 2021.
At this very moment, the committee is busy working on a new special project (with third-party consultant and subject matter expert Bob White of Embedded Power Labs) to serve as a follow-on to this report. This new project puts far more of a focus on the communication bus reliability requirements in power supplies and the systems/networks with which they interact. This new report is targeted for release sometime later in 2020. More information on this activity and anything related to the PSMA Reliability Committee can be found at https://www.psma.com/technical-forums/reliability.
The Reliability Committee is always seeking to grow its membership with new members who wish to identify and contribute to opportunities that add value in the areas of quality and reliability. This charter casts quite a wide net and therefore requires an extremely diverse knowledge base, so stakeholders from many areas and industries are needed.
Join PSMA and Help Keep This Work Going
The PSMA works to enhance its members’ knowledge of technological and other developments related to power sources. It educates the electronics industry and academia, along with government and industry agencies, about the applications and importance of power sources and conversion devices. Its mission is to integrate the resources of the power sources industry to more effectively and profitably serve the needs of power sources users, providers, and PSMA members.
PSMA is made up entirely of industry volunteers who have an abiding desire to advance the interests of our industry and to grow their personal involvement. A modest annual membership fee gives member companies access to timely and essential committee reports as a free benefit of membership, discounts on registration fees at APEC and other PSMA-sponsored events, discounts on selected technical publications, and many other valuable benefits. All of these benefits accrue to member companies, and individuals who choose to serve on a PSMA committee gain the opportunity to network with others while aiding in our important work. Learn more about joining PSMA.
Brian Zahnstecher is PSMA Reliability Committee Co-chair.
Reference
1. Wikipedia contributors. “Eight disciplines problem solving.” Wikipedia, The Free Encyclopedia. [Online; cited March 1, 2020.] https://en.wikipedia.org/wiki/Eight_disciplines_problem_solving.