Voltage-Regulator IC Brings Vertical Power Delivery to Big AI Chips

Oct. 22, 2024
A deep dive into the challenges of AI power delivery and how startup Empower Semiconductor attempts to tackle them with its Crescendo family of voltage-regulator ICs.

The power demands of data centers are threatening to push the electric grid to its limits, as the latest AI and other high-performance workloads are driving power-per-rack specifications in data centers to over 100 kW.

But securing clean, reliable electricity to run these data centers is only one part of the AI power crisis. Tim Phillips, CEO of power startup Empower Semiconductor, said it’s becoming more of a challenge to cram all that power into the rack, the server, the circuit board, the accelerator card, and the AI processor at the heart of it all. He explained that the voltage-regulator modules (VRMs) designed to deliver smooth and stable power over the “last inch” of the power-delivery network (PDN) are no longer cutting it, creating substantial lateral transmission power losses.

Empower aims to solve some of the challenges of traditional power electronics with its latest family of voltage-regulator ICs, called Crescendo. The technology relocates the active power delivery directly under the GPU or other AI accelerator, slinging more than 3,000 A of current up through the PCB instead of across it.

“By moving to vertical power delivery, you can save up to 20% more power simply due to the location of the voltage regulator,” said Phillips, also the company’s founder. “You don't have to pass through all that PCB resistance, and it's also much easier to cool when you have a thin chip under the board.”

While all of the leading players in power semiconductors are taking aim at the AI power dilemma, Empower is attempting to beat them out with what it calls its integrated voltage regulator (IVR). The IVR comprises the entire voltage regulator in a single chip. These DC-DC converters feature the company’s ultra-fast FinFET-based power technology, unique control architectures, and advanced power packaging to run at higher frequencies and change their output voltage faster and more accurately.

Empower said its speed can remove most of the passive components responsible for regulating power to AI chips under dynamic load conditions and replace the others with its high-frequency magnetics and wide-bandwidth capacitors. “Crescendo is the culmination of all that,” noted Phillips.

Featuring 20X higher bandwidth than traditional DC-DC converters, the Crescendo platform can remove most of the magnetics on top of the PCB and the large number of capacitors under it that are used to smooth out the power racing into the processor. By eliminating these capacitors, Empower said that it can move into the real estate directly under the AI silicon, reducing the prohibitive lateral transmission loss and the heat that comes with it.

As power-system designers struggle to manage rising impedances in the PDN, wider voltage gradients within the processor’s power pins, larger load transients, and many other challenges posed by AI chips, Empower believes it’s in the right place at the right time with the Crescendo platform. “The industry is sort of scrambling to get through this critical impasse of how to power the next generation of AI chips,” explained Phillips.

The Complex Challenges of Supplying Power to AI Chips

Traditional power-delivery solutions are hitting a wall when it comes to high-performance AI silicon.

Today, central processing units (CPUs) in data centers siphon up to 300 W each from the power supply in the system. But the latest graphics processing units (GPUs), such as the H100 and H200 inside NVIDIA’s Grace Hopper superchip, are gulping down 700 W of power at peak times to handle AI and other computationally heavy workloads. As AI proliferates, NVIDIA is pushing the power envelope even further with its Blackwell B100 and B200 chips, which will run on as much as 1,200 W each.

On top of that, these state-of-the-art AI processors run on very small supply voltages—approximately 0.7 to 0.8 V at the most advanced process nodes—which increases the current racing into AI chips.
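The link between power, supply voltage, and current follows from I = P / V. As a rough sketch, the Python snippet below computes the current implied by the power figures quoted in this article at 0.7- and 0.8-V supply levels. The function name is illustrative, and the calculation assumes that all of a chip's power is drawn from a single core rail at that voltage, which is a simplification.

```python
def supply_current(power_w: float, supply_voltage_v: float) -> float:
    """Return the current (A) drawn at a given power (W) and voltage (V): I = P / V."""
    if supply_voltage_v <= 0:
        raise ValueError("supply voltage must be positive")
    return power_w / supply_voltage_v


# Power levels quoted in the article; voltages span the 0.7 to 0.8 V range.
for power_w in (300, 700, 1200):
    for voltage_v in (0.8, 0.7):
        amps = supply_current(power_w, voltage_v)
        print(f"{power_w:>5} W at {voltage_v} V -> {amps:,.0f} A")
```

Under this single-rail assumption, a 1,200-W chip at 0.8 V would draw 1,500 A, and the figure climbs further as the supply voltage falls.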

Voltage regulators are used to thrust thousands of amps of current smoothly and efficiently into the point of load (POL) while very tightly regulating the operating voltage used by the processor’s cores, which is expected to fall to 0.5 to 0.6 V in the future. While the Grace Hopper superchip draws up to 1,250 A of current per GPU, AI chipsets that consume 2,500 A or more are right around the corner, said Phillips.

These DC-DC buck converters are placed in close proximity to the AI processor. The main reason is that as current travels over the PCB between the voltage regulator and the load, it runs into parasitic resistance on the power rails.

Even a small amount of resistance between the DC-DC converter and the processor’s pins can lead to large transmission losses, also called I²R losses. According to Empower’s estimates, the Grace Hopper superchip wastes 80 W of transmission power per H200 GPU.

As the latest AI chips evolve to become even more power-hungry, the transmission, or I²R, losses in the PCB are rising out of control. These losses inevitably create heat that must be dissipated before it drags on the system’s performance. The PCB resistance on the power rails may also increase impedance, which can negatively impact the power integrity (PI) of the PDN and interfere with the smooth and stable delivery of power to the load.
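To see why these losses grow so quickly, note that resistive loss scales with the square of the current: P = I²R. The sketch below backs out the effective path resistance implied by two figures quoted in this article (80 W lost at 1,250 A per GPU) and applies it to higher currents. Pairing those two figures, and treating the power path as a single lumped resistance that stays constant as current rises, are simplifying assumptions made here for illustration rather than anything Empower has stated. The function names are hypothetical.

```python
def implied_resistance(loss_w: float, current_a: float) -> float:
    """Effective resistance (ohms) that dissipates loss_w at current_a: R = P / I^2."""
    return loss_w / current_a ** 2


def i2r_loss(current_a: float, resistance_ohm: float) -> float:
    """Power (W) dissipated in a resistance carrying current_a: P = I^2 * R."""
    return current_a ** 2 * resistance_ohm


# Figures quoted in the article: 80 W lost per GPU, 1,250 A drawn per GPU.
r_path = implied_resistance(80.0, 1250.0)
print(f"Implied path resistance: {r_path * 1e6:.1f} micro-ohms")

# Assumption: the same lumped resistance applies at higher currents.
for current_a in (1250, 2500, 3000):
    print(f"{current_a:>5} A -> {i2r_loss(current_a, r_path):6.1f} W lost")
```

Under these assumptions, an effective resistance of only about 51 micro-ohms accounts for the 80-W loss, and doubling the current to 2,500 A quadruples the loss to 320 W.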

While the voltage regulator is typically placed as close as possible to the load to limit these parasitic losses, it’s becoming more than a little inconvenient to fit all of the power electronics within the finite real estate on the PCB, said Phillips.

Lateral Power Delivery: Too Slow for AI Power Solutions?

Today, most AI chips are configured for lateral power delivery. In this setup, several voltage regulators flank the north and south or east and west sides of the processor, each one supplying tens to hundreds of amps of current to the load.

These multiphase DC-DC converters can handle the huge amounts of current used by AI chips, but to do so efficiently they tend to switch slowly, said Empower. That requires more power stages on the PCB to supply the current and more power inductors to smooth it out as it races into the processor.

Phillips said the higher currents require more power stages to supply them and more magnetics to manage it all, resulting in bulky DC-DC converters that can occupy more than 50% of the real estate on the top of the PCB alone.

“It's all beachfront property,” he explained. “As power and current continue to grow, they have to add more and more rows of power management around the processor itself, and as you do that, the power gets less effective. The further away it gets, the more power it wastes and the worse it is when regulating voltages. So, there is a limit, and we are sort of at that limit, where [there are] diminishing returns with these lateral power solutions.”

It makes more sense to relocate the active power delivery under the SoC to reduce the distance to the load and the resistance in the power’s path, which inevitably limits power losses and heat. But vertical power delivery tends to be impractical since voltage-regulator modules are typically too tall to fit between the circuit board and the heatsink, said Phillips.

They’re also crowded out by multilayer ceramic capacitors (MLCCs) placed under the processor to act as energy storage and smooth out power delivery during switching.

These DC-DC converters must handle the highly dynamic nature of AI workloads, which can result in large step changes in current, also called load transients and characterized by their rate of change, di/dt. These transient currents can create stress within the PDN. If the amount of current rushing into a high-performance AI chip or other load suddenly rises (for instance, to run at faster clock frequencies), the supply voltage within the system-on-chip (SoC) can plunge. This sudden drop is also called IR drop.

This is a problem because even slight differences in the supply voltage may cause non-trivial reductions in the processor’s performance or efficiency. Thus, the voltage regulator must be able to respond as fast as possible to restore the output voltage and limit the drop.

But since traditional voltage regulators are relatively slow, they must be paired with large banks of capacitors directly under the processor, occupying the most power-sensitive real estate in the system and resulting in substantial lateral transmission power losses, said Phillips.

These decoupling capacitors store small amounts of energy and release it as the load demands to steady the output voltage so that the processor can perform at its best. In addition to reducing the parasitic inductance and capacitance in the power’s path, these passive components dampen ringing and other high-frequency noise to make sure the output voltage is unaffected by ripple that can be present in power-supply signals.
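The trade-off between regulator speed and capacitor count can be illustrated with a first-order model. If the capacitors alone must carry a current step until the regulator responds, the voltage falls by roughly ΔV = ΔI · Δt / C, so the capacitance needed for a given allowed drop scales directly with the response time. The sketch below is only a back-of-the-envelope illustration: the 500-A step, the 30-mV allowed drop, and the 1-µs baseline response time are hypothetical values chosen for the example, the 20X factor is the article's bandwidth figure applied loosely to response time, and capacitor resistance (ESR) and inductance (ESL) are ignored.

```python
def required_capacitance(step_a: float, response_s: float, max_drop_v: float) -> float:
    """Capacitance (F) needed to hold the voltage drop within max_drop_v.

    First-order model: the capacitors carry the full current step for
    response_s seconds, so dV = I * dt / C and C = I * dt / dV.
    ESR and ESL are ignored.
    """
    if max_drop_v <= 0:
        raise ValueError("allowed voltage drop must be positive")
    return step_a * response_s / max_drop_v


# Hypothetical example values, not figures from Empower.
STEP_A = 500.0                          # size of the load step
MAX_DROP_V = 0.030                      # allowed voltage drop
SLOW_RESPONSE_S = 1e-6                  # assumed baseline regulator response time
FAST_RESPONSE_S = SLOW_RESPONSE_S / 20  # assumed 20X faster response

for label, response_s in (("slow", SLOW_RESPONSE_S), ("fast", FAST_RESPONSE_S)):
    cap_f = required_capacitance(STEP_A, response_s, MAX_DROP_V)
    print(f"{label} regulator: {cap_f * 1e3:.2f} mF of decoupling needed")
```

In this simplified model, a regulator that responds 20 times faster needs one-twentieth of the capacitance for the same allowed drop, which is the intuition behind trading capacitor banks for regulator bandwidth.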

Inside Crescendo: Empower’s Vertical Power Delivery IC

Empower is trying to overcome the obstacles to vertical power delivery with its Crescendo platform.

Because it can respond to load changes in nanoseconds rather than microseconds, according to Empower, Crescendo reduces the requirements for energy storage both above and beneath the PCB. Empower said it can effectively eliminate the high-frequency decoupling capacitors under the SoC, replacing them with voltage regulators in thermally enhanced packages that, at 1 to 2 mm tall, can fit within the space constraints under the circuit board. These IVRs are paired with Empower’s wide-bandwidth capacitors and high-frequency magnetics to fill any gaps in energy storage.

To stay a step ahead of the rising power requirements of AI, Crescendo is modular and scalable. According to Empower, it can integrate up to 50 voltage regulators into a 12-V input DC-DC converter that’s able to deliver more than 3,000 A to the processor above.
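The scaling arithmetic here is simple enough to sketch. Assuming the regulators share the load current equally (an assumption for illustration; the article does not describe how current is balanced), the snippet below works out the per-regulator share and how many regulators a given current target would need. The function names are hypothetical.

```python
import math


def per_regulator_current(total_a: float, regulator_count: int) -> float:
    """Current (A) each regulator carries if the load is shared equally."""
    if regulator_count <= 0:
        raise ValueError("regulator count must be positive")
    return total_a / regulator_count


def regulators_needed(total_a: float, per_regulator_a: float) -> int:
    """Smallest number of regulators that can supply total_a at per_regulator_a each."""
    if per_regulator_a <= 0:
        raise ValueError("per-regulator current must be positive")
    return math.ceil(total_a / per_regulator_a)


# Figures quoted in the article: up to 50 regulators delivering 3,000 A.
share_a = per_regulator_current(3000.0, 50)
print(f"Equal share at 3,000 A across 50 regulators: {share_a:.0f} A each")

# The 2,500-A load mentioned earlier in the article, at the same per-regulator share.
print(f"Regulators needed for 2,500 A: {regulators_needed(2500.0, share_a)}")
```

With equal sharing, 3,000 A across 50 regulators is 60 A apiece, and a 2,500-A load would need 42 regulators at that same share.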

“These devices are directly connected to each other, and they are very dense so the solution can fit right within the footprint of the AI chip,” said Phillips. “Then you get direct vertically coupled power where there is close to no impedance between [the IVR] and the GPU, so we can supply the current on demand. To do that you have to have high bandwidth and be able to change currents and voltages very quickly, which we can do.”

Empower stated that Crescendo comprises several core technologies that are all about speed. These range from its CMOS power technology and high-speed control architectures to its high-frequency magnetics, wide-bandwidth capacitors, and advanced power packaging.

The IVR at the heart of Crescendo is based on Empower’s FinFast power technology. FinFast transforms the FinFET transistors in most high-performance chips into high-voltage, high-current power cells. Based on a complementary metal-oxide semiconductor (CMOS) process, FinFast can handle several hundred amps of current efficiently and run at the FinFET’s very fast switching speeds. “That had never been done before,” said Phillips, adding that most of the company’s more than 100 patents pertain to CMOS power.

The startup uses its unique power-control architecture to control the current sensing, level shifting, switching, and other fundamentals of voltage regulation at these high frequencies while limiting noise. Crescendo’s IVR is a DC-DC converter that runs on the 12-V bus voltage used for PCB power distribution and reduces it to a 3- to 4-V intermediate voltage before stepping it down to the processor’s core voltage, typically less than 1 V.
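One way to see the appeal of an intermediate voltage is through the duty cycle of an ideal buck converter, D = V_out / V_in. The sketch below compares a single 12-V-to-core step with the two-step path described above. Note the assumptions: the article does not state the topology of either stage, so the ideal buck relation is used purely for illustration, and the 3.5-V intermediate value, 0.75-V core voltage, and 1-MHz switching frequency are example values picked from the ranges mentioned in this article.

```python
def buck_duty_cycle(v_in: float, v_out: float) -> float:
    """Duty cycle of an ideal (lossless) buck converter: D = V_out / V_in."""
    if not 0 < v_out <= v_in:
        raise ValueError("a buck converter requires 0 < v_out <= v_in")
    return v_out / v_in


def on_time_ns(duty_cycle: float, switching_hz: float) -> float:
    """High-side switch on-time per cycle, in nanoseconds."""
    return duty_cycle / switching_hz * 1e9


# Example values (assumptions, see lead-in).
V_BUS = 12.0    # PCB distribution voltage quoted in the article
V_MID = 3.5     # midpoint of the 3- to 4-V intermediate range
V_CORE = 0.75   # example core voltage below 1 V
F_SW_HZ = 1e6   # example switching frequency

stages = (
    ("12 V -> core (single stage)", V_BUS, V_CORE),
    ("12 V -> intermediate", V_BUS, V_MID),
    ("intermediate -> core", V_MID, V_CORE),
)
for label, v_in, v_out in stages:
    duty = buck_duty_cycle(v_in, v_out)
    print(f"{label}: D = {duty:.3f}, on-time = {on_time_ns(duty, F_SW_HZ):.1f} ns")
```

In this idealized view, a single stage would need a duty cycle of about 6%, while each of the two stages runs above 20%. Short on-times become harder to control as switching frequency rises, which is one commonly cited reason for splitting a large step-down ratio.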

While Crescendo removes many of the high-frequency capacitors in the system, it doesn’t eliminate them all. To replace the remaining ones, Empower also rolled out a family of high-performance silicon chips called ECAPs that can serve as wide-bandwidth capacitors with minimal resistance (ESR) and inductance (ESL). In most cases, these silicon capacitors are placed directly on the processor’s substrate to serve as bypass capacitors that can reduce noise generated by the switching outputs of the power device.

Empower said it’s also integrating the silicon capacitors into the Crescendo platform, placing them around and in between the voltage regulators to keep voltage bounce down. High-frequency magnetics are being built into Crescendo as well, with the company’s air-core power inductors placed inside the power package, in the silicon, and/or on the board depending on the requirements for energy storage. The company’s advanced packaging is the key to connecting all of these power-supply components at such high frequencies, said Phillips.

By co-designing and co-optimizing all of these technologies, Empower said it can more closely integrate the power-supply components in Crescendo, creating a denser and more efficient method for AI power delivery.

Powering Up: The Pivot to Vertical Power Delivery

Empower is one of several power semiconductor companies tackling the challenges of AI power delivery.

Most of the movers and shakers in the power industry are racing to roll out voltage regulators that can use vertical power delivery to handle the huge amounts of power used by AI silicon. However, in most cases, these companies are stacking the power devices, the magnetics, the capacitors, and other components of the power supply on top of each other, resulting in towers of power electronics that are up to 10 mm tall, said Phillips.

The challenge, he explained, is figuring out how to fit these power modules under the circuit board. “The problem is these components are very tall, and the current is forced to travel all the way up and down them at slower frequencies,” which inevitably increases the parasitic losses in the system and adds to the difficulty of dissipating heat. “This also prevents them from removing the capacitors because they haven’t changed the fundamental bandwidth of the solution; they have just made it smaller by stacking it.”

By removing most of the capacitors under the PCB and replacing them with its voltage regulators, Empower claimed Crescendo supports up to 5X higher total solution density at the same power level. Since it’s not as tall as other voltage regulators, the IVR is also relatively easy to cool and enables placement closer to the heatsink. “We have a close to direct connection between the voltage regulator, the heatsink, and the board,” said Phillips.

“This new technology is breaking the mold since power chips have traditionally operated at less than 1 MHz,” stated Phillips. “We can operate significantly faster than that to change the game of where the power can be placed.”

These high speeds also enable tighter voltage margining, reducing power burn. Typically, high-end processors run at higher core voltages so that they can tolerate sudden drops in voltage due to load transients or ripples in the power supply. Empower said Crescendo can respond to these situations in nanoseconds instead of microseconds, reducing voltage drop. As a result, the voltage margining can be decreased, directly saving power.
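A rough sense of what a smaller voltage margin is worth comes from the common approximation that a CMOS chip's dynamic power scales with the square of its supply voltage at a fixed clock frequency. The sketch below applies that approximation to a hypothetical case in which 30 mV of margin is removed from a 0.80-V supply. Both voltages are example values, not Empower figures, and leakage power, which follows a different relationship, is ignored.

```python
def relative_dynamic_power(v_before: float, v_after: float) -> float:
    """Ratio of dynamic power after a supply-voltage change at fixed frequency.

    Uses the approximation P_dynamic ~ C * V^2 * f, so the ratio is
    (v_after / v_before) ** 2. Leakage power is ignored.
    """
    if v_before <= 0 or v_after <= 0:
        raise ValueError("voltages must be positive")
    return (v_after / v_before) ** 2


# Hypothetical example: trim 30 mV of margin from a 0.80-V supply.
V_WITH_MARGIN = 0.80
V_TRIMMED = 0.77

ratio = relative_dynamic_power(V_WITH_MARGIN, V_TRIMMED)
print(f"Dynamic power after trimming: {ratio:.1%} of original")
print(f"Approximate saving: {1 - ratio:.1%}")
```

Under these assumptions, trimming 30 mV from 0.80 V cuts dynamic power by roughly 7%, which shows why even small reductions in margin matter for a chip drawing hundreds of watts.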

“You get better transient performance with lower power loss, which means better thermals, and which means that the AI silicon can run faster and give you more throughput. All that can lead to higher performance per watt in the system,” said Phillips.

About the Author

James Morra | Senior Staff Editor

James Morra is a senior editor for Electronic Design, where he covers the semiconductor industry and new technology trends. He also reports on the business behind electrical engineering, including the electronics supply chain. He joined Electronic Design in 2015 and is based in Chicago, Illinois.