As the semiconductor industry moves
through the deep-submicron process nodes,
each plateau along the way carries its own
signature bugaboo arising from physical
effects. At 180 nm, timing-closure issues got
everyone's attention. At 130 nm, signal
integrity was the topic of the day. At 90 and
65 nm, though, power integrity and leakage
are weighing on designers' minds. We now pack so many active
elements onto such a small slab of silicon that power density
has reached near-critical mass. For example, according to Srikanth Jadcherla, founder and CTO of ArchPro Design Automation, a die measuring 1 by 1 cm with power consumption of 1 W dissipates the equivalent of 10 GW per square kilometer, or 25 GW per square mile.
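The unit conversion behind that comparison is easy to check. The short Python sketch below scales 1 W over 1 cm² up to a square kilometer and a square mile. The function name and constants are illustrative additions, not something taken from Jadcherla. It gives 10 GW per square kilometer and just under 26 GW per square mile, in line with the rounded figures quoted above.

```python
# Scale a die's power density up to landscape-sized areas.
CM2_PER_KM2 = 1e10        # (1e5 cm per km) squared
KM_PER_MILE = 1.609344    # international mile, exact by definition


def power_density_gw(power_w, die_area_cm2):
    """Return (GW per square km, GW per square mile) for a die."""
    w_per_cm2 = power_w / die_area_cm2
    gw_per_km2 = w_per_cm2 * CM2_PER_KM2 / 1e9
    gw_per_mi2 = gw_per_km2 * KM_PER_MILE ** 2
    return gw_per_km2, gw_per_mi2


per_km2, per_mi2 = power_density_gw(power_w=1.0, die_area_cm2=1.0)
print(f"{per_km2:.1f} GW/km^2, {per_mi2:.1f} GW/mi^2")
```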
Along with enormous increases in power density comes
the physics of the submicron realm. With narrower feature
sizes come thinner gate insulators, and that translates into
leakage power. Leakage across gates is a condition in which
the gate never shuts entirely off. Rather, it continues to consume power even though it's in a nominally passive state. At
the 65-nm node, leakage can constitute more than 40% of
the overall power consumption of a system-on-a-chip (SoC)
or ASIC.
Unfortunately, leakage has a symbiotic, and positively
reinforcing, relationship with temperature. Leakage begets
heat, which begets more leakage, which begets even more
heat. And, in worst-case scenarios, thermal runaway can
ensue, leading to potential fires and/or explosions in end-user systems.
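To make the feedback loop concrete, here is a toy Python model. It is a sketch under loud assumptions, not a sign-off method: the die is treated as a single thermal node with one junction-to-ambient thermal resistance, and leakage is assumed to double for every fixed rise in temperature. Every number below (the 20°C doubling interval, the thermal resistances, the 150°C limit) is hypothetical and chosen only to show the two possible outcomes.

```python
def settle_temperature(p_dynamic_w, p_leak_ref_w, theta_ja_c_per_w,
                       t_ambient_c=25.0, t_ref_c=25.0,
                       doubling_c=20.0, t_limit_c=150.0, max_iter=500):
    """Iterate the leakage/temperature loop for a single-node die model.

    Returns (temperature_c, converged). converged is False when the
    temperature passes t_limit_c, which this toy model treats as runaway.
    All default values are illustrative assumptions.
    """
    t = t_ambient_c
    for _ in range(max_iter):
        # Assumed leakage law: doubles every `doubling_c` degrees.
        p_leak = p_leak_ref_w * 2.0 ** ((t - t_ref_c) / doubling_c)
        # Heat flows to ambient through one lumped thermal resistance.
        t_next = t_ambient_c + theta_ja_c_per_w * (p_dynamic_w + p_leak)
        if t_next > t_limit_c:
            return t_next, False      # heat outpaces cooling: runaway
        if abs(t_next - t) < 1e-6:
            return t_next, True       # loop has settled
        t = t_next
    return t, False


# Same chip, two hypothetical packages: one cools well, one does not.
for theta in (20.0, 40.0):
    temp, ok = settle_temperature(1.0, 0.2, theta)
    print(f"theta_ja={theta} C/W ->", "settles" if ok else "runs away")
```

With the better package, the loop settles at a stable temperature. With the poorer one, each pass adds more leakage heat than the package can remove, and the model never finds an operating point.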
Thus, heat is indeed an enemy that must be faced head-on.
Fortunately, designers can turn to a number of tools and
methodologies for prediction and management of thermal
effects. In this article, we'll explore some of the thermal-analysis
methods that help unearth problem areas. We'll also discuss
some best practices in the thermal-management arena.
LEAKAGE IS A KEY
In addition to its exponential relationship with temperature, leakage is at the root of more subtle, yet no less pernicious, effects. Chief among these are
problems brought on by electromigration, which are exacerbated by the higher current densities.
Then there's the broader issue of thermal variation across
a given die's planar dimensions—even in the Z dimension
between metal layers. Not only do disparities exist in temperature at a great many points on and within the die, but
those variations are far from constant. As major functional
blocks turn on and off, switching activity will have an ongoing effect on the die's thermal characteristics.
THE PERFECT STORM
There is, in fact, an interconnected maze of effects brought about by temperature variation
that involves timing, signal integrity, and reliability.
As mentioned, temperature has a positive feedback loop with
power and leakage. But it also affects timing by weakening
the driving capability of devices. Higher temperatures mean
an increase in the passive resistance of interconnects, which
in turn increases delays.
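A first-order way to gauge that resistance effect is the linear temperature-coefficient model, R(T) = R_ref × [1 + α(T − T_ref)]. The Python sketch below uses an approximate textbook coefficient for bulk copper as an assumption. Real thin-film interconnect has a process-dependent coefficient, and the helper names are hypothetical.

```python
ALPHA_CU_PER_C = 0.0039   # approximate bulk-copper coefficient near 20 C (assumed)


def wire_resistance(r_ref_ohm, t_c, t_ref_c=20.0, alpha=ALPHA_CU_PER_C):
    """Linear model: R(T) = R_ref * (1 + alpha * (T - T_ref))."""
    return r_ref_ohm * (1.0 + alpha * (t_c - t_ref_c))


def rc_delay_scale(t_c, t_ref_c=20.0, alpha=ALPHA_CU_PER_C):
    """Ratio of wire RC delay at t_c to delay at t_ref_c.

    Assumes capacitance does not change with temperature, so the
    delay simply tracks resistance.
    """
    return wire_resistance(1.0, t_c, t_ref_c, alpha)


print(f"RC delay scale at 100 C: {rc_delay_scale(100.0):.3f}")
```

Under these assumptions, a wire at 100°C is roughly 30% more resistive than the same wire at 20°C, and its RC delay grows by the same fraction.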
Temperature affects IR drop and electromigration
primarily through Joule heating, or self-heating, of the interconnects, which is itself aggravated by the increased
resistance of the wires at elevated temperatures. The circuit's electromigration lifetime degrades exponentially with
rising temperatures. On the power and ground grids, that increased resistance leads to larger IR drops,
meaning more power consumption.
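The exponential dependence of electromigration lifetime on temperature is commonly modeled with Black's equation, MTTF = A × J^(−n) × exp(Ea / kT). The article does not name a model, so treat this Python sketch as one conventional choice. The activation energy used here is an assumed, illustrative value, and real values depend on the metal and the process.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5   # Boltzmann constant in eV/K


def em_lifetime_ratio(t_cool_c, t_hot_c, activation_ev=0.9):
    """Black's equation lifetime ratio MTTF(t_cool) / MTTF(t_hot).

    Current density is held equal at both temperatures, so the A and
    J**(-n) terms cancel. activation_ev is an assumed value.
    """
    t_cool_k = t_cool_c + 273.15
    t_hot_k = t_hot_c + 273.15
    exponent = (activation_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_cool_k - 1.0 / t_hot_k)
    return math.exp(exponent)


ratio = em_lifetime_ratio(85.0, 105.0)
print(f"Lifetime at 85 C is about {ratio:.1f}x the lifetime at 105 C")
```

With the assumed activation energy, a 20°C rise in wire temperature cuts the predicted lifetime several-fold, which is why local hot spots matter as much as the average die temperature.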