The continuing evolution toward higher-performance
microprocessor units (MPUs) has revolutionized the design
of computers large and small. This evolution has generally
followed Moore’s law—the semiconductor industry doubles
transistor density every two years while increasing performance
with each new generation. Increased performance has
contributed to a rise in microprocessor chip power dissipation
and power density.
An example of the heightened power dissipation can be
found in the 2007 edition of the International Technology
Roadmap for Semiconductors (ITRS). It says there is now a
maximum power dissipation of approximately 120 W due to
package cost, reliability, and cooling cost issues.¹
Starting with this ITRS power-dissipation statement, the
4.7-GHz MPU clock frequency is projected to increase by a
factor of at most 1.25 times per technology generation. Power
dissipation is estimated to reach 200 W/cm² by the end of the
2008 ITRS timeframe. MPUs that continue using existing circuit
and architecture techniques would exceed package power
limits by a factor of nearly 4 by the end of 2020.
SOME OPTIONS TO TRY
One approach to cutting power dissipation is to reduce power-supply
voltage, a reduction driven by shorter transistor channel
lengths and the reliability limits of gate dielectrics. Even with
lower supply voltage, total power consumption will continue
to increase, driven by higher chip operating frequencies,
higher overall interconnect capacitance and resistance, and the
increasing gate leakage of on-chip transistors, whose numbers grow
exponentially as they are scaled down.
MPUs must control their operating temperature, which
affects reliability as defined by their failure rate over the useful system
life, expressed in failures per 10⁶ hours (Fig. 1). The Arrhenius reliability
model states that failure rate is a function of the temperature
stress: the higher the stress, the higher the failure rate. Typically,
each 10°C rise in temperature causes a 50% increase in
the failure rate. Conversely, by the same rule, cutting the operating temperature
by 10°C reduces the failure rate by about one-third.
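The 10°C rule of thumb can be turned into a quick estimate. The following is a minimal sketch, assuming the 50% figure compounds geometrically for each 10°C step. The function names are illustrative, and the 0.7-eV activation energy in the Arrhenius helper is an assumed placeholder, since the real value depends on the failure mechanism.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def rule_of_thumb_multiplier(delta_t_c, increase_per_10c=0.5):
    """Failure-rate multiplier for a temperature change of delta_t_c (in °C).

    Assumes every 10°C step multiplies the failure rate by
    (1 + increase_per_10c), compounding geometrically.
    """
    return (1.0 + increase_per_10c) ** (delta_t_c / 10.0)


def arrhenius_acceleration(t_ref_c, t_new_c, activation_energy_ev=0.7):
    """Arrhenius ratio of the failure rate at t_new_c to that at t_ref_c.

    activation_energy_ev is an assumed placeholder, not a value
    taken from the article.
    """
    t_ref_k = t_ref_c + 273.15
    t_new_k = t_new_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (
        1.0 / t_ref_k - 1.0 / t_new_k
    )
    return math.exp(exponent)


# Compare a 10°C cut with 10°C and 20°C rises under the rule of thumb.
for delta in (-10, 10, 20):
    print(f"{delta:+d} °C: failure rate x{rule_of_thumb_multiplier(delta):.2f}")
```

Under this assumption, a 20°C rise multiplies the failure rate by 1.5 × 1.5 = 2.25, while a 10°C cut divides it by 1.5.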
Thus, failure rate and its inverse, mean time between failures
(MTBF), are measures of thermal-management effectiveness
in electronic systems. In dealing with thermal problems,
the electronic system designer will have to enter the domain of
the packaging and thermal design engineer.
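Because MTBF is the inverse of failure rate, converting between the two is simple arithmetic. The sketch below assumes a constant failure rate quoted in failures per 10⁶ hours. The baseline rate of 2.0 is a hypothetical number chosen only for illustration.

```python
def mtbf_hours(failures_per_million_hours):
    """Convert a failure rate in failures per 10^6 hours to MTBF in hours.

    Assumes a constant failure rate, so MTBF is simply its reciprocal.
    """
    if failures_per_million_hours <= 0:
        raise ValueError("failure rate must be positive")
    return 1_000_000 / failures_per_million_hours


baseline_rate = 2.0                # hypothetical: failures per 10^6 hours
hotter_rate = baseline_rate * 1.5  # 50% increase for a 10°C rise

print(f"baseline MTBF: {mtbf_hours(baseline_rate):,.0f} h")
print(f"+10°C MTBF:    {mtbf_hours(hotter_rate):,.0f} h")
```

With these illustrative numbers, a 10°C rise shortens the MTBF from 500,000 hours to roughly 333,000 hours.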
Besides reliability and performance issues, a microprocessor’s
thermal management also involves economic and
mechanical challenges. Cost is obviously an important consideration.
Equally important are size considerations when trying
to accommodate increasingly higher-power microprocessors,
especially in laptop computers.
“Most of today’s high-performance microprocessors use
an area array, flip-chip interconnect scheme to connect the
active (circuit) side of the die to an organic or ceramic package
substrate. The package substrate is either soldered to the
computer motherboard through a grid array of solder joints or
has pins that are inserted into a socket that is soldered to the
motherboard (another alternate socket is the land grid array
socket where socket fingers contact pads on the surface of the
package),” says R. Mahajan, et al.²
“In all cases, when dealing with high cooling demand, and
in attempting to establish cooling envelopes, a
reasonable first-order assumption is that the bulk
of the heat will have to be removed from the inactive
side that is farther away from the motherboard.
Given the limited airflow and the presence
of significant amounts of lower thermal conductivity
organic material on the active side, this is a
reasonable first assumption,” Mahajan continues.
“There are two thermal design architectures,”
says Mahajan (Fig. 2). “Architecture I is
one where a bare die interfaces to the heatsink
solution through a thermal interface material
(TIM) and Architecture II is one where an integrated
heat spreader (IHS) is attached to the
die through the use of a TIM and the heatsink
interfaces to the IHS through a second TIM. Architecture I has a lower profile compared
to Architecture II and is often used
for microprocessors in mobile and handheld
computers. Architecture II is typically
used for microprocessors in desktop and
server applications.”
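One way to compare the two architectures is to treat each as a stack of thermal resistances in series between the die and the air. The sketch below assumes a simple one-dimensional heat path and ignores heat spreading in the IHS. Every resistance value is hypothetical and chosen only to show the bookkeeping.

```python
# All values are hypothetical thermal resistances in °C/W, chosen only to
# illustrate the bookkeeping; they are not measurements from any product.
ARCHITECTURE_I = [("TIM", 0.10), ("heatsink to air", 0.30)]
ARCHITECTURE_II = [
    ("TIM 1", 0.10),
    ("IHS", 0.05),
    ("TIM 2", 0.05),
    ("heatsink to air", 0.30),
]


def stack_resistance(layers):
    """Sum series thermal resistances along a one-dimensional heat path."""
    return sum(resistance for _name, resistance in layers)


def temperature_rise(layers, power_w):
    """Die temperature rise above ambient (°C) for a given power in watts."""
    return stack_resistance(layers) * power_w


for label, layers in (("I", ARCHITECTURE_I), ("II", ARCHITECTURE_II)):
    print(f"Architecture {label}: {stack_resistance(layers):.2f} °C/W")
```

The series model only shows how each added layer contributes to the total. It does not capture the spreading benefit of an IHS, so it should not be read as a verdict on which architecture cools better.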
HEATSINKS
The most widely used thermal-management
device, the heatsink, transfers heat
by conduction from a microprocessor to a
specially constructed metal plate. The most
common heatsink type has many metal
fins. The metal’s high thermal conductivity
and large surface area transfer the heat
from the microprocessor to the heatsink
and then to the surrounding air. The heatsink’s
ability to transfer heat depends on its
material, geometry, and overall surface heat
transfer coefficient.
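The effect of surface area and heat transfer coefficient can be estimated with the standard convection relation R = 1 / (h × A). This is a first-order sketch that ignores fin efficiency and conduction within the metal. The coefficient and areas below are hypothetical values for illustration.

```python
def convective_resistance(h_w_per_m2_k, area_m2):
    """Surface-to-air thermal resistance in °C/W, using R = 1 / (h * A).

    h_w_per_m2_k: overall surface heat transfer coefficient, W/(m^2·K)
    area_m2:      total exposed surface area, m^2
    """
    if h_w_per_m2_k <= 0 or area_m2 <= 0:
        raise ValueError("coefficient and area must be positive")
    return 1.0 / (h_w_per_m2_k * area_m2)


h = 50.0  # hypothetical coefficient for forced air, W/(m^2·K)
for area in (0.02, 0.05, 0.10):  # hypothetical fin surface areas, m^2
    print(f"A = {area:.2f} m^2 -> R = {convective_resistance(h, area):.2f} °C/W")
```

Doubling either the surface area or the heat transfer coefficient halves this resistance, which is why finned geometries and forced airflow are so common.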
Heatsink material is usually aluminum
or copper. Copper is more expensive and
heavier than aluminum, while
aluminum has the advantage of
being more easily formed and shaped into
different geometries. Heatsinks with fins
come in many forms: extruded, cold forged,
die cast, milled, bonded, and folded. Some
heatsinks consist of a series of round pins
force-fit into a baseplate.
A key parameter in using a heatsink is
the thermal resistance of the associated
microprocessor package, which measures how
strongly the package resists conducting heat away
into the surrounding environment. A design goal
is a low thermal resistance value, which, for a given
amount of power, allows the microprocessor’s
junction to operate at an optimum temperature
and provides a longer useful life.
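The link between thermal resistance, power, and junction temperature is the familiar relation Tj = Ta + P × θja. The sketch below uses the 120-W figure cited from the ITRS. The ambient temperature, junction limit, and θja value are assumptions chosen for illustration, and the function names are hypothetical.

```python
def junction_temperature(ambient_c, power_w, theta_ja_c_per_w):
    """Junction temperature in °C from Tj = Ta + P * theta_ja."""
    return ambient_c + power_w * theta_ja_c_per_w


def max_thermal_resistance(junction_limit_c, ambient_c, power_w):
    """Largest junction-to-ambient resistance (°C/W) that stays within the limit."""
    if power_w <= 0:
        raise ValueError("power must be positive")
    return (junction_limit_c - ambient_c) / power_w


power = 120.0    # W, the ITRS maximum cited earlier in the article
ambient = 35.0   # °C, assumed ambient temperature inside the enclosure
theta_ja = 0.5   # °C/W, assumed junction-to-ambient resistance
limit = 100.0    # °C, assumed junction temperature limit

print(f"Tj = {junction_temperature(ambient, power, theta_ja):.1f} °C")
print(f"max theta_ja = {max_thermal_resistance(limit, ambient, power):.3f} °C/W")
```

With these assumed values the junction sits at 95°C, and the whole cooling path must stay below about 0.54°C/W to keep a 120-W part under a 100°C limit.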