Remember when thermal analysis meant getting your
prototype back and deciding if you might need to throw in
a couple of heatsinks and a fan for good measure? Try that
approach now and you may find yourself in deep water and without
a paddle. After all, heat can hamper electrical performance and
ultimately reduce the mean time between failures.
Back in my engineering heyday, I never put much thought
into thermal analysis because it just wasn’t necessary, and I
know I’m not alone. But with semiconductors dissipating
greater amounts of power (and therefore heat) per area than
ever, coupled with continued system shrinkage over time, more
system engineers who don’t perform thermal analysis are winding
up in hot water.
“A lot of functions that used to be spread across several
components are now contained in a single component,” says
Dave Rosato, lead product manager for Ansys. “So now, the
heat density is much greater for those SoC-type (system-on-a-chip)
components.”
“The rules of thumb that engineers used to design a board
five and 10 years ago just don’t apply to today’s designs,” continues
Rosato. “Years ago, the board was ignored as a heat transfer
path. Now you must account for all heat transfer paths.”
The “simple solution” is to perform thermal analysis sooner
in the design cycle. How soon? At the least, you should perform
a rudimentary analysis just after the
block diagram stage. You’ll need to download
the datasheets for the components you plan
to use and get a feel for future challenges
from a thermal standpoint.
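
As a rough illustration of that first pass, the sketch below applies the standard steady-state estimate TJ = TA + P × θJA to a list of parts and flags any that land close to their rated maximum. The part names, power figures, thermal resistances, and the 10°C margin are hypothetical placeholders; real numbers would come from the datasheets, and the estimate ignores board-level heat paths and airflow.

```python
def junction_temp_c(ambient_c, power_w, theta_ja_c_per_w):
    """Steady-state estimate: TJ = TA + P * theta_JA (all temperatures in deg C)."""
    return ambient_c + power_w * theta_ja_c_per_w


def flag_hot_parts(parts, ambient_c, margin_c=10.0):
    """Return (name, estimated TJ) for parts within margin_c of their TJ max.

    Each entry in parts is (name, power_w, theta_ja_c_per_w, tj_max_c).
    margin_c is an assumed safety margin, not a datasheet value.
    """
    flagged = []
    for name, power_w, theta_ja, tj_max_c in parts:
        tj = junction_temp_c(ambient_c, power_w, theta_ja)
        if tj > tj_max_c - margin_c:
            flagged.append((name, tj))
    return flagged


if __name__ == "__main__":
    # Hypothetical parts: (name, dissipated power in W, theta_JA in C/W, TJ max in C)
    example_parts = [
        ("regulator", 1.5, 40.0, 125.0),
        ("soc", 3.0, 30.0, 105.0),
    ]
    for name, tj in flag_hot_parts(example_parts, ambient_c=50.0):
        print(f"{name}: estimated TJ {tj:.0f} C is too close to its limit")
```

Any part that shows up in the flagged list at this stage is a candidate for the more detailed simulation work described next.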
If that analysis points to potential trouble,
you need to consider using some thermal-analysis
simulation software and possibly
even working with a materials company to
determine if it can engineer something that
will suit your design parameters.
“DANGER, WILL ROBINSON!”
I own a laptop that recently stopped working
because the fan integrated with the heatsink/
heatpipe combination no longer gets
powered correctly. Even with the case open
and plenty of cool air all around, the unit
won’t power up and the “Fan error” message
appears before it even performs the typical
power-on self-test (POST).
The laptop immediately shuts down when it senses the fan isn’t
powered on. The assumption is that the average laptop user won’t
pop the case open in a nice air-conditioned room, so without the fan
the CPU would experience the often fatal “thermal runaway.”
The downside to this approach is that my entire system is shot
because the fan (or the underlying power source to the fan)
isn’t working.
This is a good example of a laptop manufacturer deciding
that under no circumstances is the CPU to ever run without
forced air blowing on the attached heatsink. This design was
engineered with these requirements because the laptop designers
knew that improper thermal management meant imminent
doom. In fact, Intel and AMD take this problem very seriously.
For example, “If the external thermal sensor detects a catastrophic
processor temperature of 125°C (maximum), or if
the THERMTRIP# signal is asserted, the VCC supply to the
processor must be turned off within 500 ms to prevent permanent
silicon damage due to thermal runaway of the processor,”
says the January 2008 edition of the datasheet for Intel’s Core
2 Duo Processor.
“Maintaining the proper thermal environment is key to reliable,
long-term system operation. A complete thermal solution
includes both component- and system-level thermal management
features,” according to the datasheet.
“To allow for the optimal operation and
long-term reliability of Intel processor-based
systems, the system/processor thermal
solution should be designed so the
processor remains within the minimum
and maximum junction temperature (TJ)
specifications and the corresponding thermal
design power (TDP) value,” it notes.
“Caution: operating the processor outside
these operating limits may result in
permanent damage to the processor and
potentially other components in the system,”
the datasheet concludes.
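
The quoted limits translate naturally into a few checks that a bench test script might make. The sketch below is only an illustration: the 125°C and 500 ms figures come from the datasheet passage quoted above, while the function names, the boolean stand-in for the THERMTRIP# signal, and the TJ limits passed in are assumptions, not Intel’s implementation.

```python
CATASTROPHIC_TEMP_C = 125.0    # external-sensor trip point from the quoted passage
MAX_VCC_OFF_DELAY_MS = 500.0   # VCC must be off within this window after a trip


def within_tj_spec(tj_c, tj_min_c, tj_max_c):
    """True if the junction temperature sits inside the datasheet's TJ window.

    tj_min_c and tj_max_c must be read from the datasheet; none are assumed here.
    """
    return tj_min_c <= tj_c <= tj_max_c


def must_cut_vcc(sensor_temp_c, thermtrip_asserted):
    """True if either shutdown condition from the quoted passage is met."""
    return thermtrip_asserted or sensor_temp_c >= CATASTROPHIC_TEMP_C


def shutdown_met_deadline(trip_time_ms, vcc_off_time_ms):
    """True if VCC was removed no more than 500 ms after the trip event."""
    delay_ms = vcc_off_time_ms - trip_time_ms
    return 0.0 <= delay_ms <= MAX_VCC_OFF_DELAY_MS
```

For instance, a logged trip at 1,000 ms followed by VCC dropping at 1,400 ms would pass the deadline check, while VCC dropping at 1,600 ms would not.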