Where performance matters, today’s
computers are going multicore.
Whether it’s a desktop or a server, more
cores are showing up in both the compute
engine and the graphics hardware, giving
users everything from more lifelike
video to solutions for computationally
complex problems.
This year, three products stood out.
Intel’s six-core Xeon pushes the envelope
for the typical operating platforms such
as Linux and Windows. The Tesla C1060
opens Nvidia’s multicore GPU (graphics
processing unit) to programmers to do
more than just graphics. For graphics rendering,
AMD’s ATI Radeon HD 4870 X2 puts
two multicore GPUs on a single board,
which reduces overhead for communication
when the two chips cooperate to
render a single video stream.
SIX-CORE XEON: MULTICORE WORKHORSE
Intel’s (www.intel.com) “Dunnington”
7400 series Xeon chip delivers four or six
cores in a single package (Fig. 1). It’s the last of
the Penryn generation of Intel processors. However, it will be
the workhorse until the 45-nm, eight-core Nehalem arrives
next year.
The devices in the 7400 series use Intel’s 45-nm, hafnium-based
high-k metal gate technology. The chip contains
1.9 billion transistors, including a shared 16-Mbyte L3 cache.
It’s compatible with existing sockets that can handle earlier
quad-core Xeon chips, so plenty of motherboards out there can
corral this workhorse.
It can be power efficient, too. The low-end six-core model sips a
cool 65 W even with a 1066-MHz frontside
bus. The chip is designed to be used
in systems with up to 16 CPU sockets for
a total of 96 cores.
The 7400 series employs the latest
virtualization technology, since these
chips are destined for server farms that
are running lots of virtualized clients. It
supports Intel’s FlexMigration technology,
which lets virtual machines move between
servers built on different Xeon generations.
That eases the use of older client images
as well as movement to Nehalem in the future.
TESLA C1060: GPU DOES MORE THAN GRAPHICS
Nvidia’s (www.nvidia.com) Tesla C1060
contains a GPU with 240 processing
cores (Fig. 2). The GPU employs a
single-instruction, multiple-thread (SIMT) architecture
(see “SIMT Architecture Delivers
Double-Precision TeraFLOPS” at
www.electronicdesign.com, ED Online 19280)
that’s equally useful in graphics applications
and streaming computation on large
amounts of data.
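To make the SIMT idea concrete, here is a minimal sketch in plain Python. It is a serial emulation, not GPU code: one small function is written from the point of view of a single thread, and a stand-in launcher runs it once per thread index. The names saxpy_kernel and launch are hypothetical and chosen only for illustration; on real hardware the threads would run in parallel across the GPU's cores.

```python
def saxpy_kernel(thread_id, a, x, y, out):
    """Work done by one logical thread: exactly one element of the result."""
    # Bounds guard: threads beyond the end of the data simply do nothing.
    if thread_id < len(x):
        out[thread_id] = a * x[thread_id] + y[thread_id]


def launch(kernel, num_threads, *args):
    """Serial stand-in for a GPU launch (hypothetical helper).

    Every thread runs the same kernel; only thread_id differs.
    """
    for thread_id in range(num_threads):
        kernel(thread_id, *args)


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)

# Launch more threads than elements to show the bounds guard at work.
launch(saxpy_kernel, 8, 2.0, x, y, out)
print(out)
```

The same per-element pattern suits both pixel shading and streaming numerical work, which is why one architecture can serve both.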
The Tesla C1060 is an impressive computing
platform. But when combined with
the CUDA (Compute Unified Device
Architecture) development environment,
it becomes a best-of-class system.
CUDA lets programmers use an
extended version of C to develop
applications that run on Nvidia's latest
GPU platforms, including the popular
GeForce line. Some applications will run slightly
faster while others may improve by two orders of magnitude.
It all depends on how much the application can take
advantage of the SIMT architecture. CUDA also supports multi-GPU
environments like the Tesla S1070, whose four C1060-class
boards together contain 960 cores.
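One common way to reason about the gap between "slightly faster" and "two orders of magnitude" is Amdahl's law, which the discussion above does not invoke directly. The sketch below is a back-of-the-envelope estimate only. The parallel fractions are made-up illustrative values, and the model assumes the parallel part scales perfectly across every core, which real applications rarely achieve.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup when only parallel_fraction of the run time scales with cores."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


C1060_CORES = 240                # core count quoted for the Tesla C1060
S1070_CORES = 4 * C1060_CORES    # four C1060-class boards in a Tesla S1070

# Illustrative (assumed) fractions of an application that can run in parallel.
for fraction in (0.20, 0.90, 0.999):
    speedup = amdahl_speedup(fraction, C1060_CORES)
    print(f"{fraction:.1%} parallel on {C1060_CORES} cores: {speedup:.1f}x")
```

Under these assumptions, an application that is only 20% parallel barely improves, while one that is 99.9% parallel approaches a 200x gain on a single C1060.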
ATI RADEON HD 4870 X2: TWICE THE GRAPHICS
The RV770 graphics processing unit (GPU) shows up twice
in AMD’s (www.amd.com) ATI Radeon HD 4870 X2 board
(Fig. 3). Using a pair of GPUs isn’t new. In fact, AMD’s ATI
CrossFire technology has been used regularly to link a pair of
boards to double graphics performance (see “ATI X1950XTX,”
ED Online 14198). But the Radeon HD 4870 X2 does it with
just one board.
Putting two GPUs on the same board
boosts performance more than linking
a pair of single-GPU boards because of the tighter
integration. A pair of X2 boards can bring
even more processing power to bear.
AMD has opened its GPU to programmers
as well, which makes it possible
to use the extra cores for chores other than
graphics rendering. And there are plenty
of applications in gaming where the HD
4870 X2 excels.