Multicore chip designs, large symmetric multiprocessing (SMP)
systems, and clustering can bring many processors to bear on an
application. But without proper software, they're simply large
collections of processor cores and memory. And conventional serial
programming languages don't make handling an expansive suite of
computing elements any easier.
The current approach for tackling transaction-oriented Web traffic
is to distribute a large number of conventional sequential
applications across multiple cores. Azul does this for Java
applications with its 48-core Vega 2 chip (see "Multicore My Way").
This approach works well for clustering existing applications that
each use only a small fraction of the available processors, but it
doesn't work as well when trying to scale a single application that
has minimal multithreading.
The difficulty is that conventional programming languages assume a
sequential programming model. This wasn't a problem when most
systems employed a single processor. Though it's an advantage in a
complex environment, multithreading normally is used sparingly so it
doesn't overload the processor or operating system.
Operating-system support for lightweight threads has increased the
use of threads, but applications with more than a dozen threads are
unusual.
One alternative is to use a system like the Message Passing
Interface (MPI) currently used by applications running on
supercomputers and large clusters. MPI still requires the programmer
to define the parallel tasks explicitly, yet it lets those tasks
communicate easily with each other. The approach works, but it
doesn't scale well unless the applications are carefully designed
and deployed.
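
The explicit style described above can be sketched without MPI itself. The following is a minimal illustration, not MPI code: it uses Python threads with queue.Queue objects standing in for message channels, and the names worker and message_passing_sum are hypothetical. In a real MPI program the workers would typically be separate processes exchanging data through send and receive calls.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """Receive one chunk of numbers, then send back (rank, partial sum)."""
    chunk = inbox.get()              # blocking receive
    outbox.put((rank, sum(chunk)))   # send the result to the coordinator


def message_passing_sum(data, n_workers=4):
    """Sum data by explicitly distributing chunks to workers via messages."""
    inboxes = [queue.Queue() for _ in range(n_workers)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(n_workers)
    ]
    for t in threads:
        t.start()
    # The programmer decides how the work is split: worker r gets every
    # n_workers-th element starting at index r.
    for rank in range(n_workers):
        inboxes[rank].put(data[rank::n_workers])
    # Collect exactly one reply per worker, in whatever order they arrive.
    total = sum(results.get()[1] for _ in range(n_workers))
    for t in threads:
        t.join()
    return total
```

Note that the decomposition (who gets which chunk) and the collection step are both written by hand, which is the design burden the text refers to.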
Another alternative is a specialized runtime system like Intel's
Threading Building Blocks (see "Multiple Threads Make Chunk
Change"). It targets applications that manipulate large chunks of
data such as arrays. In this case, the number of threads typically
matches the number of cores available, with each thread pulling from
a common work queue. This keeps every core busy, but it addresses
only a limited area of parallel programming.
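
The common work queue pattern described above can be sketched in a few lines. This is an illustration of the pattern only, not the Threading Building Blocks API; the function name run_work_queue and its parameters are assumptions made for the example. Also note that in standard CPython builds the global interpreter lock keeps these threads from executing Python bytecode simultaneously, so the sketch shows the scheduling idea rather than a real speedup.

```python
import os
import queue
import threading


def run_work_queue(chunks, func, n_workers=None):
    """Apply func to each chunk using one worker thread per core."""
    chunks = list(chunks)
    n_workers = n_workers or os.cpu_count() or 1
    tasks = queue.Queue()
    for index, chunk in enumerate(chunks):
        tasks.put((index, chunk))
    results = [None] * len(chunks)

    def worker():
        # Each worker keeps pulling from the shared queue until it is empty.
        while True:
            try:
                index, chunk = tasks.get_nowait()
            except queue.Empty:
                return
            results[index] = func(chunk)  # each slot is written exactly once

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

A caller might split an array into fixed-size slices and pass a function such as sum; faster workers simply take more chunks, which is what keeps all of them busy.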
LANGUAGES AND LANGUAGE EXTENSIONS
Parallel programming languages turn conventional sequential
programming semantics on their head so the default is parallel
execution rather than serial. The compiler and runtime environment
must optimize the parallel execution, relieving the programmer of
this chore. Sequential execution must be specified explicitly since
it is now the exception rather than the rule.
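
To make the inversion concrete, here is a small sketch of a loop helper whose default is parallel execution and whose sequential form must be requested. It is a library-level imitation in Python, not the semantics of any particular parallel language; for_each and its sequential flag are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def for_each(items, body, sequential=False):
    """Apply body to every item; parallel unless sequential=True is given."""
    items = list(items)
    if sequential:
        # The exception: the caller explicitly asked for in-order execution.
        return [body(item) for item in items]
    # The rule: iterations may run concurrently; the pool picks the schedule.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(body, items))  # results stay in input order
```

In the default mode the caller gives up any guarantee about the order in which iterations execute, so a loop body with order-dependent side effects needs the sequential flag.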
Parallel programming languages are still largely on the research
side of development. Sun's Fortress addresses a range of
applications and programming issues, but it assumes a system with
many cores and good communication or shared memory between cores.
Threads are synchronized with atomic blocks of code.
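
The idea of synchronizing threads with an atomic block can be approximated as follows. This is Python, not Fortress, and it uses an ordinary lock as a stand-in; Fortress's atomic construct may be implemented quite differently. The names Counter and count_in_parallel are hypothetical.

```python
import threading


class Counter:
    """A shared counter whose update runs as an indivisible step."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The with-block plays the role of an atomic block: no other thread
        # can observe or interleave with the read-modify-write inside it.
        with self._lock:
            self.value += 1


def count_in_parallel(n_threads=8, per_thread=10000):
    """Have several threads increment one shared counter."""
    counter = Counter()

    def work():
        for _ in range(per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

Because every update happens inside the protected block, the final count always equals n_threads times per_thread regardless of how the threads are scheduled.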
Some approaches extend an existing language like C. Cilk, a
multithreaded parallel programming language based on C, retains C's
serial semantics while adding a handful of keywords and parallel
semantics. It also supports speculative parallelism.
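
Cilk's keywords include spawn and sync, and the fork-join style they express can be imitated as below. This is a Python approximation, not Cilk code: it starts a thread where Cilk would spawn and joins it where Cilk would sync, whereas a real Cilk runtime schedules spawned work with a work-stealing scheduler rather than one thread per spawn. The function name parallel_sum and the cutoff parameter are assumptions for the example.

```python
import threading


def parallel_sum(data, lo=0, hi=None, cutoff=1000):
    """Divide-and-conquer sum of data[lo:hi] in a spawn/sync style."""
    if hi is None:
        hi = len(data)
    if hi - lo <= max(cutoff, 1):
        return sum(data[lo:hi])      # small enough: plain serial code
    mid = (lo + hi) // 2
    left = {}

    def spawned():
        left["value"] = parallel_sum(data, lo, mid, cutoff)

    t = threading.Thread(target=spawned)
    t.start()                        # "spawn": left half may run in parallel
    right = parallel_sum(data, mid, hi, cutoff)
    t.join()                         # "sync": wait for the spawned half
    return left["value"] + right
```

Replacing the thread start and join with a direct call to spawned() leaves an ordinary recursive function that returns the same result, which mirrors the way Cilk keeps the serial meaning of the underlying C program.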
Parallel programming languages face a number of challenges, from
technical issues and optimizations to social issues. Most
programmers rarely switch programming languages, and learning a new
programming paradigm typically requires a significant investment in
time and effort.
Parallel programming languages work well in many environments, but
they may not be as useful in heterogeneous configurations of
cooperative cores. These application-specific core combinations are
growing more common as cores become more numerous. In this case, the
interconnection of systems is a major part of a designer's job.
Taking advantage of parallel hardware environments will require a
major shift in programming. But software will continue playing
catch-up with hardware.
MPI Forum • www-unix.mcs.anl.gov/mpi
Sun Microsystems • www.sun.com
The Cilk Project • supertech.lcs.mit.edu/cilk