As CPUs and graphics processors (GPUs) evolve, many of their design features
are beginning to look remarkably similar. As a result, many of today's most
common workloads will soon have a choice about where to execute. All the
major hardware providers have told users to expect processors that feature increasingly non-uniform and
complex memory hierarchies, rapidly increasing core (and thread)
counts, and the integration of specialized acceleration units.
These new processor designs won't be friendly to legacy code bases optimized for single-threaded, uniform memory systems, or, for that matter, to programmers without the time or expertise to create
tuned, processor-specific code. If we want to fully utilize these new hardware designs, something
needs to change about the way we write software.
For the last 25 years, developers have been used to
programming traditional CPUs: single-core processors with integrated floating-point units, shared memory, and a
large uniform cache. As a result, software environments
(compilers, debuggers, application platforms, and libraries)
have been created to support programming and running applications on these types of CPUs.
CHANGES AHEAD
Several different trends are now converging to render the existing software infrastructure obsolete.
Luckily, cutting-edge software coming to market will let developers harness the power of these next-generation processors, without requiring a radical change in their working habits.
While engineers are already struggling to meet the software
demands of quad-core processors, the spectre of massively multicore designs looms. At the Intel Developer Forum last fall, the
chip giant unveiled a prototype of "Polaris," an 80-core processor with programmer-managed distributed memories
and non-uniform caches.
Add to this the increasingly tight integration of GPUs and CPUs,
demonstrated in AMD's Fusion project, as well as the growing
movement to leverage GPUs as math coprocessors, and we can
see that the obstacles processor design places in front of
software engineers will only multiply.
In addition, applications designed for today's traditional single-threaded CPUs could become obsolete if they can't scale to
increasingly sophisticated architectures, wasting the resources organizations have invested in them. Today's software simply isn't ready for where AMD,
IBM, and Intel are taking us.
Encouragingly, one of the inherent
differences between traditional and
multicore processors, the parallel architecture of multicores, is inspiring new
software approaches that enable engineers not only to take advantage of the
additional power offered by
more cores per processor, but also to
create applications that can scale to hundreds
or even thousands of cores.
Stream programming is a data-parallel programming method compatible with distributed,
explicitly managed memory. It offers far better productivity, performance, and efficiency
than serial programming models, which weren't
designed to cope with the massive parallelism
seen in these new processors.
Using a stream programming approach, developers with
traditional skills can quickly build applications using
existing tools such as gcc, gdb, and Intel compilers, drawing on C,
C++, and even MATLAB conventions and skills.
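To make the idea concrete, here is a minimal sketch of the stream style, written in Python for brevity. It is an illustration under stated assumptions, not the API of any particular stream programming product: the names `saxpy_kernel`, `split_stream`, and `run_kernel` are hypothetical. The pattern it shows is the general one: a small kernel computes one output element from the matching input elements, the input streams are cut into chunks, and each chunk is handed to a worker that touches only its own data. Python threads will not deliver a real multicore speedup here; the point is the structure, which is what lets a runtime spread the same program over more cores.

```python
from concurrent.futures import ThreadPoolExecutor


def saxpy_kernel(a, x, y):
    """Kernel: computes one output element from one element of each input stream."""
    return a * x + y


def split_stream(stream, parts):
    """Partition a stream into at most `parts` contiguous chunks, one per worker."""
    size = max(1, -(-len(stream) // parts))  # ceiling division, never zero
    return [stream[i:i + size] for i in range(0, len(stream), size)]


def run_kernel(kernel, a, xs, ys, workers=4):
    """Apply `kernel` across two equal-length streams, chunk by chunk.

    Hypothetical helper for illustration: a real stream runtime would also
    stage each chunk into a core's local memory before running the kernel.
    """
    def process_chunk(chunk_pair):
        cx, cy = chunk_pair
        # Each worker reads and writes only its own chunk; no shared state.
        return [kernel(a, x, y) for x, y in zip(cx, cy)]

    chunks = list(zip(split_stream(xs, workers), split_stream(ys, workers)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input chunks.
        results = list(pool.map(process_chunk, chunks))
    # Stitch the per-chunk outputs back into a single output stream.
    return [value for chunk in results for value in chunk]


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [10.0, 20.0, 30.0, 40.0, 50.0]
out = run_kernel(saxpy_kernel, 2.0, xs, ys, workers=2)
```

Because the kernel has no side effects and no chunk depends on another, changing `workers` changes only how the work is divided, not the result. That independence is the property that allows the same source to scale from a handful of cores to many.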
A NEW TECHNIQUE
With this model, developers can exploit the full potential of industry-standard multicore processors, programming a wide variety of hardware platforms with a
single application programming interface. As these hardware platforms evolve, a developer's application binary will continue to run
on the new platforms, maximizing the return on software
development investment.
While the challenges around multicores and the converging
trends associated with the new architectures are daunting to
engineers, new software technologies such as
stream programming are easing those concerns. This pioneering
software holds the most promise for fully exploiting the power
and performance of these converging designs, allowing engineers and organizations to propel massively multicore processors out of the realm of research and supercomputing and into
general-purpose computing.