[Technology Report]
The Multicore Era Seeks A Parallel Paradigm
Scalability, simpler debugging, and easier coding are essential to developing a successful parallel-programming approach.
Putting an additional level between the programmer and the base system sometimes can help, too. This is the case with Microsoft’s PLINQ (Parallel Language Integrated Query) technology, which is an extension of LINQ. PLINQ and LINQ are designed to simplify access to data sources such as SQL servers.
What sets LINQ and PLINQ apart from SQL or other interfaces like XPath and XQuery is that they provide a data-source-agnostic, type-safe query language that’s embedded in a number of Microsoft’s .NET-based languages (such as C#). Since database use is ubiquitous in many applications, executing queries in parallel can significantly boost overall performance.
Again, finding parallelism is a cooperative process with programmers needing to know what functions to utilize. The advantage for programmers is that they only need to learn a single query language regardless of the data source. PLINQ was designed to maintain the programming model provided by LINQ while offering additional parallel functionality.
Integrating LINQ/PLINQ functionality within the compiler has advantages in the sense that syntactic changes are easier. It wreaks havoc on portability, though, limiting the solution to Microsoft platforms. New approaches like this also mean fighting conventions like SQL with new syntactic ordering such as:
var q = from x in Y where p(x) orderby x.f1 select x.f2;
As with most programming syntax, one person’s sugar is another’s salt. Still, being able to completely embed the solution within a programming language can simplify a programmer’s job of learning a system, and parallel constructs won’t be utilized if they’re hard to use or remember.
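To make the clause ordering of the one-line query concrete, here is a minimal C# sketch. The Item record, its F1 and F2 members, and the predicate P are hypothetical stand-ins for the article's Y, f1, f2, and p(x); only the query syntax and AsParallel() come from LINQ and PLINQ themselves.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type standing in for the article's x with fields f1 and f2.
public record Item(int F1, string F2);

public static class QueryShapes
{
    // Hypothetical predicate standing in for p(x): keep items with an even F1.
    static bool P(Item x) => x.F1 % 2 == 0;

    // Sequential LINQ, with the same from/where/orderby/select order as the article's query.
    public static List<string> Sequential(IEnumerable<Item> y) =>
        (from x in y where P(x) orderby x.F1 select x.F2).ToList();

    // PLINQ version: the only change is AsParallel() on the source.
    // The orderby clause still determines the order of the results.
    public static List<string> InParallel(IEnumerable<Item> y) =>
        (from x in y.AsParallel() where P(x) orderby x.F1 select x.F2).ToList();
}
```

Both methods should return the same list for the same input, which illustrates the point that PLINQ keeps the LINQ programming model while the programmer opts in to parallel execution.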
Of course, playing with syntax and semantics does allow compiler and systems designers to add features that would otherwise be hard to incorporate while staying strictly within the bounds of a current programming language definition. For example, PLINQ inherits LINQ’s lazy (deferred) evaluation, which makes constructs such as infinite streams possible.
Using a stream within a query lets the system access only those items needed to complete the current transaction. A simple example would be a stream query whose results are returned one at a time. If the stream has already supplied the data when a result is requested, the application continues. Otherwise, the application waits while the next stream element is calculated.
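A rough illustration of such a stream in C# follows. It uses an ordinary sequential LINQ iterator rather than any PLINQ-specific feature, and the names LazyStreams, Squares, and FirstEvenSquares are invented for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyStreams
{
    // An unbounded stream of squares. Each element is computed only when
    // the consumer asks for the next one, so the loop never runs ahead.
    public static IEnumerable<long> Squares()
    {
        for (long n = 1; ; n++)
        {
            yield return n * n;
        }
    }

    // A query over the infinite stream. Take(count) stops pulling from the
    // stream once enough results have been produced, so the call terminates.
    public static List<long> FirstEvenSquares(int count) =>
        Squares().Where(s => s % 2 == 0).Take(count).ToList();
}
```

Calling LazyStreams.FirstEvenSquares(3) returns 4, 16, and 36. The query finishes even though Squares() never ends, because elements are requested one at a time and only as far as the result requires.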
PLINQ provides a range of parallel-processing enhancements, such as the ability to run multiple threads on a partitioned data space as well as pipelining requests. Of course, each enhancement has its own issues, such as whether physical or temporal locality of data is critical to the application or the operation being performed.
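The sketch below shows roughly how partitioning and pipelining are requested in PLINQ. The Work function and the class and method names are hypothetical; ParallelEnumerable.Range, WithDegreeOfParallelism, and WithMergeOptions are standard PLINQ calls.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PartitionedQuery
{
    // Hypothetical CPU-bound work applied to each element.
    static long Work(int n)
    {
        long acc = 0;
        for (int i = 0; i < 1000; i++) acc += ((long)n * i) % 7;
        return acc;
    }

    // Sequential baseline for comparison.
    public static long SumSequential(int count) =>
        Enumerable.Range(0, count).Select(n => Work(n)).Sum();

    // PLINQ partitions the range across at most `workers` threads
    // and combines the partial sums.
    public static long SumParallel(int count, int workers) =>
        ParallelEnumerable.Range(0, count)
            .WithDegreeOfParallelism(workers)
            .Select(n => Work(n))
            .Sum();

    // Pipelined variant: results are handed to the consumer as they are
    // produced instead of being buffered first. Result order is not guaranteed.
    public static IEnumerable<long> Streamed(int count) =>
        ParallelEnumerable.Range(0, count)
            .WithMergeOptions(ParallelMergeOptions.NotBuffered)
            .Select(n => Work(n));
}
```

Whether the parallel versions are actually faster depends on the cost of Work, the number of cores, and the partitioning overhead, which is exactly the cost question raised here.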
Likewise, partitioning queries can have a major impact on the resulting performance and efficiency (Fig. 2). As the number of cores, threads, and communication methods increases, so does the number of options. And regardless of whether you’re using TBB, CCR, or something else, it’s difficult to get the costs right.
The number of cores in a system may be large, but runaway computation can waste such a resource. This may not even be apparent from a user’s perspective, since a result may be delivered in a timely fashion. But developers will need more insight, including more time-oriented diagnostics.
LET THE LANGUAGE DO IT
Mainstream languages like Basic, C, C++, C#, and Java include multithreading support. However, all thread and data management is explicit. These languages form the basis for the parallel runtimes, but runtime designers often perform some interesting feats that most programmers would rather forget or never have to learn about.
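To give a sense of what explicit thread and data management means in practice, here is a minimal C# sketch that sums an array using hand-managed threads. The class name, the chunking scheme, and the lock object are choices made for this illustration, not a recommended design.

```csharp
using System;
using System.Threading;

public static class ExplicitThreads
{
    // Sums `data` using `workers` threads that the programmer creates,
    // partitions work for, synchronizes, and joins by hand.
    public static long Sum(int[] data, int workers)
    {
        long total = 0;
        object gate = new object();              // protects `total`
        var threads = new Thread[workers];
        int chunk = (data.Length + workers - 1) / workers;

        for (int w = 0; w < workers; w++)
        {
            // Per-thread copies of the bounds, so each closure sees its own slice.
            int start = w * chunk;
            int end = Math.Min(start + chunk, data.Length);

            threads[w] = new Thread(() =>
            {
                long local = 0;
                for (int i = start; i < end; i++) local += data[i];
                lock (gate) { total += local; }  // explicit synchronization
            });
            threads[w].Start();
        }

        foreach (var t in threads) t.Join();     // explicit wait for completion
        return total;
    }
}
```

Partitioning, synchronization, and joining are all the programmer's responsibility here, which is the bookkeeping that parallel runtimes aim to hide.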
Research projects like Unified Parallel C add to the syntax and semantics of an existing language. Still, programmers are loath to incorporate such changes unless they see widespread adoption or unless a particular platform they must use supports the tools.
Another issue is the existing infrastructure and semantics for most of the mainstream languages. For example, shared memory is the norm, and the pointers and references that depend on it are central to languages like C or Java. Yet shared memory is a concept that doesn’t scale well.
Several different approaches, such as using futures for lazy function evaluation, are similar to the PLINQ infinite stream example noted earlier. Lazy evaluation is commonly used in functional programming languages like Miranda and Haskell, though these languages definitely aren’t mainstream.
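For readers more at home in C# than Haskell, the two ideas can be approximated with standard .NET types: Task<T> behaves like a future, and Lazy<T> defers a computation until its value is first requested. The Square function and the class and method names below are hypothetical, chosen only for this sketch.

```csharp
using System;
using System.Threading.Tasks;

public static class Futures
{
    // Hypothetical stand-in for an expensive function.
    static int Square(int n) => n * n;

    // Futures: both computations are started right away and may run
    // concurrently; reading Result blocks until each value is ready.
    public static int SumOfSquares(int a, int b)
    {
        Task<int> fa = Task.Run(() => Square(a));
        Task<int> fb = Task.Run(() => Square(b));
        return fa.Result + fb.Result;
    }

    // Lazy evaluation: nothing is computed until Value is first read,
    // and nothing is computed at all if Value is never read.
    public static Lazy<int> Deferred(int n) => new Lazy<int>(() => Square(n));
}
```

The difference in timing is the design point: a future starts work early so the result may already be waiting, while a lazy value postpones work in case it is never needed.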