Programming To Survive Multicore: Race Conditions

Date Posted: February 28, 2008 12:00 AM

Interestingly, the right way to do this is to not depend so heavily on a single global sum. Instead, use the implicit synchronization of a “reduce” operation. We can do this without locks and without atomic operations, or at least without explicit ones in our code: any locks or synchronization are hidden from us inside TBB’s reduce operation. The most elegant and efficient parallel code uses the parallel_reduce operation:

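#include "tbb/blocked_range.h"
#include "tbb/parallel_reduce.h"
using namespace tbb;

long d[10000];

struct Body {
    long m_sum, m_sumsq;
    Body() : m_sum(0), m_sumsq(0) {}
    // Splitting constructor required by parallel_reduce; a split copy starts from zero.
    Body(Body &, split) : m_sum(0), m_sumsq(0) {}
    // Accumulate the sub-range r into this body's private sums.
    void operator()(const blocked_range<int> &r) {
        for (int i = r.begin(); i != r.end(); ++i) {
            m_sum += d[i];
            m_sumsq += d[i] * d[i];
        }
    }
    // Merge the partial sums computed by another body.
    void join(Body &rhs) { m_sum += rhs.m_sum; m_sumsq += rhs.m_sumsq; }
};

fill_data(&d);
Body body;
parallel_reduce(blocked_range<int>(0, 10000), body, auto_partitioner());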
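
To make the "no explicit locks" point concrete, here is a minimal sketch of the split, accumulate, and join pattern using only the C++ standard library. It is not how TBB implements parallel_reduce (TBB splits ranges recursively and schedules them with work stealing); the names Partial and reduce_sum_sumsq, and the fixed slicing scheme, are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Private accumulator, playing the role of the article's Body (hypothetical name).
struct Partial {
    long sum = 0;
    long sumsq = 0;
    // Accumulate the slice [begin, end); touches only this object's members.
    void accumulate(const std::vector<long> &data, std::size_t begin, std::size_t end) {
        for (std::size_t i = begin; i != end; ++i) {
            sum += data[i];
            sumsq += data[i] * data[i];
        }
    }
    // Merge another partial result, analogous to Body::join.
    void join(const Partial &rhs) { sum += rhs.sum; sumsq += rhs.sumsq; }
};

// Hypothetical helper: one slice and one private Partial per thread, joined at the end.
Partial reduce_sum_sumsq(const std::vector<long> &data, unsigned num_threads) {
    assert(num_threads > 0);
    std::vector<Partial> partials(num_threads);
    std::vector<std::thread> workers;
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(data.size(), t * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        // Each worker writes only partials[t]; the input is read-only.
        workers.emplace_back([&data, &partials, t, begin, end] {
            partials[t].accumulate(data, begin, end);
        });
    }
    // Joining the threads is the only synchronization point.
    for (std::thread &w : workers) w.join();
    Partial total;
    for (const Partial &p : partials) total.join(p);
    return total;
}
```

Because every worker updates only its own Partial, the code needs no mutex and no atomic variable; the results are combined on one thread after all workers have finished. A production version would likely pad or localize the accumulators to avoid false sharing between adjacent elements of partials.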

CAN WE ESCAPE DATA RACES SOMEHOW?
In practice, the best advice today is to use abstractions such as TBB, OpenMP, or threaded libraries, as they tend to reduce the likelihood of subtle data race conditions. But they don’t eliminate the possibility. Data races are still easy to create, depending on the constructs you use.

General-purpose programming can lead to data races, but more specialized coding, such as a purely data-parallel construct, cannot. The use of parallel_reduce() in TBB, for instance, steers us away from the sort of coding that would cause a data race. That’s because data races come from sharing mutable data between threads, and purely data-parallel code does not need to share it. Relying on such constructs shrinks the amount of code that can contain a data race, and that may become a programming technique in its own right in the future.
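
As a rough illustration of why a purely data-parallel construct leaves no room for a race, the sketch below squares every element of an array with several threads. The function name parallel_square and the slicing scheme are hypothetical, and plain std::thread stands in for whatever construct a library such as TBB would provide.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: out[i] = in[i] * in[i], computed by num_threads workers.
// The input is shared but read-only; each worker writes only its own slice of out,
// so no element is ever written by two threads.
std::vector<long> parallel_square(const std::vector<long> &in, unsigned num_threads) {
    assert(num_threads > 0);
    std::vector<long> out(in.size());
    std::vector<std::thread> workers;
    const std::size_t chunk = (in.size() + num_threads - 1) / num_threads;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(in.size(), t * chunk);
        const std::size_t end = std::min(in.size(), begin + chunk);
        workers.emplace_back([&in, &out, begin, end] {
            for (std::size_t i = begin; i != end; ++i) out[i] = in[i] * in[i];
        });
    }
    // Wait for every slice to be finished before handing the result back.
    for (std::thread &w : workers) w.join();
    return out;
}
```

The slices are disjoint by construction, so the only coordination required is waiting for the workers to finish. The moment a worker also updated a shared counter or a global sum, the code would stop being purely data-parallel and the race hazard would return.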

Looking down the road, transactional memory is attracting a lot of interest these days. It’s a research topic with the alluring goal of doing for shared data in memory what transactions have done for databases in terms of ease of programming and determinism. Many issues still keep transactional memory from being a tamed beast, but it is very interesting to study.

LEARNING MORE: DOWNLOADS AVAILABLE
For writing production code now, TBB can be downloaded for free from http://threadingbuildingblocks.org. The Intel Thread Checker, which can be evaluated for free at www.intel.com/software/products/eval, is a dynamic tool that doesn’t rely on compile-time instrumentation. Because it observes the program as it actually runs, it produces fewer false positives than static checkers. It can also debug programs that use pre-compiled libraries.

However, the Intel Thread Checker can detect issues only in code that’s exercised while running under the tool. Also, the time to test grows with the program’s runtime. In practice, dynamic methods seem easier to use and more useful than static checking, which produces too many false error messages and cannot operate on precompiled code.

Designers can also take advantage of two notable implementations of software transactional memory: one in the functional language Haskell (www.haskell.com) and one in a C++ compiler from Intel at http://Whatif.intel.com.

IN CONCLUSION
Non-deterministic programs aren’t fun to debug, and they’re too easy to write. But all is not lost. With proper diligence, and the proper tools for when that diligence falls short, we can survive. And we can always keep innovating to help us avoid the traps that cause these problems.
