Electronic Design

  


[Technology Report]
The Multicore Era Seeks A Parallel Paradigm
Scalability, simpler debugging, and easier coding are essential to developing a successful parallel-programming approach.

William Wong  |   ED Online ID #18159  |   February 28, 2008


Parallel programming is hard. But debugging it is even harder. Unfortunately, taking advantage of multicore solutions like Intel’s 80-core TeraScale prototype will require some type of parallel-programming technique (Fig. 1).

The first challenge is to find parallelism that can be exploited. The next is choosing a tool to exploit that parallelism. Another goal is bug-free code, yet parallel programming opens the door to a range of more complex bugs, and timing becomes even more critical. Finally, there’s the issue of targeting the host platform with these tools.

At this point, generic solutions don’t exist because of the range of multicore hardware. Tools primarily target only one class of hardware or even one vendor’s hardware. Programmers typically push these jobs off to the operating system or runtime. Eventually, though, parallel-programming constructs will make it into mainstream programming languages. Either way, developers will need multicore solutions to gain performance improvements, since single-core scaling is no longer an option for pushing the limits.

LET THE OPERATING SYSTEM DO IT
Pushing the job of managing coarse-grain parallelism onto the operating system is a common approach and easy to do. It works well if there’s a large number of programs, or if those programs already take advantage of multiple cores. It requires no modification of the applications, but it’s of less value if there aren’t enough programs to exploit the hardware.

Server environments typically have program loads large enough to use the target hardware. Likewise, embedded application designers can latch onto virtual-machine (VM) products like Trango’s Hypervisor, Green Hills Software’s Integrity, VMware’s namesake, and KVM or Xen on Linux to manage multicore solutions. These tools allow for better management and debugging of programs and systems in addition to providing features like load leveling.

VM architectures potentially open up other avenues for programmers. Thin operating systems or programs running alone in a VM may be given access to features previously restricted to the operating system, such as virtual memory management and peripheral access.

Virtual memory management could enable programmers to manage memory and interprocess and intra-application communication more effectively. For multicore utilization, communication is key to good use of the system. The big question is whether programming languages or runtimes will take this approach.

LET THE RUNTIME DO IT
After VMs, runtimes are the most common method for exploiting multicore environments. Platforms like Intel’s Threading Building Blocks (TBB) require developers to explicitly use exposed function calls to utilize the runtime.

This approach forces developers to determine the type and utilization of parallelism in an application and meld it with the runtime. In turn, the runtime will also need to manage parallelism. The functional interface can help narrow the scope for finding parallelism, though it may put the onus on the programmer to use the right function.

Usually, the interface to the runtime is implemented strictly through function or class definitions, though customizing a compiler offers advantages as well. TBB employs a typical interface, much like the following definition for the parallel_do function:

template<typename InputIterator, typename Body> void parallel_do( InputIterator first, InputIterator last, Body body );

In general, parallel processing deals with data or control parallelism. The above definition takes advantage of TBB’s C++ support and C++ templates. Specifically, TBB addresses data parallelism over large data sets, such as matrices or streams of data.
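To make the shape of such an interface concrete, the following is a minimal sketch of a data-parallel call with a similar signature, written against the standard C++ thread library rather than TBB. The names parallel_do_sketch and square_all, the num_threads parameter, and the fixed chunking strategy are all assumptions made for illustration; TBB’s own scheduling is more sophisticated than this.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <thread>
#include <vector>

// Hypothetical helper, NOT part of TBB: applies body(*it) to every element
// in [first, last) by splitting the range into contiguous chunks and running
// one std::thread per chunk. Assumes random-access iterators and a body that
// is safe to call concurrently on distinct elements.
template <typename RandomIt, typename Body>
void parallel_do_sketch(RandomIt first, RandomIt last, Body body,
                        unsigned num_threads = 4) {
    using Diff = typename std::iterator_traits<RandomIt>::difference_type;
    assert(num_threads > 0);

    const Diff total = std::distance(first, last);
    if (total <= 0) return;

    // Never start more threads than there are elements.
    const Diff threads = std::min<Diff>(num_threads, total);
    const Diff chunk = (total + threads - 1) / threads;  // ceiling division

    std::vector<std::thread> workers;
    for (Diff begin = 0; begin < total; begin += chunk) {
        RandomIt chunk_first = first + begin;
        RandomIt chunk_last = first + std::min<Diff>(begin + chunk, total);
        // Each worker gets its own copy of the body and its own sub-range.
        workers.emplace_back([chunk_first, chunk_last, body]() mutable {
            for (RandomIt it = chunk_first; it != chunk_last; ++it) body(*it);
        });
    }
    for (std::thread& w : workers) w.join();  // wait for every chunk
}

// Usage: square every element of a vector in place, in parallel.
std::vector<int> square_all(std::vector<int> values) {
    parallel_do_sketch(values.begin(), values.end(), [](int& v) { v *= v; });
    return values;
}
```

The caller still has to decide that the loop is safe to run in parallel; the function only hides how the work is divided among threads.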

Microsoft’s Concurrency and Coordination Runtime (CCR) (see “Software Frameworks Tackle Load Distribution” at www.electronicdesign.com, ED Online 18813), which was released with Microsoft’s Robotics Studio (see “MS Robotics Studio,” ED Online 16631), also uses a functional interface but addresses control parallelism. In this case, CCR helps optimize asynchronous communication between threads that may be distributed among multicore platforms or even across networks.
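As a rough illustration of asynchronous communication between threads, the sketch below passes messages through a small thread-safe queue in standard C++. It is not the CCR API, which targets .NET; the Port class, its post and receive methods, and the sum_via_port helper with its zero sentinel are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical message port, NOT the CCR API: one thread posts messages,
// another receives them. The sender never waits for the receiver.
template <typename T>
class Port {
public:
    // Enqueue a message and wake one waiting receiver.
    void post(T message) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(message));
        }
        ready_.notify_one();
    }

    // Block until a message is available, then return it.
    T receive() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !queue_.empty(); });
        T message = std::move(queue_.front());
        queue_.pop();
        return message;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> queue_;
};

// Usage: a producer thread posts 1..count followed by a 0 sentinel
// (an assumption of this example), while the caller sums what arrives.
int sum_via_port(int count) {
    assert(count >= 0);
    Port<int> port;
    std::thread producer([&port, count] {
        for (int i = 1; i <= count; ++i) port.post(i);
        port.post(0);  // sentinel: no more messages
    });

    int sum = 0;
    for (int m = port.receive(); m != 0; m = port.receive()) sum += m;
    producer.join();
    return sum;
}
```

The two threads share no state other than the port, which is the kind of isolation that makes control-parallel code easier to reason about.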

As with any runtime, programmers must account for a mindset and an underlying architecture. They work with it all the time, since applications rarely are completely standalone or written solely by a single programmer. Consequently, there’s at least some level of black-box isolation within an application. On the other hand, complex frameworks like TBB or CCR require a good understanding of the underlying architecture.
