According to data from Gary Smith EDA, Mentor’s Catapult C, which synthesizes optimized RTL from ANSI C++, enjoys market-share leadership in algorithmic synthesis. With a growing infrastructure behind it, Catapult C is seeing continual upgrades and improvements. In the 2007b release, Mentor added closed-loop power analysis and optimization capabilities.
By leveraging the original C++ testbench that accompanies the design description, Catapult C uses the design’s switching information to generate highly accurate power estimations at RTL or gate level with any power-estimation tool, such as those from Atrenta, Sequence Design, or Synopsys. Through microarchitecture optimizations, the tool can deliver power savings of up to 30%, the company claims.
With the 2008 release of Catapult C, Mentor raised the bar, adding a number of compelling features. For one, distributed pipeline control can now be implemented, enabling users to configure each block in their designs as an independent streaming engine while retaining classic handshaking between blocks (Fig. 3).
Previous versions of the tool implemented pipelining through a top-level pipelining controller, which mandates flushing the pipeline in the absence of data on the input. Distributed control eliminates this requirement, allowing for autonomously pipelined blocks with extremely high performance.
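The difference between the two control styles can be illustrated with a small cycle-level model in plain C++. This is only a sketch under simplifying assumptions, not Catapult C output or its actual controller: the names Pipeline, step_central, step_distributed, and run are hypothetical, the centralized version is modeled as a single global enable that stalls every stage when no input arrives, and the distributed version assumes a consumer that is always ready.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// One pipeline register: either empty or holding a data word.
using Slot = std::optional<int>;

template <std::size_t N>
struct Pipeline {
    std::array<Slot, N> stage{};

    // Centralized control (simplified): one global enable. Every stage
    // advances only when the input presents data, so words already in
    // flight stay put until more input arrives or the pipe is flushed.
    Slot step_central(Slot in) {
        if (!in) return std::nullopt;          // global stall
        Slot out = stage[N - 1];
        for (std::size_t i = N - 1; i > 0; --i) stage[i] = stage[i - 1];
        stage[0] = in;
        return out;
    }

    // Distributed control (simplified): each stage hands its word to the
    // next stage whenever that stage is empty, a local valid/ready
    // handshake. The consumer is assumed always ready, so the last stage
    // drains every cycle and the input is never refused.
    Slot step_distributed(Slot in) {
        Slot out = stage[N - 1];
        stage[N - 1].reset();
        for (std::size_t i = N - 1; i > 0; --i) {
            if (!stage[i] && stage[i - 1]) {
                stage[i] = stage[i - 1];
                stage[i - 1].reset();
            }
        }
        if (!stage[0] && in) stage[0] = in;
        return out;
    }
};

// Feeds `words` one per cycle, then idles, and collects what comes out.
template <std::size_t N>
std::vector<int> run(const std::vector<int>& words, int cycles, bool distributed) {
    assert(cycles >= 0);
    Pipeline<N> p;
    std::vector<int> out;
    for (int c = 0; c < cycles; ++c) {
        Slot in;
        if (static_cast<std::size_t>(c) < words.size()) in = words[c];
        Slot o = distributed ? p.step_distributed(in) : p.step_central(in);
        if (o) out.push_back(*o);
    }
    return out;
}
```

In this model, feeding two words into a three-stage pipeline and then letting the input go idle leaves both words stranded under the global enable, while the handshaking version drains them without any flush.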
Catapult C reads in pure ANSI C++. But when it comes to building virtual platforms for verification, most designers prefer to work with SystemC. SystemC is the only language that can serve as a wrapper for ANSI C++ even as it facilitates the specification of block-to-block communications and explicit concurrency. It also lets designers specify communication interfaces.
To enable generation of SystemC models, Mentor integrated Catapult C with its Vista Model Builder, which leverages the Open SystemC Initiative’s recently ratified TLM 2.0 standard. Using the ANSI C++ algorithm, the Vista Model Builder creates a compartmentalized SystemC model and automatically generates the model’s transaction-level interface. The resulting model can later be annotated with timing information and, eventually, power-consumption data. Models can be created at a coarse level of detail or at finer levels at the expense of greater simulation time.
Synfora improved its PICO Extreme and PICO Extreme FPGA algorithmic synthesis tools to achieve higher performance and smaller area than the tools’ previous generation. PICO Extreme, an optimizing compiler that transforms a sequential, untimed C algorithm into highly efficient RTL, was enhanced so designers could create and analyze hardware designs more effectively. The enhancements include QoR improvements in area, throughput, timing, and timing correlation, as well as better user feedback.
In the latest version of PICO Extreme, advances in scheduling algorithms enable the compiler to optimize registers in a design. Benchmarks reveal area improvements of 5% to 20%. Sophisticated analysis of variable loop bounds, combined with a new approach to handling early exits from loops, provides performance boosts in the range of 10% to 30% on complex designs.
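To make the loop constructs mentioned above concrete, here is a minimal sketch of the kind of untimed code in question: a loop whose trip count depends on a runtime value and that can exit early. The function name first_above, the constant kMaxSamples, and the fixed-size array interface are assumptions made for illustration only. They are not taken from Synfora's documentation, and no tool-specific pragmas or settings are shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed upper bound on the trip count. A compile-time
// maximum like this is a common way to keep a variable loop bound
// analyzable.
constexpr std::size_t kMaxSamples = 64;

// Returns the index of the first sample greater than `threshold` among
// the first `n` samples, or `n` if there is none.
std::size_t first_above(const std::int16_t samples[kMaxSamples],
                        std::size_t n,
                        std::int16_t threshold) {
    assert(n <= kMaxSamples);
    for (std::size_t i = 0; i < n; ++i) {      // variable loop bound: n
        if (samples[i] > threshold) return i;  // early exit from the loop
    }
    return n;
}
```

A schedule that always budgets for kMaxSamples iterations wastes cycles whenever n is small or the match comes early, which is presumably where analysis of the actual bound and of the exit condition can pay off.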
Achieving high productivity on complex designs requires synthesis tools to provide sophisticated analysis, feedback, and debugging capabilities so that users can better understand the performance and area bottlenecks in the design. The enhancements to PICO Extreme 08.03, including those to the reporting and feedback capabilities, improve the ability to analyze throughput bottlenecks, provide greater visualization and reporting of the hardware cost, and allow automatic detection and feedback on potential deadlock scenarios.
Creating Virtual Platforms
Moving from the algorithmic phase into a virtual platform for simulation often involves hardware acceleration. Vendors such as EVE have struck up partnerships with numerous ESL companies to improve their handling of models at various levels of abstraction.
“For one, Sony asked us to work with them. We ended up developing platforms where we would take charge of RTL blocks in simulating. Others would be simulated in their environment with their tools,” says Lauro Rizzati, general manager of EVE-USA.
EVE’s ZeBu FPGA-based emulation systems require synthesizable RTL, which is compiled and mapped onto the emulators’ multiple FPGAs. But EVE does offer a means of interfacing higher-level models through transactors based on the SCE-MI standard.
Called ZEMI-3, the transaction-level modeling methodology lets designers quickly create custom transactors for protocols that don’t fall neatly within the bounds of standard types. In doing so, ZEMI-3 technology raises the level of abstraction for hardware debugging. Custom transactors compiled with ZEMI-3 work with various transaction-level environments, such as those from CoWare and Synopsys.
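Conceptually, a transactor translates between a single high-level transaction and the cycle-by-cycle pin activity that the emulated RTL expects. The sketch below shows that idea in plain C++ for an invented two-beat write protocol. It does not use the ZEMI-3 or SCE-MI APIs, and every name in it (WriteTxn, PinCycle, drive_write, collect_writes) is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Transaction-level view: one write, described in a single object.
struct WriteTxn {
    std::uint32_t addr;
    std::uint32_t data;
};

// Pin-level view: the values on a made-up bus during one clock cycle.
struct PinCycle {
    bool valid;         // bus carries a meaningful beat this cycle
    bool is_addr;       // true for the address beat, false for the data beat
    std::uint32_t bus;  // shared address/data lines
};

// Driver side: expand one transaction into an address beat, a data
// beat, and one idle cycle.
std::vector<PinCycle> drive_write(const WriteTxn& t) {
    return {
        {true, true, t.addr},
        {true, false, t.data},
        {false, false, 0},
    };
}

// Monitor side: reassemble transactions from a pin-level trace.
std::vector<WriteTxn> collect_writes(const std::vector<PinCycle>& trace) {
    std::vector<WriteTxn> out;
    bool have_addr = false;
    std::uint32_t addr = 0;
    for (const PinCycle& c : trace) {
        if (!c.valid) continue;  // idle cycle, nothing to decode
        if (c.is_addr) {
            addr = c.bus;
            have_addr = true;
        } else {
            // In this invented protocol a data beat must follow an address beat.
            assert(have_addr && "data beat without a preceding address beat");
            out.push_back({addr, c.bus});
            have_addr = false;
        }
    }
    return out;
}
```

The testbench side deals only in WriteTxn objects, while the design under test sees only PinCycle values. Raising the debugging abstraction amounts to observing the former instead of the latter.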
EVE developed the ZEMI-3 technology from the compilation technology it acquired with Tharas Systems. “We optimized their behavioral synthesis for developing transactors,” says Rizzati. “It’s no longer general behavioral synthesis but targets transactors only.” Meanwhile, EVE continues to develop and offer off-the-shelf transactors, such as an AXI Master/Slave transactor and a PCIe Gen 2.0 16x transactor.