Fig. 1

Chips are expected to sport as many as 15 billion gates (BG) by 2020 (Fig. 1). Designs today are already using 2.5 billion gates. Simulation and emulation are critical to getting chip designs working properly before actual hardware is delivered since those chips need to be as bug-free as possible.

Emulation systems built from a massive number of FPGAs have been the norm, but FPGAs aren’t necessarily the most efficient way to replicate the operation of a chip. With that in mind, Mentor Graphics has developed its own chips specifically designed for emulation and other EDA design chores. These have been packaged into the company’s latest Veloce Strato hardware (Fig. 2), which will scale to handle 15 BG before 2020 arrives.

Fig. 2

The Veloce StratoM box contains up to 64 boards populated by Mentor Graphics custom chips designed to run the Veloce Strato OS (Fig. 3), which in turn hosts Veloce Strato Apps, including chip emulation. The air-cooled system is designed to use only 22.7 W per million gates, or about 50 kW per box. Multiple boxes will be connected using the high-speed Strato Link interface. Initially, boxes will be connected directly to each other. Future systems may take another approach, such as using more links or switches. The key is to maximize the bandwidth. The network of boxes is linked to a conventional host server.
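The power figures above can be turned into a rough back-of-the-envelope estimate. The sketch below uses only the two numbers quoted in the text (22.7 W per million gates and about 50 kW per box) and assumes, purely for illustration, that power scales linearly with gate count and that link overhead is ignored. The per-box capacity it derives is implied by those two figures, not a published specification, and the function names are hypothetical.

```python
import math

# Figures quoted in the article.
WATTS_PER_MILLION_GATES = 22.7
BOX_POWER_WATTS = 50_000          # "about 50 kW per box"
TARGET_MILLION_GATES = 15_000     # the 15-BG target

def implied_box_capacity_mg(box_power_w=BOX_POWER_WATTS,
                            w_per_mg=WATTS_PER_MILLION_GATES):
    """Gate capacity (in millions) implied by the per-box power budget."""
    return box_power_w / w_per_mg

def boxes_needed(target_mg, capacity_mg):
    """Whole boxes required, assuming capacity simply adds up across boxes."""
    return math.ceil(target_mg / capacity_mg)

def total_power_kw(target_mg, w_per_mg=WATTS_PER_MILLION_GATES):
    """Total power in kW, assuming linear scaling with gate count."""
    return target_mg * w_per_mg / 1000

if __name__ == "__main__":
    capacity = implied_box_capacity_mg()
    print(f"Implied capacity per box: {capacity / 1000:.1f} BG")
    print(f"Boxes for 15 BG: {boxes_needed(TARGET_MILLION_GATES, capacity)}")
    print(f"Power for 15 BG: {total_power_kw(TARGET_MILLION_GATES):.1f} kW")
```

Under these assumptions, a box works out to roughly 2.2 BG, so a 15-BG design would span about seven boxes and draw on the order of 340 kW. Real configurations may differ.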

Fig. 3

The Veloce Strato OS provides a range of services to apps used for chip design, verification, and emulation. These include compiler services like system synthesis, system partitioning, and place and route (P&R). These functions could be done on a conventional server, but less efficiently; the Veloce Strato hardware does them faster and with less power. The OS also provides core services like in-circuit emulation (ICE), test bench acceleration (TBX; Fig. 4), and virtual ICE support. These can be used in conjunction with debug services including waveform, livestreaming, and software state replay technology.

Fig. 4

Although Veloce Strato is designed to eventually handle 15 BG, it is equally applicable to smaller designs. That is one reason for having an OS: it manages the hardware resources so that multiple chip designs can be handled at the same time.

The Veloce SoC custom chips allow compiles to run more rapidly because they are easier to target than an FPGA-based system. A 30-MG design can be run three to four times per day, taking only five minutes to compile compared to two hours for an FPGA-based target. The Veloce compiler uses an algorithm called timing resynthesis to build emulation databases, which implement a zero-delay semantic model for a design that is also free of any possible setup or hold time violations.
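To put the compile times in perspective, the short sketch below compares the two figures quoted above (five minutes versus two hours). The eight-hour working window is an assumption made only for illustration, and the calculation counts compile time alone, ignoring run and debug time.

```python
# Compile times quoted in the article, in minutes.
CUSTOM_CHIP_COMPILE_MIN = 5
FPGA_COMPILE_MIN = 2 * 60

# Assumed working window (not from the article).
WORKDAY_MIN = 8 * 60

def speedup(slow_min, fast_min):
    """How many times faster the shorter compile is."""
    return slow_min / fast_min

def compiles_per_window(compile_min, window_min=WORKDAY_MIN):
    """Back-to-back compiles that fit in the window, compile time only."""
    return window_min // compile_min

if __name__ == "__main__":
    ratio = speedup(FPGA_COMPILE_MIN, CUSTOM_CHIP_COMPILE_MIN)
    print(f"Compile speedup: {ratio:.0f}x")
    print(f"FPGA-based compiles per window: {compiles_per_window(FPGA_COMPILE_MIN)}")
    print(f"Custom-chip compiles per window: {compiles_per_window(CUSTOM_CHIP_COMPILE_MIN)}")
```

The ratio of the quoted times is 24 to 1. In practice the number of useful iterations per day also depends on how long each emulation run and debug session takes.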

The Veloce StratoM hardware is already available and in use. Most installations have a single box, since that can easily handle many of today’s chip designs.