[Design Application]
Verify Systems With Functional Synchronization
Loosely Coupled Tools Improve Performance While Maintaining The Functional Integrity Of The System
In today's system-on-a-chip (SoC) world, there is an increasing need to bring system-level verification into the design process as early as possible. Teams that succeed in integrating their systems in tandem with silicon development will win the time-to-market battle by capturing essential, early design wins.
Every SoC device includes a processing element, which introduces embedded software modules into the system verification environment. Meaningful system verification requires the integration of those software modules into the overall product verification environment.
This article explores an innovative approach to coupling the embedded software development environment, with its high-speed execution and familiar debugging tools, to an ASIC emulator. Through loose coupling, both tools achieve maximum performance, while maintaining the functional integrity of the system. The use model for each tool is also preserved.
The method used to accomplish this loose coupling is called functional coverification. Figure 1, the example for this discussion, is a high-level block diagram of the hardware/software system. The embedded microcontroller and its bus-interface hardware are assumed to be functionally correct, and are not the subject of this verification project. They are replaced by a virtual communications conduit connecting the embedded software system to the ASIC hardware-under-test.
In hardware emulation, one can considerably reduce the emulation gate counts by not modeling processor circuitry. Note, however, that an instruction-set simulator may be used to act as the processor in the software verification tool.
Multiple Tools
To better understand a loosely coupled verification system, one must have some appreciation of the coverification problem, as well as the techniques used for synchronizing disparate tools. Multiple verification tools include event-driven simulators, instruction-set simulators, standalone behavioral models, ASIC emulators, and embedded software simulators.
Whenever these tools are employed in a large system simulation, with each one modeling some portion of that system, there is an inherent synchronization problem. The models running on one tool may not exhibit adequate performance in relation to components running on other tools, yet that relative performance is a must for maintaining functional integrity.
For example, assume that an event-driven simulator (Verilog or VHDL) is modeling an I/O port at 20 cps, and the driving software is running on an instruction-set simulator at 1 Mcps. Suppose that, in the real application, the processor and port both run at 100 Mcps. If the software writes a command to the output port, which produces a result in a status register on the next clock, it is safe to access the status port soon after issuing the command, without checking whether the status is indeed ready.
But in this simulation, the design breaks. The status port is accessed before it is ready, because the hardware is performing 50,000 times slower than the software (20 cps against 1 Mcps). The software system runs erroneously ahead, creating this artificial problem. Clearly, if both verification tools ran their models at the same speed, either at 20 cps or 1 Mcps, the problem would be avoided. When the tools naturally run at different speeds, as in this case, some cross-tool mechanism must be used to keep them in sync. Several techniques are available: time synchronization, cycle synchronization, functional synchronization, or some variation of these.
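The failure can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name and the five-cycle software delay between the command write and the status read are assumptions, not figures from the design. It compares the wall-clock moment the status becomes valid with the moment the software reads it.

```python
def status_ready_at_read(hw_cps, sw_cps, hw_latency_cycles=1, sw_delay_cycles=5):
    """Return True if the status register is valid when the software reads it.

    hw_latency_cycles: hardware clocks needed to produce the status (1 in the text).
    sw_delay_cycles:   software cycles between the command write and the status
                       read (5 is an assumed, illustrative value).
    """
    # Compare wall-clock times without floating point:
    #   sw_delay_cycles / sw_cps >= hw_latency_cycles / hw_cps
    return sw_delay_cycles * hw_cps >= hw_latency_cycles * sw_cps


# Real application: processor and port both run at 100 Mcps.
real_system_ok = status_ready_at_read(hw_cps=100_000_000, sw_cps=100_000_000)

# Coverification: port simulated at 20 cps, software at 1 Mcps.
cosim_ok = status_ready_at_read(hw_cps=20, sw_cps=1_000_000)

print(real_system_ok, cosim_ok)
```

With matched speeds the read lands after the status is valid. With the mismatched tool speeds it lands long before, even though the design itself is unchanged.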
Time synchronization maintains a common time base across all tools. It is extremely compute-intensive, and rarely allows performance of more than a few hundred cycles per second, even when simulation accelerators or emulators are used. In the above example, time synchronization would indiscriminately slow the software system to 20 cps. This would happen even when the executing code doesn't access the I/O system, which makes it an unacceptable solution.
With cycle synchronization, no tool can advance to the next clock cycle until all the tools have completed the current cycle. This technique supports higher speeds, in the 1000 to 10,000 cps range. As with any cycle verification technique, however, problems arise when asynchronous behavior is present (multiple clock domains, unclocked logic, etc.) across the interface. Performance is also still inadequate.
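The lockstep rule of cycle synchronization can be pictured with a barrier. The following is a minimal sketch, not a model of any particular product: two threads stand in for the tools, and the names `run_lockstep`, "simulator", and "iss" are hypothetical.

```python
import threading


def run_lockstep(num_cycles, tool_names=("simulator", "iss")):
    """Run each 'tool' in its own thread, one barrier wait per clock cycle."""
    barrier = threading.Barrier(len(tool_names))
    log = []                      # (cycle, tool) in the order work was done
    log_lock = threading.Lock()

    def tool(name):
        for cycle in range(num_cycles):
            with log_lock:
                log.append((cycle, name))   # stand-in for one cycle of modeling
            barrier.wait()                  # hold until every tool finishes the cycle

    threads = [threading.Thread(target=tool, args=(n,)) for n in tool_names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


lockstep_log = run_lockstep(num_cycles=4)
```

Because every tool must reach the barrier before any tool proceeds, the logged cycle numbers never go backward. The cost is that the fastest tool waits for the slowest on every single cycle.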
The functional synchronization technique, by comparison, features free-running underlying tools, with no tool-based synchronization. For the verification to maintain integrity (no one tool runs erroneously ahead in verification time), the system-under-test itself must maintain functional interlocks. Because synchronization occurs only when necessary, as determined by the system-under-test, this technique offers the highest performance potential, with speeds approaching the maximum for each tool. No common time base is maintained across the tools.
The system being verified must provide the functional interlocks. Thus, careful attention is given to finding the correct interfaces to partition the system among the tools. The interfaces between the hardware running on the emulator, and the software running on the workstation, are one example of this. Hopefully, the selected interfaces are already interlocked in the design, or can easily be modified to add the required handshaking, which results in a more robust design.
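A functional interlock of this kind might look like the sketch below. It assumes a hypothetical `InterlockedPort` with a ready flag, and an invented hardware function (command plus one) chosen purely for illustration. The software thread runs freely and blocks only at the moment it actually reads the status.

```python
import threading
import time


class InterlockedPort:
    """Hypothetical I/O port whose status read is interlocked by a ready flag."""

    def __init__(self):
        self._command = None
        self._status = None
        self._cmd_posted = threading.Event()
        self._status_ready = threading.Event()

    # Software side
    def write_command(self, value):
        self._status_ready.clear()      # any old status is now stale
        self._command = value
        self._cmd_posted.set()

    def read_status(self):
        self._status_ready.wait()       # the only point where software stalls
        return self._status

    # Hardware side (stand-in for the emulated ASIC)
    def serve_one(self, hw_delay_s):
        self._cmd_posted.wait()
        self._cmd_posted.clear()
        time.sleep(hw_delay_s)          # hardware is much slower than software
        self._status = self._command + 1   # invented function, for illustration
        self._status_ready.set()


def run_demo(hw_delay_s=0.01):
    port = InterlockedPort()
    hw = threading.Thread(target=port.serve_one, args=(hw_delay_s,))
    hw.start()
    sum(range(1000))                    # free-running software, no port access
    port.write_command(41)
    status = port.read_status()
    hw.join()
    return status


status = run_demo()
```

The result is the same however slow the hardware side is, because the handshake, rather than relative tool speed, decides when the status may be consumed.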
Thus, it seems that functional synchronization with emulation offers the best solution for synchronizing these disparate tools. One of the first steps in applying this technique is to find the proper functional interface. Fortunately, functional interfaces naturally exist between hardware and software subsystems. They occur at the I/O port level, where software commands are transformed into hardware transactions. These interfaces are designed to be at least partially interlocked by the chip designer, because response times between the software and hardware modules, in actual applications, cannot be precisely predicted by either component.
For example, assume that the software writes two words to a data port in response to an interrupt from the hardware. If the hardware assumes that the first word write would occur exactly n1 clocks after the interrupt, and the second word would be written exactly n2 clocks later, the design would be defective. Software systems do not offer that precision. However, it is reasonable, and often necessary, to place a maximum limit on the response time.
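The difference between an exact arrival time and a maximum response time can be sketched as follows. This is an assumed model, not the article's design: a queue stands in for the data port, and `hardware_receive`, `software_isr`, and the word values are hypothetical. The hardware side accepts each word whenever it arrives and only flags an error if the limit is exceeded.

```python
import queue
import threading
import time


def hardware_receive(port, max_response_s, words_expected=2):
    """Collect words from software, tolerating any delay up to max_response_s."""
    words = []
    for _ in range(words_expected):
        try:
            words.append(port.get(timeout=max_response_s))
        except queue.Empty:
            raise TimeoutError("software exceeded the maximum response time") from None
    return words


def software_isr(port, words, delay_s):
    """Interrupt handler stand-in: writes each word after an unpredictable delay."""
    for word in words:
        time.sleep(delay_s)
        port.put(word)


port = queue.Queue()
sw = threading.Thread(target=software_isr, args=(port, [0xA5, 0x5A], 0.01))
sw.start()
received = hardware_receive(port, max_response_s=1.0)
sw.join()
```

Nothing here depends on the words arriving at clock n1 or n2. Only the upper bound is enforced, which is the kind of constraint software can realistically meet.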