What you'll learn:
- What is a reset domain crossing?
- What is the best way to verify resets?
- The role of static reset analysis.
Resets are one of the most fundamental aspects of electronic design. The ability to initialize state elements in an application-specific integrated circuit (ASIC) or field-programmable gate array (FPGA) to known values is important for many reasons.
When the chip is first powered up, an externally applied power-on reset (POR) ensures a predictable startup sequence. The chip itself may initiate an internal hardware reset to clear problems or return to a known state. System-on-chip (SoC) designs with embedded processors often generate internal software resets as well. Scenarios in which internal resets are required include:
- Fatal error condition recovery
- Recovery from deadlock or watchdog-timer expiration
- Transient fault such as a particle hit or crosstalk
- Individual block power-on from a sleep or off state
- Initiation of a test or debug sequence
Careful design of reset logic is essential for proper chip operation. When the design contains multiple asynchronous resets, this task becomes much harder. Any portion of the chip with a unique reset signal is defined as a reset domain, and any signal traversing from one reset domain to another creates a reset domain crossing (RDC). Improper handling of RDCs can lead to serious problems. Several common beliefs about resets and RDCs are actually myths, and recognizing them as such is essential for designing chips correctly.
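The definition of a reset domain and an RDC can be made concrete with a minimal Python sketch. The `Flop` record, the register and reset names, and the `find_rdcs` helper are all hypothetical, invented for illustration; a real tool works on the elaborated design rather than a hand-written list, and this sketch assumes every register has exactly one asynchronous reset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flop:
    name: str
    reset: str  # name of the asynchronous reset driving this register


def find_rdcs(flops, paths):
    """Return the (source, destination) paths that cross reset domains."""
    reset_of = {f.name: f.reset for f in flops}
    return [(src, dst) for src, dst in paths if reset_of[src] != reset_of[dst]]


# Hypothetical design: two registers on rst_a, one on rst_b.
flops = [Flop("ctrl_q", "rst_a"), Flop("cnt_q", "rst_a"), Flop("stat_q", "rst_b")]
paths = [("ctrl_q", "cnt_q"), ("ctrl_q", "stat_q")]

rdcs = find_rdcs(flops, paths)
print(rdcs)  # [('ctrl_q', 'stat_q')]
```

Only the path into `stat_q` is flagged, because its source and destination registers are reset by different signals.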
1. Chips only have a few resets.
This was true at one time and may still be true for some small designs. However, for most chips, the number of resets has grown rapidly. Power management often relies on the ability to turn portions of the design on and off. Resets are also usually part of the response to safety-compromising conditions in applications such as autonomous vehicles and industrial controllers.
A recent survey by Synopsys showed that the average chip design now has 40-50 reset domains, and large SoCs may have hundreds. The more reset domains, the more RDCs and the greater the chance of design errors.
2. RDCs are a subset of CDCs.
The challenges with multiple asynchronous clocks and the associated clock domain crossings (CDCs) are well known, with many documented solutions. Since many designs have more clocks than resets, CDCs have tended to dominate the design process.
There’s a tendency to assume that RDCs are essentially the same, so that proper clock design will avoid reset issues. This isn’t the case. Clocks and resets interact, and CDCs and RDCs share some characteristics, including metastability as a failure mode. However, reset errors and metastability can occur on signals within a single clock domain. Special focused analysis and design techniques are needed for reset signals and RDCs.
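To illustrate why a clean CDC report says nothing about resets, the sketch below labels a register-to-register path by comparing clocks and resets independently. The dictionary layout and the signal names (`clk_core`, `rst_sw`, `rst_por`) are assumptions made up for this example.

```python
def classify(src, dst):
    """Label a register-to-register path by the kinds of crossing it makes."""
    labels = set()
    if src["clock"] != dst["clock"]:
        labels.add("CDC")
    if src["reset"] != dst["reset"]:
        labels.add("RDC")
    return labels


# Hypothetical registers: same clock, different asynchronous resets.
tx = {"clock": "clk_core", "reset": "rst_sw"}
rx = {"clock": "clk_core", "reset": "rst_por"}

print(classify(tx, rx))  # {'RDC'}
```

The path stays inside one clock domain, so a clock-only check passes it, yet it is still an RDC.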
3. Reset errors can be fixed in software.
Embedded software plays a key role in SoCs, providing a lot of the functionality and controlling many aspects of the hardware. It’s tempting to assume that any hardware bugs escaping to silicon can be fixed in software. RDC errors can’t be resolved this way, especially when internally generated hardware resets are involved. No software manipulation can stop metastability; the only solution is an expensive chip turn. All aspects of resets must be checked during pre-silicon verification, well before tapeout.
4. Resets can be verified in RTL simulation.
Simulation of the register-transfer-level (RTL) design within a testbench is the primary method for pre-silicon verification. This method can detect many types of functional errors early in the development process, when they’re relatively easy to fix with minimal resource hit or schedule impact.
Unfortunately, this isn’t the best way to verify RDCs. RTL simulations typically model almost no sub-cycle timing, so glitches or metastability that could cause reset problems are unlikely to occur. In addition, internally generated hardware resets may be hard to control from the testbench.
5. Resets can be verified in gate-level simulation.
Running simulation on the post-synthesis netlist with estimated delays back-annotated on all signals provides a high degree of timing accuracy. While some types of reset errors may be detected, the same reset control limitations from the RTL testbench are still an issue. In addition, gate-level simulation comes very late in the design and verification process, when errors are much more difficult to fix. Tests run much more slowly than in RTL simulation, errors are much harder to debug, and making fixes is likely to delay tapeout.
6. Reset metastability can be modeled in simulation.
Even in gate-level simulation with full timing, most metastability issues will be missed. Some verification teams have set up simulation regressions where resets and clocks vary over time. However, in actual chips, asynchronous resets and clocks can vary continually.
Any attempt to model that behavior with a series of discrete timing variations is by its nature incomplete. Repeating the complete regression suite many times with different reset alignments consumes a huge amount of time and resources. No matter the effort, no amount of simulation can ever guarantee that all RDC bugs will be found before silicon.
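The sampling gap can be shown with a small Python sketch. All numbers here are hypothetical: a 1000 ps clock period, and a 50 ps window starting 940 ps into the cycle during which a reset edge would be hazardous. The `sweep_hits` helper simply lists which swept reset offsets land inside that window; it does not model metastability itself, which a simulator would not necessarily show even when an offset does hit.

```python
def sweep_hits(period_ps, step_ps, window_start_ps, window_width_ps):
    """Return the swept reset offsets that land inside the vulnerable window."""
    window_end_ps = window_start_ps + window_width_ps
    return [
        offset
        for offset in range(0, period_ps, step_ps)
        if window_start_ps <= offset < window_end_ps
    ]


# Hypothetical sweep: 10 regression runs versus 100 regression runs.
coarse = sweep_hits(1000, 100, 940, 50)
fine = sweep_hits(1000, 10, 940, 50)

print(len(coarse), len(fine))  # 0 5
```

The 10-run sweep never places the reset edge inside the window. The 100-run sweep does, at ten times the regression cost, and a narrower window would slip through that one as well.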
7. RDC analysis is trivial.
A static-analysis tool is the best way to find reset and RDC issues. It provides exhaustive verification, something simulation can never do, and can run very early in the project, well before the testbench is ready. Such analysis is hardly a trivial task, however. The tool that performs it must have powerful engines able to handle very complex designs of more than a billion gates. Capacity for a few million gates isn’t nearly enough.
Tracing resets, finding RDCs, and identifying bug risks is only the start of the process. The static-analysis tool also must be able to determine whether the designers used proper techniques to avoid errors on every RDC deemed at risk. These design techniques include holding data so that it doesn’t change at the wrong time and using control signals or gated clocks to keep incorrect data from being read.
8. A few levels of reset analysis are sufficient.
In earlier generations of reset static-analysis tools, the number of sequential levels traced for each reset and RDC was limited. This is highly likely to miss reset bugs in today’s designs.
What’s required is “any depth analysis” that detects RDC issues regardless of the sequential depth between the registers in different reset domains. This is especially useful when data paths in FIFOs and pipelines use registers that have no reset capability. An RDC can occur between the source register and the destination register, so the reset static-analysis tool must be able to trace through any number of intervening non-resettable stages to check for errors.
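A minimal sketch of the idea, assuming the design has already been reduced to a register-level graph, is a breadth-first trace that passes through non-resettable stages. The `trace_rdcs` function, its `max_depth` parameter (which mimics a depth-limited tool), and the register names are hypothetical and not taken from any real product.

```python
from collections import deque


def trace_rdcs(resets, fanout, source, max_depth=None):
    """Find reset domain crossings reachable from `source`.

    resets maps register name -> reset name, or None for a register with no reset.
    fanout maps register name -> list of registers it drives.
    Returns (source, destination, sequential_depth) tuples.
    """
    crossings = []
    seen = {source}
    queue = deque((reg, 1) for reg in fanout.get(source, []))
    while queue:
        reg, depth = queue.popleft()
        if reg in seen or (max_depth is not None and depth > max_depth):
            continue
        seen.add(reg)
        reset = resets.get(reg)
        if reset is None:
            # Non-resettable stage: keep tracing through it.
            queue.extend((nxt, depth + 1) for nxt in fanout.get(reg, []))
        elif reset != resets[source]:
            crossings.append((source, reg, depth))
        # A register on the source's own reset ends the trace on this branch.
    return crossings


# Hypothetical pipeline: three non-resettable stages between the two domains.
resets = {"src": "rst_a", "p1": None, "p2": None, "p3": None,
          "dst": "rst_b", "loc": "rst_a"}
fanout = {"src": ["p1", "loc"], "p1": ["p2"], "p2": ["p3"], "p3": ["dst"]}

any_depth = trace_rdcs(resets, fanout, "src")
limited = trace_rdcs(resets, fanout, "src", max_depth=2)
print(any_depth)  # [('src', 'dst', 4)]
print(limited)    # []
```

With the depth capped at two levels, the crossing four stages away goes unreported.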
9. Static reset analysis is hard to control.
The user must provide several types of information for the analysis to find all reset bugs accurately, including clock timing, reset timing, and any reset sequencing. It would indeed be a burden on users to have to specify all of this in a proprietary format for use in just one tool.
Fortunately, the industry-standard Synopsys Design Constraints (SDC) format already contains most of this information for use by static timing analysis, logic synthesis, and place and route. The reset-analysis tool must be capable of reading SDC files to make setup and control as simple as possible.
10. Power-control logic must already be in the RTL design.
In modern chip development flows, power-control logic is added to the design automatically during logic synthesis and place and route based on the information in the IEEE Std. 1801-2015 Unified Power Format (UPF) file. The reset-analysis tool must be able to read this file, interpret the UPF description, and add the appropriate structures (clock-gating cells, isolation cells, etc.) to its internal model to provide accurate checks. For example, if an isolation cell on an RDC signal prevents signal transitions and therefore any metastability, the tool must not report a violation.
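The isolation example can be sketched as a simple filter over candidate crossings. The record layout, the signal names, and the `report_violations` helper are hypothetical, and the set of isolated signals is assumed to have been extracted from the UPF file already. The sketch also assumes the isolation cell is enabled whenever the source reset asserts, which a real tool would have to prove rather than take for granted.

```python
def report_violations(crossings, isolated_signals):
    """Keep only the crossings whose signal is not clamped by an isolation cell."""
    return [c for c in crossings if c["signal"] not in isolated_signals]


# Hypothetical candidate crossings found by structural tracing.
crossings = [
    {"signal": "pd_cpu/irq_q", "src_reset": "rst_cpu", "dst_reset": "rst_sys"},
    {"signal": "pd_dsp/done_q", "src_reset": "rst_dsp", "dst_reset": "rst_sys"},
]
# Hypothetical: signals covered by an isolation strategy in the UPF description.
isolated_signals = {"pd_cpu/irq_q"}

violations = report_violations(crossings, isolated_signals)
print([v["signal"] for v in violations])  # ['pd_dsp/done_q']
```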
11. Reset analysis generates a lot of “noise.”
All static-analysis tools report many warnings as well as definite errors, especially when first run on a new design. The best solutions use several methods to ensure accurate analysis and minimize the number of reported violations that turn out not to be serious issues. These include:
- SDC support to obtain precise clock and reset constraints
- UPF support to accurately handle power logic
- Ability to recognize all proper RDC design techniques
- Violation grouping to enable faster examination and debug
- Standard scripting, filtering, waiving and customization options
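As a rough illustration of the last two items, the sketch below drops waived signals and groups the remaining violations by reset-domain pair, so that one root cause is reviewed once rather than once per bit. The `group_violations` helper, the record layout, and the signal names are hypothetical and do not reflect any particular tool's report or waiver format.

```python
from collections import defaultdict


def group_violations(violations, waived_signals=()):
    """Drop waived signals, then group the rest by (source reset, destination reset)."""
    groups = defaultdict(list)
    for v in violations:
        if v["signal"] in waived_signals:
            continue
        groups[(v["src_reset"], v["dst_reset"])].append(v["signal"])
    return dict(groups)


# Hypothetical raw report: four violations, one of them reviewed and waived.
violations = [
    {"signal": "fifo_d[0]", "src_reset": "rst_a", "dst_reset": "rst_b"},
    {"signal": "fifo_d[1]", "src_reset": "rst_a", "dst_reset": "rst_b"},
    {"signal": "dbg_en", "src_reset": "rst_dbg", "dst_reset": "rst_b"},
    {"signal": "irq", "src_reset": "rst_c", "dst_reset": "rst_b"},
]

groups = group_violations(violations, waived_signals={"dbg_en"})
for pair, signals in groups.items():
    print(pair, signals)
```

Four raw violations collapse into two groups to examine: the `rst_a` to `rst_b` FIFO bus and the single `rst_c` to `rst_b` interrupt.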
Synopsys’s reset static-analysis solution, VC SpyGlass RDC, follows all of the recommendations mentioned. It addresses all of the myths around resets and RDCs, while providing a solution for exhaustive reset verification, with low-noise results and debug in the industry-standard Verdi Automated Debug environment. The number of resets and reset domains continues to grow, making this solution well-suited for RTL chip-development flows.