We'd also like to recognize the fourth co-author of this article, Gabriel Chidolue, who is a Solutions Architect in the Questa functional verification group at Mentor, a Siemens Business.
The debug challenges of low-power verification are complicated by sophisticated power-management architectures and techniques used in low-power designs. Designers use complex power-aware techniques such as power gating, voltage scaling, and body biasing to save power and minimize heat dissipation. As a result, the power-aware verification of chips is a fairly complex process.
The Unified Power Format (UPF) standard has provided many new capabilities that have eased the power-intent specification process as well as enabled new power-management verification flows aligned with the needs of IP-based SoC design today. Unfortunately, the evolution of the UPF standard hasn’t reduced the complexity of the power-management verification task whatsoever. Moreover, traditional debug technologies and methods assume that a design is “always-on” and thus fail to address new and complex power-related issues.
Let’s take a look at six low-power debug challenges and common pitfalls, along with some ideas on how to make them easier to solve or avoid altogether. While none of these debug challenges will be new to those involved in power-management verification, our intent is to provide a practical guide to the uninitiated based on actual user experiences.
1. Unwanted X Values
A major debug problem in low-power simulations is root-cause analysis of unknown (X) values. X values can appear on signals in low-power simulations for several reasons: an incorrect UPF specification (missing isolation, level-shifter, or retention cells, or improper power-domain partitioning), or differences in simulation semantics between UPF 1.0 and UPF 2.0 related to the use of initial blocks (re-evaluation and races between signal initialization and domain power-up).
To distinguish them from normal unknown values, most simulation tools are able to highlight unknown values caused by power-domain corruption in wave windows. In normal simulations, unknown X signal values are typically displayed using either a single mid-high red line or a red outlined box around the unknown value region in a wave window. In low-power simulations, the entire low-high region is filled in using red-colored, coarse, or fine-grain cross-hatching to indicate that the X value is a result of direct power-domain corruption (Fig. 1).
Fig. 1. The entire low-high region in a simulation wave window is filled in using red-colored, coarse, or fine-grain cross-hatching to indicate that the X value is a result of direct power-domain corruption.
Typically, the corruption highlighting is only visible on the direct output of the driving logic and isn’t displayed on the outputs of subsequent logic that the corrupted net fans out to. Even when an unknown X value signal isn’t highlighted in the wave, the ability to trace the net connection backwards to where a highlighted signal is visible can greatly simplify the task of determining the origin of the X value. Most EDA vendors provide a wave-compare tool to identify the root cause of these unwanted X values.
If wave compare isn’t available, another useful technique for catching unwanted X values is correlating their occurrence with changes in power-domain simstates or power states and in power-control signals, including those for power switches, isolation enables, and retention save/restore. Most power-aware simulators provide the ability to print low-power related messages. These messages typically report changes of state in supply nets/ports, power switches, various power-control signals, and power domains. Assertions also help in detecting sources of X values due to issues related to power-control sequencing.
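The correlation technique can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the event list, signal names, message text, and the 10-time-unit window are all hypothetical, and a real flow would extract this data from the simulator’s low-power messages and a waveform dump rather than hard-code it.

```python
from bisect import bisect_right

# Hypothetical sample data (time, description); not real simulator output.
power_events = [
    (100, "PD_cpu: power switch sw_cpu turned OFF"),
    (250, "PD_cpu: isolation enable iso_en deasserted"),
    (400, "PD_mem: retention save asserted"),
]
# Hypothetical times at which signals first went to X.
x_onsets = [
    (105, "/tb/dut/cpu/out1"),
    (252, "/tb/dut/top/data_in"),
    (900, "/tb/dut/mem/q"),
]

def nearest_preceding_event(time, events, window):
    """Return the latest event at or before `time` if it lies within `window`."""
    times = [t for t, _ in events]
    idx = bisect_right(times, time) - 1
    if idx < 0:
        return None
    event_time, message = events[idx]
    return (event_time, message) if time - event_time <= window else None

def correlate(onsets, events, window=10):
    """Map each X-valued signal to the power event that most closely precedes it."""
    events = sorted(events)
    return {sig: nearest_preceding_event(t, events, window) for t, sig in onsets}

report = correlate(x_onsets, power_events)
for signal, event in report.items():
    print(signal, "->", event if event else "no power event within window")
```

An X that has no nearby power event, like the last entry here, is a hint that the value may not be power-related at all and deserves conventional driver tracing.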
Besides low-power assertion checks that catch possible sources of X values during simulation, static analysis performed while processing the UPF file can determine whether any isolation cells or level shifters are missing, redundant, or invalid. An un-isolated power-domain port will propagate an X value to other power domains when its domain is powered down.
In a non-power-aware simulation, the driver of a signal is usually some RTL logic. However, in the case of a power-aware simulation, it could be anything ranging from RTL logic to UPF-inserted cells. When a signal has an unexpected value, it might be the effect of power-aware activity on that signal, or it could be a propagated effect of power-aware activity on one of its driver signals or driving logic. In these cases, driver tracing is a useful way to find the driver signal and its value.
Another useful feature is dataflow/schematic debugging of a signal. This helps with tracing the value of a signal that’s being driven from distant logic. In power-aware debugging, it also displays the UPF-inserted cells in the dataflow path. Figure 2 is an example of a dataflow from a simulator showing an isolation cell and a level shifter encountered in the dataflow path of signal ‘/tb/out1’.
Fig. 2. A dataflow from a simulator shows an isolation cell and a level shifter encountered in the dataflow path of signal ‘/tb/out1’.
2. Some Signals Aren’t Corrupted
A very common low-power debug issue occurs when a certain part of a design fails to switch off and the logic inside that part is never corrupted, although the user expects it to be switched off and show corrupted values. There could be multiple reasons for such behavior.
With an incorrect power-domain specification, the design element of concern has been put in a power domain that’s not switchable. The first step for debugging such an issue is to identify the power domain to which this region belongs. This can be done either by looking at the power-aware textual reports generated by the tool or using the tool’s GUI capabilities. The next step is to determine if the power of that domain is being switched off or not. It can be verified using the dynamic messages reported by the tool for whenever a power domain changes its state.
Exclusion of a behavioral model can occur when EDA vendors provide mechanisms to skip corruption on certain portions of a design, such as specifying DONT_TOUCH elements via the UPF or excluding elements from power-aware behavior using a separate file. In such cases, look at tool-generated reports to find out whether the design element has been excluded by the tool.
Disabling simulation semantics of a design element means the tool will not impart any power shut-off/corruption on that element. The tool disables the power-aware simulation semantic and, hence, there will be no tool-injected corruption. All such regions can be easily identified using messages reported by the verification tool, such as:
** Note: (vopt-9693) Power Aware simulation semantics disabled for chip_top/u_hm_top_0/u_ip_1
3. Illegal Power States and Transitions
Today’s low-power designs have many operational modes. One of the major debugging tasks for a low-power design is verification of the design’s operational power states. This requires verifying that each defined power state of every power domain has been covered and functions properly. It also requires verification of all power-state combinations across all domains that comprise each operational power state. The complexity of the verification process increases multifold as designs continue to grow in both the number of power domains and operational power states.
UPF addresses these challenges by providing the add_power_state and describe_state_transition commands. Not only does add_power_state support bias states, hierarchical power-state creation, and an incremental update capability, it also allows any named power state to be declared as legal or illegal. When these two commands are used, low-power simulators must issue run-time error messages whenever an illegal power state is reached or an illegal power-state transition occurs.
The UPF also stipulates that an unnamed or undefined power state is illegal. The detection of undefined power states is especially useful if an unintended power state occurs when transitioning from one defined state to another. An undefined power state during a legal power-state transition may result from a race between changing a UPF supply via the UPF package supply_on/supply_off functions and switching a power-control logic signal at the same time. A race-induced undefined power state likely indicates an area where voltage ramp-up/down times versus logic switch times must be accounted for to ensure proper operation of the design.
However, at times, not all of the fundamental power states of an IP are defined; this can result in unexpected illegal-state messages during simulation. Users can make use of UPF 3.0 syntax that allows the set of power states for a given object to be marked as complete, which indicates that all fundamental states of the object have been defined as named power states.
If the set of power states for an object is complete, then it shall be an error for the UNDEFINED power state to be the current power state of that object. It’s also an error if a new fundamental power state is defined after the power states are marked complete. If the power states of a given object aren’t marked as complete, it’s assumed that not all fundamental states have been defined, and the occurrence of an undefined state isn’t flagged as erroneous behavior.
Likewise, the describe_state_transition (UPF 2.0/2.1) and add_state_transition (UPF 3.0) commands enable any transition between two power states to be declared legal or illegal. UPF 3.0 also allows for defining a group of related power states, which can then be used in the add_power_state command. The legal power states of a power-state group define the legal combinations of power states of other objects in this scope or the descendant subtree; i.e., those combinations of states of objects that can be active at the same time during operation of the design. This command can be used to define the illegal power-state combinations.
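To make the checking rules concrete, here is a minimal Python model of the run-time behavior described above. It’s a sketch, not a UPF API: the class, the state names (RUN, SLEEP, OFF, TURBO_LOW_V), and the illegal OFF-to-RUN transition are all hypothetical. It assumes that an unnamed state is reported only when the state set is marked complete, following the UPF 3.0 rule discussed above.

```python
class PowerStateMonitor:
    """Toy run-time checker for one object's power states (not a UPF API)."""

    def __init__(self, legal, illegal, illegal_transitions, complete=False):
        self.legal = set(legal)
        self.illegal = set(illegal)
        self.illegal_transitions = set(illegal_transitions)
        self.complete = complete  # models marking the state set as complete
        self.current = None
        self.errors = []

    def update(self, time, state):
        """Record a state change and log any rule violations."""
        if state in self.illegal:
            self.errors.append(f"{time}: illegal power state {state}")
        elif state not in self.legal and self.complete:
            # An unnamed state is UNDEFINED; an error only for complete sets.
            self.errors.append(f"{time}: UNDEFINED power state ({state})")
        if self.current is not None and (self.current, state) in self.illegal_transitions:
            self.errors.append(f"{time}: illegal transition {self.current} -> {state}")
        self.current = state


monitor = PowerStateMonitor(
    legal={"RUN", "SLEEP", "OFF"},
    illegal={"TURBO_LOW_V"},
    illegal_transitions={("OFF", "RUN")},  # hypothetical: must wake via SLEEP
    complete=True,
)
for time, state in [(0, "RUN"), (10, "SLEEP"), (20, "OFF"), (30, "RUN"), (40, "GLITCH")]:
    monitor.update(time, state)

for message in monitor.errors:
    print(message)
```

With complete=False, the same sequence would report only the illegal transition, which mirrors why an incompletely specified IP can hide unintended states.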
4. Power-Intent Specification Complexities
The specification of power intent for power management of low-power designs has been addressed by the UPF. However, the UPF standard is still evolving, with new features, concepts, and clarifications being added over the releases. This evolution often creates backward-compatibility, semantic-difference, and migration issues that are difficult to debug.
In UPF 1.0, the UPF supplies defaulted to the ON state. Many verification and design engineers involved in UPF 1.0-based power-aware simulations have unknowingly relied on this fact and, as a result, there has been a tendency to not use the UPF package-defined supply_on function to explicitly turn them on. Thus, a previously passing UPF 1.0-based simulation might fail after a user switches to UPF 2.0-based power-aware simulation semantics.
A common reason for many of these failures is that in UPF 2.0, the UPF supplies default to the OFF state, causing all power domains to be in a CORRUPT simstate. This common migration issue can easily be avoided by using the UPF package-defined supply_on function to explicitly set both UPF state and voltage values for all of the created UPF supplies.
Isolation of lower-boundary power-domain ports is another migration issue. In UPF 1.0, the set_isolation -applies_to inputs/outputs port filters only considered power-domain ports that were aligned on a module boundary. In other words, only the input and output ports of modules were isolated. In UPF 2.0, these isolation port filters have been extended to include the lower-boundary input and output power-domain ports (those of child module instances in different power domains). Lower-boundary power-domain port isolation can potentially cause simulation failures as a result of unintended back-to-back isolation cells inferred by tools, especially if isolation strategies were included for any lower-level power domains.
Another debug challenge in power-intent specification arises from the use of list/wildcard expansion in various UPF commands. An incorrect list of signals is often created as a side effect of using the wrong wildcard pattern. This problem can be avoided if the user relies on the UPF command find_objects to create a list of signals and uses the Tcl command puts to print the contents of the element list.
Another way to get the list of expanded signals is to use the save_upf command. With the save_upf UPF command, the verification tool will dump out interpreted commands in a new UPF file, which will then contain lists of expanded elements. Another safe bet is to avoid non-LRM wildcard usage and rely on the UPF command find_objects.
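The wildcard pitfall is easy to reproduce outside of any tool. The Python sketch below uses the standard fnmatch module purely as a stand-in for pattern expansion; it isn’t UPF, its matching rules (for example, whether * crosses hierarchy separators) may differ from those of find_objects, and the port names are hypothetical. The point is the habit it illustrates: expand the pattern, print the result, and compare it with what was intended.

```python
from fnmatch import fnmatchcase

# Hypothetical port names in a design hierarchy.
design_ports = [
    "top/u_cpu/data_out",
    "top/u_cpu/data_out_valid",
    "top/u_cpu/dbg_data_out",
    "top/u_mem/data_out",
]

def expand(pattern, names):
    """Return every name that matches the shell-style pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]

# The intent is to select only the CPU's data_out port.
too_broad = expand("top/u_cpu/*data_out*", design_ports)
intended = expand("top/u_cpu/data_out", design_ports)

print("pattern with wildcards:", too_broad)
print("exact name:            ", intended)
print("unintended extras:     ", sorted(set(too_broad) - set(intended)))
```

Printing the difference between the expanded list and the intended list makes an over-broad pattern visible before it turns into an unexpected isolation or retention strategy.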
In the case of a macro model in the design, the power supplies along with the power-aware functionality are present inside the model itself. When integrating this macro model in an SoC, the integrator must connect these HDL supplies to the UPF nets.
Since the UPF supplies are of type supply_net_type, having state and voltage values, and the supplies defined in HDL are of wire type, the connection between a UPF net and an HDL net requires a value conversion table (VCT). The VCT defines the mapping between the state of the UPF net and the value of the HDL port/net. The user can either rely on verification tools to apply a default VCT or explicitly specify which VCT to use with the UPF command connect_supply_net -vct. Be aware that a problem arises when the same VCT is used for power, ground, pwell, and nwell supply nets. This causes power-up failure because ground/nwell supply nets are active low and expect a ground-specific VCT.
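The effect of applying a power VCT to a ground net can be shown with a small Python model. Everything in it is an assumption made for illustration: the two tables are simplified stand-ins rather than the standard’s predefined VCTs, and the macro model is a hypothetical one that considers itself powered only when its VDD pin is 1 and its VSS pin is 0.

```python
# Simplified, hypothetical conversion tables: UPF supply state -> HDL logic value.
POWER_VCT = {"OFF": "0", "FULL_ON": "1"}
GROUND_VCT = {"OFF": "1", "FULL_ON": "0"}  # ground pins are active low

def hdl_value(upf_state, vct):
    """Convert a UPF supply state to an HDL value; unmapped states become 'x'."""
    return vct.get(upf_state, "x")

def macro_is_powered(vdd_pin, vss_pin):
    """Hypothetical macro model: powered only when VDD is 1 and VSS is 0."""
    return vdd_pin == "1" and vss_pin == "0"

# Both UPF supplies are fully on in each case.
correct = macro_is_powered(
    hdl_value("FULL_ON", POWER_VCT),
    hdl_value("FULL_ON", GROUND_VCT),
)
same_vct_everywhere = macro_is_powered(
    hdl_value("FULL_ON", POWER_VCT),
    hdl_value("FULL_ON", POWER_VCT),  # mistake: power VCT applied to the ground net
)

print("ground-specific VCT:", "powers up" if correct else "stays off")
print("power VCT on ground:", "powers up" if same_vct_everywhere else "stays off")
```

In the second case, the ground pin is driven to 1 even though the UPF ground supply is fully on, so the model never sees a valid supply pair.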
5. Supply Network Issues
At the implementation stage, the supply network in a power-aware design is often huge and highly complex. As a result, improper connections and other supply-network issues (whether present only in the UPF files or already implemented in the design) can be difficult to debug. You can debug the supply connections using either static or dynamic debugging methods. A good UI and connection reports from EDA tools help a lot here.
6. Power-Up Failures
A common debug scenario occurs when a low-power design fails to power up after a power-down period. The problem can arise because of missing or incorrect isolation/level shifters, incorrect retention behavior, or macro-cell corruption.
Missing or incorrect isolation/level shifters can be debugged by performing static verification at compile time, using tool-generated assertions, and employing UPF bind checkers. Many tools statically determine the need for isolation and level-shifter cells at domain boundaries from the power state tables (PSTs) and power states described in the UPF.
Fig. 3. Missing or incorrect isolation/level shifters can be debugged by performing static verification at compile time, using tool-generated assertions, and employing UPF bind checkers.
The PSTs describe the valid interacting states between two domains. These states clearly define the voltage ranges in which two domains are interacting and whether one domain is ON in relation to another domain or not. If a signal is going from a domain of a low voltage range to a high voltage range, then there’s a need for a level shifter with the “low_to_high” rule. Similarly, a need arises for the “high_to_low” rule for signals going from high voltage to low voltage. It’s an error scenario if no level-shifter strategy is specified or a level-shifter strategy with a different rule is specified (Fig. 3).
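The level-shifter rule check can be expressed as a short Python sketch. It’s a simplification under stated assumptions: each domain is reduced to a single (min, max) voltage range rather than a full PST, overlapping ranges are treated as needing no shifter, and the function names are hypothetical. The rule values low_to_high, high_to_low, and both follow the set_level_shifter -rule options.

```python
def required_rule(source_range, sink_range):
    """Return the level-shifter rule a crossing needs, or None if none is needed.

    Each range is a (min_voltage, max_voltage) tuple for one power domain.
    """
    src_min, src_max = source_range
    snk_min, snk_max = sink_range
    if src_max < snk_min:
        return "low_to_high"
    if src_min > snk_max:
        return "high_to_low"
    return None  # ranges overlap; this simplified model asks for no shifter

def check_crossing(source_range, sink_range, strategy_rule):
    """Compare the needed rule with the rule of the specified strategy, if any."""
    needed = required_rule(source_range, sink_range)
    if needed is None:
        return "ok"
    if strategy_rule is None:
        return f"error: missing level shifter ({needed} needed)"
    if strategy_rule not in (needed, "both"):
        return f"error: rule {strategy_rule} specified, {needed} needed"
    return "ok"

# Hypothetical crossings: (source range, sink range, rule of the specified strategy).
print(check_crossing((0.7, 0.8), (1.0, 1.1), "low_to_high"))
print(check_crossing((0.7, 0.8), (1.0, 1.1), None))
print(check_crossing((1.0, 1.1), (0.7, 0.8), "low_to_high"))
```

The second and third calls correspond to the two error scenarios described above: no strategy at all, and a strategy with the wrong rule.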
SystemVerilog Assertions (SVAs) are a very powerful way to achieve dynamic verification of low-power designs, with automated, tool-generated assertions that check for missing or incorrect isolation cells and level shifters at run time. For each interface signal at the domain boundary, an assertion is inserted that checks the need for isolation and level-shifter cells.
Assertions can be used to validate power-control logic sequences and ensure that specific requirements are met before and after power-mode transitions. The UPF also provides a way to add custom assertions using the bind_checker command, allowing you to write your own assertions in a module and then bind that module to boundary instances to check whether a signal is isolated or not.
The UPF bind_checker command also provides the easiest way to debug incorrect retention behavior by allowing you to dump custom retention assertions using bind_checker with UPF generics. UPF generics provide an automated way to specify generic bind_checker statements for all sequential elements of a design, so you don’t have to worry about clock/reset conditions for each sequential element. Simulation tools automatically identify the clock/reset conditions for each sequential element and apply the bind_checker statement accordingly.
The corruption of macro cells is governed by Liberty attributes that use the primary supplies of macro cells to define the corruption semantics. Thus, the primary supplies of macro cells should be properly connected for the correct functionality of Liberty cells. One of the main reasons for cell power-up failure is incorrect supply connections, so it’s necessary to verify the supply connections of each cell. Most EDA tools generate a connection report for all supply connections, and you should verify that the supply ports are properly connected by checking these reports.
Conclusion
Debugging of low-power designs is one of the toughest challenges currently facing the semiconductor industry. Being cognizant of the various challenges in debugging low-power designs enables low-power issues to be detected early or avoided entirely, thereby saving costly design cycles and significantly increasing the productivity of the debug process.