Sizing Up The Verification Problem

Some people claim that verification complexity scales with a square law based on the number of storage elements in the design and the number of inputs. For example, a design with one state element and one input can be in one of two states, and in either state it can be supplied with one of two possible stimulus values. This means the design can be verified exhaustively with four vectors.
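As a rough illustration of that counting argument, the sketch below treats every state bit and every input bit as free, so the exhaustive vector count is two raised to the total number of bits and doubles with each bit added. The function name and the example sizes are illustrative assumptions, not figures from any particular design.

```python
def exhaustive_vectors(state_bits: int, input_bits: int) -> int:
    """Count the vectors needed to apply every input value in every state."""
    return 2 ** (state_bits + input_bits)


# One state element and one input: 2 states x 2 stimulus values.
print(exhaustive_vectors(1, 1))  # 4

# Hypothetical small block: four 32-bit registers and a 16-bit input port.
print(exhaustive_vectors(4 * 32, 16))
```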

A modern system-on-chip (SoC) design includes thousands of registers, each of which may be 32 or 64 bits wide, and is likely to feature one or more processors. The SoC also likely has a substantial amount of memory, all of which can hold state information. The number of vectors required to exhaustively verify the design quickly becomes an astronomical number. Some pundits even say that the number of vectors required to verify an SoC is greater than the number of stars in the universe. This is poppycock!

The total number of vectors necessary can be substantially reduced in a number of areas. But even with all of these reductions, the number of vectors required remains large, and probably larger than can be run in the time and effort allotted to the verification process. This is partly why verification is seen as an art. It can never be completed with 100% confidence, though surprisingly few chips are total failures. On the other hand, most chips have some bugs in them. In most cases those bugs can be avoided or workarounds provided.

Impossible Combinations

With each of these categories, we can start small and work up. At the smallest level, consider bit fields within a register. Not all bits may be used, and those unused bits don’t need to be tested. Consider an asynchronous data transmission scheme similar to RS-232. In the old days, it required a minimum of two stop bits, so the values zero and one, while physically possible to write, were illegal.

A verification engineer might want to test what happens if the user does attempt to write these values to the control register, but this is an error test. Unless the required behavior is to stop working, the design is more likely specified to default to two stop bits. Once the design has been functionally tested with two stop bits, that behavior need not be retested when zero or one is specified. It is only necessary to check that the default value of two overwrites the erroneous value supplied by the user.
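The split between functional tests and error tests for the stop-bit example can be sketched as follows. The 2-bit field width and the function name are assumptions made for illustration; the text specifies only the minimum of two stop bits and the default-to-two behavior.

```python
MIN_STOP_BITS = 2  # minimum taken from the example in the text


def effective_stop_bits(requested: int) -> int:
    """Return the stop-bit count this hypothetical device would actually use."""
    return requested if requested >= MIN_STOP_BITS else MIN_STOP_BITS


# Assumed 2-bit control field, so the writable values are 0..3.
field_values = range(4)

# Distinct behaviors that need a full functional test.
functional_cases = sorted({effective_stop_bits(v) for v in field_values})

# Illegal writes that only need the "default was applied" check.
error_cases = [v for v in field_values if v < MIN_STOP_BITS]

print(functional_cases)  # [2, 3]
print(error_cases)       # [0, 1]
```

Under these assumptions, four writable values collapse to two functional cases plus two lightweight error checks.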

At the subsystem or system level, these impossible combinations may be defined in the specification. For example, while a system may have functions A, B, C, and D, it may be defined that only two of them can be performed at a time. This reduces items that need to be verified, and it’s why constraints are written so constrained-random pattern generation won’t attempt to verify conditions that cannot happen. Functional coverage also needs to account for these kinds of impossible combinations. This can be difficult when more complex relationships exist between them, such as a shared resource that may make B impossible if and only if C is already running.
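A minimal sketch of how such a rule shrinks the space: it enumerates which of the functions A, B, C, and D may be active together when at most two can run at a time. The optional exclusion of B together with C is a simplified, static stand-in for the shared-resource case, which in a real system depends on ordering. All names here are hypothetical.

```python
from itertools import combinations

FUNCTIONS = ("A", "B", "C", "D")
MAX_CONCURRENT = 2  # "only two of them can be performed at a time"


def legal_combinations(excluded_pairs=frozenset()):
    """Yield every set of functions that is allowed to run concurrently."""
    for size in range(MAX_CONCURRENT + 1):
        for combo in combinations(FUNCTIONS, size):
            active = frozenset(combo)
            # Skip any combination containing a forbidden pair.
            if any(pair <= active for pair in excluded_pairs):
                continue
            yield active


all_subsets = 2 ** len(FUNCTIONS)
at_most_two = list(legal_combinations())
no_b_with_c = list(legal_combinations({frozenset({"B", "C"})}))

print(all_subsets, len(at_most_two), len(no_b_with_c))  # 16 11 10
```

The same predicate could serve both as a generation constraint and as a filter on functional coverage bins, so that impossible combinations are neither generated nor counted as coverage holes.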

Independent States

Consider two subsystems, A and B, where subsystem A feeds data into subsystem B. Subsystem B may have options related to what it does with the data fed into it and how it does it. If none of those options change the way in which B receives data, then all of those states are independent of subsystem A.

In some cases, B may become available for new data sooner or later based on its configuration settings, and this can change the overall timing of the system. These situations create a control dependency between the two subsystems that needs to be verified. In many cases, it can be reduced to a range of possible timing differences that does not depend on the options themselves.

Similarly, any option in subsystem A that does not modify the data fed from A to B has no impact on subsystem B. B depends on the data being fed from A, but that is a data dependence, not a control dependence. Again, timing differences may need to be accounted for, but these variations can be verified at the block level and don’t need to be verified again at the system level. In some cases, especially within an electronic system level (ESL) flow, timing might not have been defined yet.
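The saving from control independence can be sketched with simple arithmetic: if the options of A and B interact, every pairing has to be considered, whereas independent options can be verified separately. The option counts below are hypothetical, and the sketch ignores the timing ranges mentioned above.

```python
def coupled_cases(options_a: int, options_b: int) -> int:
    """Cases when every option of A must be crossed with every option of B."""
    return options_a * options_b


def independent_cases(options_a: int, options_b: int) -> int:
    """Cases when each subsystem's options can be verified on their own."""
    return options_a + options_b


# Hypothetical counts: 8 configurations of A, 12 configurations of B.
print(coupled_cases(8, 12))      # 96
print(independent_cases(8, 12))  # 20
```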

Many subsystems within a typical SoC are control-independent from each other. Consider cases where memory is used to transfer data between two subsystems. One subsystem has no idea where the data came from or who is going to use the data it produces. Thus, there can be no dependence. This observation is important when defining verification strategies, and it brings us nicely to the subject of hierarchy.

Hierarchies

In the previous example, while there are things that have to be verified at the module or subsystem level, it may be necessary to consider only the external effects of a block when conducting verification at the next level of the system hierarchy. In addition, when a block is reused, it is unlikely that all of the capabilities of that block will be used within the system.

Those states that cannot be reached at the system level should not be considered. This does not mean that they should not be verified at the block level, because any and all capabilities in the design must be verified at some point.

Consider another example. Two functional blocks communicate with each other using a transport mechanism, such as a bus protocol. If that transport mechanism is working correctly, it will not change system behavior. If that is ascertained independently, then the entire transport mechanism can be removed from consideration at the system level.

The number of significant states in the system goes down as the user migrates up in the hierarchy, even though each of those states becomes bigger and more important. At the highest possible level, the only states that really matter are those that make up the functionality specified for the system in the requirements document.

This also defines one of the problems with a bottom-up verification strategy as employed by most companies. At no point in the definition of a testbench is it possible to define internal and external variations and which ones are important when verification is to be performed at the system level.

In a strategy following the Universal Verification Methodology (UVM), virtual sequences control the lower-level virtual sequences and the leaf sequences. These same sequences are used for verification at the lower level and, in effect, carry up all of the possible ways in which that block can be operated without regard for dependencies. This makes system verification a larger burden than it needs to be, and it is one of the reasons why many companies still employ directed testing for system verification.

Implications For Top-Down Verification

A top-down methodology that defines the ways that blocks depend on each other and the services that they can perform is required. With this defined, it becomes possible to extract a verification goal from the requirements document and to ask how many ways that objective can be reached with the blocks as defined.

Randomization can be employed to select the path through the design that will meet that verification goal. Or, the user can manually select the path. One methodology in which this can be done uses a graph-based approach to verification that is constructed incrementally. A minimum set of services initially can be defined for the blocks that may lead to a few ways in which a verification goal can be met. As the definition of those blocks is extended, or new blocks added to the system, additional ways become possible and the user can decide which, or how many, of those ways need to be verified.
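The idea of enumerating the ways to reach a goal, and then choosing among them, can be sketched with a small directed graph. This is only an illustration of the general technique, not a description of any particular tool. The service names and graph shape are invented, and the graph is assumed to be acyclic.

```python
import random

# Hypothetical service graph: each key is a step, each value lists the steps
# that may follow it. "goal" stands for the verification goal.
SERVICE_GRAPH = {
    "start": ["dma_load", "cpu_load"],
    "dma_load": ["encrypt", "compress"],
    "cpu_load": ["encrypt"],
    "encrypt": ["goal"],
    "compress": ["goal"],
    "goal": [],
}


def paths_to_goal(graph, node="start", goal="goal", prefix=()):
    """Enumerate every path from node to goal in an acyclic graph."""
    path = prefix + (node,)
    if node == goal:
        return [path]
    found = []
    for successor in graph[node]:
        found.extend(paths_to_goal(graph, successor, goal, path))
    return found


paths = paths_to_goal(SERVICE_GRAPH)
print(len(paths))  # 3

# Randomized selection; seeded so the chosen path can be reproduced.
rng = random.Random(0)
chosen = rng.choice(paths)
print(" -> ".join(chosen))
```

Extending a block's services, for example letting `cpu_load` also feed `compress`, adds a fourth path without changing the existing three. That mirrors the incremental construction described above.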

A principal difference in this methodology is that it does not depend on the testbench being complete before verification can start. It also ensures that the most important aspects of a system are verified first. In many cases, system verification could be performed even before the blocks have been implemented, and it could act to ensure that the specifications for those blocks are correct.

There may be many more ways in which the total verification space can be reduced, and there are certainly priorities in terms of the things that should be verified first. I welcome comments, feedback, and other ways to reduce the state space further.

Adnan Hamid is cofounder and CEO of Breker Verification Systems. Prior to starting Breker in 2003, he worked at AMD as department manager of the System Logic Division. Previously, he served as a member of the consulting staff at AMD and Cadence Design Systems. He graduated from Princeton University with BS degrees in electrical engineering and computer science and holds an MBA from the McCombs School of Business at the University of Texas.