Bugs in RTL code are problematic, but a bug in an architectural specification can be catastrophic. If the bug remains undetected until post-silicon debugging, the design process essentially starts all over again.
Thus, it's crucial to move the verification process as far forward as possible. With that motivation in mind, engineers at Sun Microsystems recently applied formal verification to an application that commercial tools have not generally addressed—architectural-level protocol verification.
Sun's engineers modeled a complex cache-coherence protocol, with parallelism and multi-threading, for a high-performance processor. They also used a commercial formal property-checking tool from Jasper Design Automation to verify this protocol. Initially, Sun engineers used Murphi, an academic tool developed for protocol verification in the 1990s that includes a modeling language and a state model checker. Although specifically designed for architectural verification, Murphi lacked the capacity to thoroughly verify the protocol. It soon became apparent that for architectural verification, a commercial tool with much more powerful underlying formal engines would be needed.
As Murphi ran into capacity limitations, Sun and Jasper engineers worked together to convert the Murphi model to System-Verilog. They tuned the model for convergence in Jasper Design Automation's JasperGold, a process that reduced 129 state transition rules to 17 rules without any loss of coverage.
Since JasperGold is typically used for RTL verification, it took some work for Sun and Jasper engineers to build the SystemVerilog model and tune it for architectural proof convergence. The experience gained during this activity can be applied to other architectural verification needs.
Once the necessary tuning was accomplished, designers were able to verify the protocol, expose deep counter-examples, and highlight micro-architectural details that needed to be added to the architectural model. This process reduced the risk of an undetected architectural bug that could have called for an expensive redesign. It also made Sun engineers more confident about the protocol and the soundness of the added micro-architectural details.
TAKING A FORMAL APPROACH
Architectural verification is in the critical path because the consequences of an undetected architectural bug are so severe. RTL verification won't catch such a bug. It will simply show that the RTL code matches the architectural spec. An architectural specification bug may well remain undetected until post-silicon debug, causing time-to-market delays, redesign, and possibly a silicon respin.
Uncovering architectural problems early can prevent costly redesigns downstream. For example, it's not feasible to achieve sufficient coverage at the design level for cache-coherence protocol checking. That's because there's too much sequential depth. To verify a cache-coherence protocol, a tool must consider a range of traces that are both wide (in terms of starting and branching points) and deep (with long sequences of events). That's hard enough with an architectural model and infeasible with a designlevel model.
Formal verification can provide architects with an executable specification that can be queried. Thus, architects can build a model and query it in the form of property checks. For example, an architect can ask the model to show a trace that has triggered a worrisome mechanism. Architects can check for conditions such as livelock or starvation that would be difficult to find with simulation. If an architect or designer is unsure of something, such as when it's valid to check cache coherence, the answer can be found by running a property check.
Formal architectural verification also makes it possible to tune flow-control parameters, such as how many credits each agent should have, and thus optimize design throughput. A technology provided by Jasper lets architects and designers quickly generate complex input scenarios from simple queries.
Formal verification also is a good way to find and check corner cases, which are becoming more of a challenge for architectural verification. With complex protocols, it becomes extremely difficult to determine which corner cases are possible and how they might be manifested.
Protocols such as cache coherence are good candidates for formal verification because so many combinations of events could occur. It would be difficult, if not impossible, to check all possible combinations of inputs and outputs through simulation. Moreover, cache coherence is an especially important protocol to verify. A cache-coherence bug could result in a silent data corruption that would disrupt system functionality.
A formal tool for high-level architectural and protocol applications needs to be able to deal with challenges such as capacity, design style, and verification flow. In this example, what made Sun's experience successful was the collaboration between its engineers and Jasper's to adapt the tool to a new application.
HOW ARCHITECTURAL AND DESIGN MODELS DIFFER
Most commercial verification tools were developed for designlevel models. However, architectural models and design-level models are fundamentally different. Design-level models have much more overhead in terms of timing, area, and power. A specific example of design overhead is staging flip-flops. Designers will often add flops to long paths to meet timing. This adds a state that's not interesting from an architectural perspective. Another example of micro-architectural overhead is pipelining, which adds artificial sequential depth with additional combinations of events, each of which is indistinguishable at the architectural level of abstraction.
In addition to having less overhead, architectural models have much richer state spaces. Thus, a much higher percentage of state spaces of interest are reachable, potentially as high as 50% for an architectural model and as low as 5% to 10% for an RTL model. The low percentage for RTL models is due to the structural connectivity of the RTL, which limits the number of states that can be reached. In an architectural model, a rule or property may involve a large number of state variables, which calls for a more globalized analysis compared to an RTL tool tasked with evaluating the input and output of a logic gate.
At the design level, model checking has a large focus on trimming out irrelevant logic. The focus is on mitigating the design overhead and reducing the state space that needs to be explored. At the architectural level, these techniques tend to be unhelpful because of the negligible overhead. Instead, for the architectural verification, model checkers should focus on efficiently exploring a rich state space.