Historically, wireload models have been inadequate for accurate modeling of wire delays. Furthermore, the inaccuracy worsens with each new process generation. Logic designers see one timing representation of their design, and physical designers see something entirely different. This discontinuity impacts the success of the project in several ways.
For instance, if the wireload models understate actual physical wire delays, then logic designers will see a more optimistic timing representation than what appears in physical design. This typically leads to long iterations between the logical and physical design teams, as each team sees a different timing representation and makes optimizations based on disparate assumptions.
Alternatively, the logic design team might build in timing margin by using wireload models that overstate real physical wire delay—a pessimistic timing representation—in which case the finished design will use larger and higher-power cells, even for non-timing-critical logic. This means a chip that will either be larger or more congested than it needs to be, and that will consume more power than it should. This tradeoff is no longer acceptable in today’s market, where much growth comes from high-volume, low-margin consumer products for which power dissipation is a key concern.
The bottom line is that it’s now essential to develop a design methodology that more effectively captures physical effects during logic design and implementation.
The Current Scenario
Today’s complex nanometer designs embody a combination of these effects, offering, in some cases, the worst of both worlds: wireload models that are pessimistic overall yet still underestimate some of the long routes. As a result, these long routes remain underdriven, causing timing closure failures and multiple iterations between the logical and physical design teams. Meanwhile, the remainder of the logic may be overdriven, consuming too much area and power.
The underlying problem is that the logic design team creates and hands off gates without any real insight into physical wire effects. The true measure of a design’s integrity is “quality of silicon” (QoS), which is timing, area, and power measured with wires. However, this measurement cannot happen until some physical implementation is complete. Until then, there’s a gap between what the logic design team creates and what the physical design team sees (Fig. 1).
Physical synthesis was developed to address this problem by using placement and some form of routing to achieve more accurate wire delay information. However, physical synthesis is also bogged down by all the detailed data associated with it, and is constrained by the initial netlist and placement. The result suffers from low capacity and long runtimes; it’s also limited to local incremental optimization capability. This works well in physical design, where the detailed accuracy is necessary—physical synthesis provides more optimization power than was previously available here. But in logic design, all these details are not yet needed—it’s only important that you have enough accuracy to know whether you’re moving uphill or downhill on the cost function.
In addition, design-for-test (DFT) logic can affect, and be affected by, physical implementation. Scan-chain stitching that ignores register placement can result in both setup and hold timing violations in physical design, not to mention excessive routing overhead. And DFT logic such as compression, BIST, and boundary scan needs to be placed near I/Os and macros, often causing congestion and blockages that add to the timing closure challenge.
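To make the routing-overhead point concrete, here is a minimal sketch in Python that compares a scan chain stitched in netlist order with one reordered by a simple greedy nearest-neighbor pass over register placement. The register names, coordinates, and the greedy heuristic are all assumptions chosen for illustration. Production DFT tools use more sophisticated ordering and also account for hold timing, which this sketch ignores.

```python
# Illustrative only: register names and (x, y) placements are hypothetical.

def manhattan(a, b):
    """Manhattan distance between two (x, y) points, in microns."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def chain_length(order, placement):
    """Total point-to-point wire length of a scan chain in the given order."""
    return sum(
        manhattan(placement[a], placement[b])
        for a, b in zip(order, order[1:])
    )

def greedy_reorder(order, placement):
    """Keep the first register, then repeatedly hop to the nearest unvisited one."""
    chain = [order[0]]
    remaining = list(order[1:])
    while remaining:
        last = placement[chain[-1]]
        nearest = min(remaining, key=lambda r: manhattan(last, placement[r]))
        remaining.remove(nearest)
        chain.append(nearest)
    return chain

# Hypothetical placement: two clusters of registers far apart on the die.
placement = {
    "r0": (0, 0),
    "r1": (100, 100),
    "r2": (5, 0),
    "r3": (100, 95),
    "r4": (10, 5),
}
netlist_order = ["r0", "r1", "r2", "r3", "r4"]  # placement-ignorant stitching
placed_order = greedy_reorder(netlist_order, placement)

print("netlist order:", netlist_order, chain_length(netlist_order, placement))
print("placed order: ", placed_order, chain_length(placed_order, placement))
```

In this made-up case the placement-ignorant chain zigzags between the two clusters, while the reordered chain crosses the gap only once.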
As a result, design teams need to do the best job they can at modeling physical effects at every level of the design process without sacrificing optimization capability. For logic designers, wireload models enable full synthesis optimization capability, but are inadequate for modeling timing. Physical synthesis provides more accuracy than is needed at the logic level, and the detail it carries limits the optimization capability it can offer. This is the source of the disconnect between logical and physical design. With physical wire effects dominating the results, how can physical timing accuracy be improved while still providing the optimization capabilities that logic designers require?
Improve Physical Modeling
Physical synthesis tools may claim to synthesize RTL to “placed gates,” implying placement-aware RTL synthesis. However, placement cannot be performed until gates are available, and gates are not available until RTL-to-gates synthesis is performed. What do engineers use to model wire delay during the RTL-to-gates step? Either inaccurate wireload models or, even worse, zero wireloads. Physical synthesis tools also claim that RTL-to-gates synthesis is not important to QoS. However, newer global synthesis approaches have shown that RTL-to-gates synthesis is key to improving chip frequency, area, and power. So how can design tools provide a real-enough physical timing representation to the RTL-to-gates synthesis process?
Full Synthesis Optimization
Physical synthesis greatly increases the optimization capability of physical design. From the logic designer’s viewpoint, however, physical synthesis does less optimization than logic synthesis because it starts off constrained by a placed implementation. During logic design, optimization decisions are made at a higher level: what kind of adder architecture to use, whether to employ resource sharing, and so on. These decisions cannot be made once the design is at the netlist or placement level. It would help to have some level of physical timing reality as input, but this level of optimization does not require the type of detail associated with placement and routing.
Don’t Force The Physical
Physical synthesis tools require too much physical knowledge for most logic designers to effectively run them and analyze their results. Yet physical synthesis of a netlist, constraints, and a physical library without a floorplan or other physical parameters can produce less-than-optimal results. SoC designs today have hundreds (if not thousands) of macros, use multiple power supplies and complex I/O schemes, and present many other challenges that need to be considered during implementation. And if the design does not meet timing after running physical synthesis, then what? All analysis in physical synthesis is gate-level or physical. The ideal solution is to provide realistic physical timing information while maintaining a use model in which logic designers can still be productive.
While there is not a single magic bullet that can address all of these issues, there are new methodologies that can be employed at each step of the design process to bring physical timing into the logic design world in an effective manner.
RTL-to-gates synthesis
This is the most important implementation step in terms of meeting your design goals for timing, area, and power. The point at which the global logic structure is created is the starting point for all incremental optimizations that will follow. Until gates are created, there is no way of knowing actual placement and wire delay. Even so, several modeling techniques represent significant improvements over conventional static, fanout-based wireload models.
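For readers who have not looked inside one, the following Python sketch shows roughly how a static, fanout-based wireload model estimates wire length, and how that estimate can diverge from a simple placement-based one (half-perimeter wirelength) on a long route. The table entries, per-micron resistance and capacitance, pin coordinates, and load are all hypothetical values chosen for illustration, and the lumped RC formula is a standard first-order approximation rather than what any particular tool uses.

```python
# Illustrative only: every numeric value below is an assumption.

WIRELOAD_TABLE = {1: 10.0, 2: 22.0, 3: 35.0, 4: 50.0}  # fanout -> length (um)
SLOPE_UM_PER_FANOUT = 15.0  # extrapolation slope beyond the table
RES_PER_UM = 0.5            # wire resistance, ohm per um
CAP_PER_UM = 0.2            # wire capacitance, fF per um

def wireload_length(fanout):
    """Estimated wire length from fanout alone; placement is ignored."""
    if fanout in WIRELOAD_TABLE:
        return WIRELOAD_TABLE[fanout]
    max_fanout = max(WIRELOAD_TABLE)
    extra = fanout - max_fanout
    return WIRELOAD_TABLE[max_fanout] + SLOPE_UM_PER_FANOUT * extra

def hpwl_length(pins):
    """Half-perimeter wirelength of the bounding box around all pins (um)."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def lumped_rc_delay_ps(length_um, load_cap_ff):
    """First-order wire delay: R_wire * (C_wire / 2 + C_load), in ps."""
    r_wire = RES_PER_UM * length_um   # ohm
    c_wire = CAP_PER_UM * length_um   # fF
    # ohm * fF = 1e-15 s, i.e. 1e-3 ps
    return r_wire * (c_wire / 2 + load_cap_ff) * 1e-3

# A hypothetical fanout-of-2 net whose sinks land far from the driver.
pins = [(0, 0), (800, 100), (50, 600)]  # driver first, then two sinks (um)
load_ff = 4.0

wlm_len = wireload_length(fanout=2)
placed_len = hpwl_length(pins)

print("wireload estimate:", wlm_len, "um,", lumped_rc_delay_ps(wlm_len, load_ff), "ps")
print("placement estimate:", placed_len, "um,", lumped_rc_delay_ps(placed_len, load_ff), "ps")
```

Because the wireload estimate depends only on fanout, every fanout-of-2 net in the block gets the same length regardless of where its pins end up, which is why long routes are the ones most likely to be underestimated.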