We’ve all been there. There’s one more task we want our portable device to perform, and it powers down just as we’re almost done. Darn it. And even for wall-outlet-powered devices, the effects of heat are significant. I just disassembled one of my older laptops, and the dust packed around the fan had blocked air circulation so badly that the machine went into heat-protection mode, throttling the main CPU and making it unbearably slow.
Power has long been a topic of much debate. For certain applications, power has become the number one decision criterion, but often it still isn’t considered more important than performance and area. Several startups, some focused on low-power high-level synthesis, others on optimizing software for lower power, have come and gone. The industry has, however, folded power into the term PPA, short for performance, power, and area, to describe its design optimization criteria.
Decisions, Decisions…
When considering power in their decisions, as they would for any component of architecture analysis, designers face a three-way tradeoff: the accuracy required to make proper decisions, the point in the design flow at which that accuracy becomes available, and the ability to create and execute enough test cases to represent all design scenarios in question. Unfortunately, these three criteria work against each other.
With the increased complexity of designs, the impact of even small architecture decisions is hard to quantify, so ideally full accuracy would be available at high execution speed and before implementation is complete. The accuracy-versus-availability predicament can be partly worked around by providing architecture models for reused components such as interconnect fabrics. For newly developed blocks, however, early estimates have to suffice initially.
Execution speed, which is needed to run as many scenarios as possible, is equally hard to reconcile with both time of availability and accuracy. For new parts of the design, transaction-level models (TLMs) become available first, given that they require less development effort than full register-transfer-level (RTL) code.
Given their higher level of abstraction, they also execute faster. They are, however, less accurate. The effort to annotate meaningful approximate timing and power information is still much lower than that of full RTL development and verification, but it is already significant. The most critical issue becomes how to get the right power information.
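As a concrete illustration, the sketch below shows how approximate energy and latency numbers might be attached to a transaction-level model in plain C++. The class names, the per-operation estimates, and the accounting scheme are all hypothetical placeholders rather than any specific TLM library; the point is simply that annotation can be a thin layer on top of the functional model.

```cpp
#include <cstdint>
#include <iostream>
#include <map>
#include <string>

// Hypothetical annotation layer: each transaction type carries an
// approximate energy cost (picojoules) and latency (nanoseconds).
struct OpEstimate {
    double energy_pj;
    double latency_ns;
};

class AnnotatedMemoryTLM {
public:
    AnnotatedMemoryTLM() {
        // Early, pre-RTL estimates; refined later from synthesis/RTL runs.
        estimates_["read"]  = {12.0, 8.0};
        estimates_["write"] = {15.0, 10.0};
    }

    uint32_t read(uint32_t addr) {
        account("read");
        return storage_[addr];          // functional behavior is unchanged
    }

    void write(uint32_t addr, uint32_t data) {
        account("write");
        storage_[addr] = data;
    }

    void report() const {
        std::cout << "total energy: " << total_energy_pj_ << " pJ, "
                  << "modeled time: " << total_time_ns_ << " ns\n";
    }

private:
    void account(const std::string& op) {
        const OpEstimate& e = estimates_.at(op);
        total_energy_pj_ += e.energy_pj;   // accumulate approximate energy
        total_time_ns_   += e.latency_ns;  // accumulate approximate timing
    }

    std::map<std::string, OpEstimate> estimates_;
    std::map<uint32_t, uint32_t> storage_;
    double total_energy_pj_ = 0.0;
    double total_time_ns_   = 0.0;
};

int main() {
    AnnotatedMemoryTLM mem;
    for (uint32_t i = 0; i < 1000; ++i) mem.write(i, i * 3);
    for (uint32_t i = 0; i < 1000; ++i) (void)mem.read(i);
    mem.report();
}
```

The estimates table is exactly the piece that later gets replaced with refined numbers once RTL-based power data is available.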
Today, designers have three sources of power information. First, measurements from the last design (the previous silicon) can be used. Second, early power assumptions and estimates can be applied as annotations to TLMs and executed. Third, actual measurements from the RTL as it undergoes implementation and verification, combined with power estimates from logic synthesis, provide refined, more accurate power information for the design under development.
Once the actual RTL becomes available, various engines, from simulation to emulation/acceleration to FPGA-based prototyping, provide different speeds and times of availability during a project. Bottom line: given the various choices across accuracy, time of availability, and execution speed, refining decisions as quickly as possible becomes most important. In the absence of a one-size-fits-all solution, the time until more accurate power information can be annotated back into TLMs, to confirm assumptions or identify them as wrong, becomes crucial.
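A minimal sketch of what “confirming or refuting assumptions” could look like in practice: early per-block estimates are compared against numbers derived from RTL runs, and any deviation beyond a tolerance is flagged so the corresponding TLM annotation can be updated. The block names, power values, and tolerance are illustrative assumptions, not data from any real project.

```cpp
#include <cmath>
#include <iostream>
#include <map>
#include <string>

int main() {
    // Early architectural estimates (mW), as annotated into the TLMs.
    std::map<std::string, double> estimate_mw = {
        {"cpu_cluster", 250.0}, {"interconnect", 40.0}, {"ddr_ctrl", 90.0}};

    // Refined numbers derived later from RTL simulation/emulation plus
    // synthesis-based power estimation (values are made up).
    std::map<std::string, double> rtl_mw = {
        {"cpu_cluster", 310.0}, {"interconnect", 38.0}, {"ddr_ctrl", 120.0}};

    const double tolerance = 0.15;  // flag deviations above 15 percent

    for (const auto& [block, est] : estimate_mw) {
        double measured = rtl_mw.at(block);
        double rel_err  = std::fabs(measured - est) / est;
        if (rel_err > tolerance)
            std::cout << block << ": estimate " << est << " mW vs RTL "
                      << measured << " mW -> update TLM annotation\n";
        else
            std::cout << block << ": estimate confirmed within tolerance\n";
    }
}
```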
The designer’s dream solution for low power would combine all three—accuracy, early time of availability, and execution speed—and offer productivity, predictability, and flexibility to compromise between risk and cost.
For productivity, a high-performance, automated approach to system-level analysis would enable designers to run real application tests and real hardware/software “what-if” scenarios, allowing adjustments to the design during the project without negative consequences for the overall schedule.
Better predictability would mean being able to measure power throughout the various design phases, including measuring individual intellectual property (IP) blocks and competing architectural options, and knowing when application tests can be run.
With respect to risk and cost compromises, the solution would provide adequate accuracy of average and peak power at various design phases to make sure the design stays within the allocated budget, allowing design teams to run realistic scenarios and make well-informed decisions regarding power management strategies, IP selection, and packaging and cooling technologies.
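To make the average-versus-peak-power point concrete, the sketch below reduces a power-over-time trace to average and peak values and checks them against an allocated budget. The trace samples and budget figures are made-up assumptions purely for illustration.

```cpp
#include <algorithm>
#include <iostream>
#include <numeric>
#include <vector>

int main() {
    // Hypothetical power trace in milliwatts, one sample per time step.
    std::vector<double> trace_mw = {180, 220, 600, 590, 210, 190, 650, 230};

    const double avg_budget_mw  = 300.0;  // assumed average-power budget
    const double peak_budget_mw = 620.0;  // assumed peak-power budget

    double avg  = std::accumulate(trace_mw.begin(), trace_mw.end(), 0.0) /
                  trace_mw.size();
    double peak = *std::max_element(trace_mw.begin(), trace_mw.end());

    std::cout << "average: " << avg << " mW (budget " << avg_budget_mw
              << "), peak: " << peak << " mW (budget " << peak_budget_mw
              << ")\n";
    if (avg > avg_budget_mw || peak > peak_budget_mw)
        std::cout << "budget exceeded: revisit power management, IP choice, "
                     "or packaging/cooling assumptions\n";
}
```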
What’s Next
Are we there yet? Well, we’re getting closer. For a long time, RTL simulation was the only way to obtain the switching activity needed to feed power estimation in logic synthesis and then annotate power information back into TLMs, but it was plagued by low execution speed, severely limiting the number of design scenarios that could be run for proper power estimation.
More recently, emulation/acceleration has been enhanced with dynamic power analysis to address the speed shortcoming of RTL simulation. It effectively allows top-down and bottom-up design to be merged, identifying and analyzing peak and average power at the system and IP/subsystem levels for both RTL and gate-level representations.
The analysis of power profiles for hardware and software helps designers make system-level tradeoffs. It can also occur early enough in the design flow to balance performance and power consumption by changing the microarchitecture, hardware implementation, power budget, and/or application software.
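For readers who want a feel for what such a power profile might look like, here is a small sketch that converts per-window toggle counts, as they might be collected during an emulation run of a software workload, into a dynamic-power profile using the usual ½·C·V² energy-per-transition relationship. The toggle counts, effective capacitance, supply voltage, and window length are all illustrative assumptions; a real flow derives these from library data and synthesis results.

```cpp
#include <cstddef>
#include <iostream>
#include <vector>

int main() {
    // Hypothetical per-window toggle counts from emulating a software
    // workload (one window = 1 ms of execution).
    std::vector<double> toggles_per_window = {1.2e11, 4.8e11, 5.0e11,
                                              0.9e11, 4.6e11};

    // Illustrative technology assumptions.
    const double c_eff_farads = 2.0e-15;  // effective capacitance per toggle
    const double vdd_volts    = 0.8;
    const double window_s     = 1.0e-3;

    // Dynamic energy per transition ~ 1/2 * C * Vdd^2; power = energy / time.
    for (std::size_t i = 0; i < toggles_per_window.size(); ++i) {
        double energy_j = toggles_per_window[i] * 0.5 * c_eff_farads *
                          vdd_volts * vdd_volts;
        double power_w  = energy_j / window_s;
        std::cout << "window " << i << ": " << power_w * 1e3 << " mW\n";
    }
}
```

A profile like this, plotted per software phase, is what lets hardware and software teams see which parts of an application drive the peaks.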
Also, with the support of power formats used to capture power design intent, such as CPF or UPF, power shutoff verification at the system level becomes possible. This enables IP reuse and portability without intrusive instrumentation and leverages the debugging capabilities of emulation/acceleration systems for low-power verification.
Eventually we will get close to a seamless environment from TLM to RTL, allowing early power estimates to be confirmed automatically once RTL becomes available and can be executed in simulation, emulation/acceleration, and FPGA-based prototyping. Analyzing and confirming power consumption dynamically using emulation/acceleration is an important step toward shortening the turnaround time for annotating power information back into TLMs and toward validating architectural decisions early enough that any necessary changes can still be made in the current design.