Historically, testability has been an afterthought in the design process. But the growing complexity of chip designs, and especially SoCs, is forcing testability (and manufacturability) to take a more central position. It's no longer enough for test engineers to insert DFT scan chains into a flattened chip layout after synthesis. Now, embedded test structures are being thought of as the first line of defense for failure analysis and yield improvement.
A shift is afoot in which EDA tools will eventually pass more information forward to test engineers. Conversely, and perhaps more importantly, information gained from failure and yield analysis will wend its way back to designers. In the process, designers stand to learn more about how to make their products more testable from the outset. In the past, designers considered design-for-test (DFT) to be a non-value-added element. But that perception is turning around rapidly, with DFT gaining much more attention from front-end designers, even at the RTL stage.
DFT techniques have been applied to chips as a mechanism for weeding out yield casualties. It's understood that there will always be yield fallout in any silicon fabrication process. It's also understood that yields will continue to decrease as design rules descend further into the nanometer realm.
Furthermore, in the SoC world, designers must incorporate intellectual-property (IP) blocks from many sources, some internal and some external. Designers may not know very much about what's going on inside a given IP block other than a functional specification. Thus, they're forced to rely on DFT to bring test access to that block out to other parts of the overall chip design and, ultimately, to the outside world.
The traditional way to apply DFT is to add scan structures to facilitate testing via automatic test-pattern generation (ATPG). This is accomplished by connecting all of the design's registers in serial fashion, allowing test engineers to shift data in and out through a few ports at the chip level (Fig. 1). For test purposes, that provides access not only to the pins of the device, but also to the internal registers. Scan has been the mainstream technology for manufacturing test over the last 10 years, and it probably won't relinquish that status anytime soon.
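To make the shift-in, capture, shift-out sequence concrete, here is a minimal Python sketch of a scan chain. It is a toy model only: the `ScanChain` class, the four-register length, and the `toy_logic` function are hypothetical stand-ins chosen for illustration, not anything taken from a real design or tool.

```python
class ScanChain:
    """Toy model of a design's registers stitched into one serial shift path."""

    def __init__(self, length):
        # regs[0] sits next to the scan-in port, regs[-1] drives scan-out.
        self.regs = [0] * length

    def shift(self, scan_in_bit):
        """One scan clock: emit the scan-out bit, move every bit one stage along."""
        scan_out_bit = self.regs[-1]
        self.regs = [scan_in_bit] + self.regs[:-1]
        return scan_out_bit

    def load(self, bits):
        """Shift a pattern in; return the bits that fall out of scan-out meanwhile."""
        return [self.shift(b) for b in bits]

    def capture(self, logic):
        """One functional clock: registers capture the combinational response."""
        self.regs = logic(self.regs)


def toy_logic(state):
    # Hypothetical combinational logic between the registers: each register
    # captures the XOR of itself and its left neighbour (wrapping around).
    return [state[i] ^ state[i - 1] for i in range(len(state))]


chain = ScanChain(4)
chain.load([1, 0, 1, 1])              # shift the stimulus into the registers
chain.capture(toy_logic)              # pulse the functional clock once
response = chain.load([0, 0, 0, 0])   # shift the response out for comparison
```

The point of the sketch is that a single scan-in/scan-out pair is enough to control and observe every register. In practice the next stimulus would be shifted in while the previous response is shifted out.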
However, as multi-megagate designs are migrated down into 130- and 90-nm silicon processes, the effectiveness of scan-based DFT techniques is diminishing (Fig. 2). Some of the lessened effectiveness can be attributed to failures in following DFT rules, but much of it is traceable to the shift to smaller geometries. Traditionally, the metric for determining the effectiveness of tests is the stuck-at fault model. A stuck-at fault is a hard, static failure in which any node is stuck at zero or one. Typically, ATPG techniques are used to create static test vectors to detect these faults.
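As a rough illustration of how stuck-at fault coverage is measured, the following Python sketch injects every single stuck-at fault into a tiny circuit and checks which ones a set of static vectors detects. The three-input circuit, the node names, and the vector sets are all hypothetical. Real ATPG and fault-simulation tools are far more sophisticated.

```python
NODES = ["a", "b", "c", "n1", "y"]

# Every node can be stuck at 0 or stuck at 1: ten single faults in total.
ALL_FAULTS = [(node, value) for node in NODES for value in (0, 1)]


def simulate(a, b, c, fault=None):
    """Evaluate y = (a AND b) OR c, optionally forcing one node to a stuck value."""

    def force(name, value):
        if fault is not None and fault[0] == name:
            return fault[1]
        return value

    a = force("a", a)
    b = force("b", b)
    c = force("c", c)
    n1 = force("n1", a & b)
    return force("y", n1 | c)


def fault_coverage(vectors):
    """Return (fraction of faults detected, set of detected faults)."""
    detected = set()
    for fault in ALL_FAULTS:
        for vec in vectors:
            # A fault is detected when the faulty output differs from the good one.
            if simulate(*vec) != simulate(*vec, fault=fault):
                detected.add(fault)
                break
    return len(detected) / len(ALL_FAULTS), detected


one_vector, _ = fault_coverage([(1, 1, 0)])
four_vectors, _ = fault_coverage([(1, 1, 0), (0, 1, 0), (1, 0, 0), (0, 0, 1)])
```

In this toy case a single vector detects only four of the ten faults, while the four-vector set detects all of them. That is the sense in which coverage is a property of the pattern set rather than of the circuit alone.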
For memories, built-in self-test (BIST) techniques are the preferred method. BIST adds test circuitry to the design and applies the memory test from these circuits on-chip.
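The text doesn't say which algorithm a memory BIST controller runs. As one assumed example, the sketch below models the well-known MATS+ march sequence (write 0 everywhere; ascending read 0, write 1; descending read 1, write 0) in Python. The `FaultyMemory` class and its stuck-cell fault model are hypothetical, and an on-chip BIST engine would implement the equivalent in hardware rather than software.

```python
class FaultyMemory:
    """Toy RAM model in which selected cells are stuck at a fixed value."""

    def __init__(self, size, stuck=None):
        self.size = size
        self.cells = [0] * size
        self.stuck = stuck or {}  # address -> stuck value (hypothetical fault model)

    def write(self, addr, value):
        # A stuck cell ignores whatever is written to it.
        self.cells[addr] = self.stuck.get(addr, value)

    def read(self, addr):
        return self.cells[addr]


def mats_plus(mem):
    """Run the MATS+ march sequence; return the sorted failing addresses."""
    failures = set()
    # March element 1: write 0 to every cell (either address order).
    for addr in range(mem.size):
        mem.write(addr, 0)
    # March element 2: ascending addresses, read 0 then write 1.
    for addr in range(mem.size):
        if mem.read(addr) != 0:
            failures.add(addr)
        mem.write(addr, 1)
    # March element 3: descending addresses, read 1 then write 0.
    for addr in reversed(range(mem.size)):
        if mem.read(addr) != 1:
            failures.add(addr)
        mem.write(addr, 0)
    return sorted(failures)


good = mats_plus(FaultyMemory(8))
bad = mats_plus(FaultyMemory(8, stuck={3: 1, 5: 0}))
```

Because the address sequencing and expected values are so regular, a small on-chip state machine can generate them, which is part of why BIST suits memories well.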
Historically, ASIC suppliers have preferred to see fault coverages of greater than 90%. However, for designs with 1 million gates fabricated on a 0.25-µm process, 90% coverage would produce a defect rate of 0.28%, or 2800 ppm. This is unacceptable considering that these defects won't be found until devices are assembled on pc boards. Even when all DFT rules are followed, it's very unusual for the fault coverage of a conventional ASIC design to exceed 98%. If DFT rules aren't followed, a project will suffer the consequences in device fallout after board assembly.
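The link between fault coverage and shipped defects can be explored with a simple model. The Python sketch below uses the Williams-Brown relation, DL = 1 - Y^(1 - T), where Y is process yield and T is fault coverage. This is an assumption: the text doesn't say which model or yield figure lies behind its 0.28% number, and the 0.90 yield used here is purely illustrative.

```python
def defect_level(process_yield, fault_coverage):
    """Williams-Brown estimate of the fraction of shipped parts that are defective.

    process_yield and fault_coverage are both fractions between 0 and 1.
    """
    return 1.0 - process_yield ** (1.0 - fault_coverage)


def to_ppm(fraction):
    """Convert a fraction to parts per million (0.28% = 0.0028 -> 2800 ppm)."""
    return fraction * 1_000_000


# Illustrative sweep with a hypothetical 90% process yield.
for coverage in (0.90, 0.95, 0.98, 0.99):
    dl = defect_level(0.90, coverage)
    print(f"coverage {coverage:.0%}: about {to_ppm(dl):.0f} ppm shipped defective")
```

Whatever yield is plugged in, the model shows the same trend: the shipped defect level falls only as coverage approaches 100%, so the last few percentage points of coverage matter most.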
The real problem is that at nanometer geometries, other types of faults begin to appear. Primarily, these are speed-related failures. Speed-related (or, as they're sometimes called, at-speed) failures aren't static; they're better characterized as resistive failures. In this case, resistive nodes or bridges cause given nodes to be slow to rise or fall.
For example, resistive vias can cause errors in high-speed circuits but still pass stuck-at testing. Transition fault models test whether the circuit is transitioning properly or not, while transition delay fault models determine if the delay between two logic values is acceptable. Path delay fault models test for delays along a predetermined path caused by resistive-capacitive coupling with other paths. None of these failures will reveal themselves through purely static stuck-at testing.
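A small timing sketch shows why a slow node escapes static testing. The Python model below is deliberately crude and entirely hypothetical: a node takes `delay_ns` to switch, a static test waits indefinitely before sampling, and an at-speed test launches a transition and captures it one clock period later.

```python
def captured_value(old, new, delay_ns, capture_ns):
    """Value sampled at capture_ns for a node that switches old -> new after delay_ns."""
    return new if capture_ns >= delay_ns else old


def passes_static(delay_ns):
    # Static (stuck-at style) test: the node has unlimited time to settle,
    # so both the rising and the falling transition are always seen as complete.
    return all(
        captured_value(1 - target, target, delay_ns, float("inf")) == target
        for target in (0, 1)
    )


def passes_at_speed(delay_ns, period_ns):
    # Two-pattern test: launch a transition, capture one clock period later.
    return all(
        captured_value(1 - target, target, delay_ns, period_ns) == target
        for target in (0, 1)
    )


# Hypothetical numbers: a resistive via stretches a 3-ns transition to 7.5 ns
# in a circuit clocked with a 5-ns period.
healthy = (passes_static(3.0), passes_at_speed(3.0, 5.0))
resistive = (passes_static(7.5), passes_at_speed(7.5, 5.0))
```

Under these assumed numbers the resistive node passes the static test but fails once the capture happens at the functional clock rate, which is the behavior transition and path-delay fault models are meant to expose.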
As a result, designers are moving toward an at-speed test methodology. When multiple clock speeds are used in separate segments of the design, running at-speed tests for one clock segment across the entire design boosts test complexity. It also increases the number of patterns required to identify the faults. Microprocessor manufacturers have been doing at-speed test for years, and the practice is beginning to spread more widely throughout the industry. Manufacturers such as Intel have typically relied on functional testing to snare speed-related failures. Functional test exercises the chip as though it were in its target system. It doesn't use scan, and it's scan that lends itself to precise fault-coverage metrics. As a result, a drawback of a functional at-speed methodology is that it's very difficult to determine whether or not you've covered the entire design.
Designers are seeking ways to supplement their standard test methodologies, which typically contain static stuck-at components, with an at-speed methodology. These methodologies aren't mutually exclusive, but rather complementary.