For system architects, built-in self-test (BIST) is nothing new. It describes the capability embedded in many high-availability systems, such as telephone switching systems, to execute thorough self-testing to detect hardware faults and, if one is found, to isolate it to a replaceable unit. In this context, BIST complements other built-in system capabilities, such as parity checkers and watchdog timers, which run concurrently with system operation to detect abnormal conditions.
The need for system-level BIST stems from the intrinsically poor fault coverage provided by functional testing, especially in complex systems. System recovery and diagnostic strategies rely heavily on a single-fault assumption. To maximize the probability that at most one fault exists in a system when a recovery procedure is initiated, the system must be tested thoroughly and periodically to ensure that no fault has gone undetected during normal operation.
For example, the implementation of system-level BIST in large switching systems has generally been accomplished via test/diagnostic software coupled with special hardware features to enable test-mode access to major subsystems. Unfortunately, the very complexity that drives the need for system-level BIST makes this approach quite expensive.
As a result, the thoroughness and diagnostic granularity achieved in real systems have often fallen short of desired levels. IC-level BIST, coupled with standardized test-access methods such as IEEE Std. 1149.1 (boundary scan), promises to dramatically improve this situation.
IC-Level BIST
For IC designers, BIST is a relatively new design-for-testability (DFT) technique to facilitate thorough testing of ICs. Complexity also is a key driver for IC BIST—in this case, the exploding complexity of ICs.
The economics of electronics manufacturing require that ICs be of very high quality before they are soldered onto circuit boards or into multichip modules (MCMs). Again, functional testing alone is inadequate, since it generally results in only 50% to 70% stuck-at fault coverage.
Regular Structures
Regular structures, such as embedded RAMs in ASICs, can be tested efficiently by applying test patterns generated by fairly straightforward algorithms. For example, setting the contents of a RAM to all zeros, then sequentially addressing the memory locations while writing, reading and checking an all-ones pattern will detect a class of faults such as a memory cell stuck at zero.
With careful analysis of the various types of possible memory faults, you can devise efficient algorithms for thoroughly testing the entire memory, not only for stuck-at faults but also for an extended class of faults such as transition, coupling, retention and decoder faults. Although the algorithms are efficient, the fully expanded set of test vectors can still be quite large, and applying the vectors at-speed poses many practical challenges, especially for memories embedded deep within ASICs.
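As a simplified illustration (not the exact algorithm any particular BIST controller uses), the C sketch below models the all-zeros/all-ones sequence just described, treating the RAM as an ordinary array. An on-chip BIST controller would implement the same sequence as a small state machine, and full march algorithms such as March C- add descending-address passes to cover coupling and decoder faults.

#include <stdint.h>
#include <stddef.h>

/* Simplified march-style RAM test: fill with zeros, then walk up the
 * address space reading back zero and writing all ones, then verify
 * the all-ones pattern. This detects cells or data lines stuck at one
 * or at zero; it is only a subset of a production march algorithm. */
int ram_march_test(volatile uint8_t *ram, size_t words)
{
    size_t i;

    for (i = 0; i < words; i++)         /* pass 1: write all zeros   */
        ram[i] = 0x00;

    for (i = 0; i < words; i++) {       /* pass 2: read 0, write 1s  */
        if (ram[i] != 0x00)
            return -1;                  /* a bit is stuck at one     */
        ram[i] = 0xFF;
    }

    for (i = 0; i < words; i++)         /* pass 3: verify all ones   */
        if (ram[i] != 0xFF)
            return -1;                  /* a bit is stuck at zero    */

    return 0;                           /* no fault found by this test */
}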
BIST methods can test such regular structures cost-effectively. As shown in Figure 1, the basic approach incorporates hardware in the ASIC that generates the necessary test patterns based on the desired algorithm, applies them to the RAM via multiplexed data and address paths, and evaluates the resulting output patterns.
The result is a short signature that identifies the existence, and possibly the nature, of any faults in the RAM. This approach involves modest overhead (typically less than 10%) and allows the RAM to be tested easily, thoroughly and at-speed.
Random Logic
Control logic, such as finite state machines, exhibits no regular pattern of circuitry, so algorithmic BIST methods generally do not apply. A straightforward approach to developing thorough tests for such random-logic circuit blocks is to hypothesize the existence of a single stuck-at fault at each logical node in the circuit and then, based on logical analysis, derive a test (if one exists) that causes the presumed faulty circuit to generate an output different from that of the fault-free circuit.
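To make the idea concrete, here is a small C sketch built around a hypothetical two-gate circuit, y = (a AND b) OR c, with a presumed stuck-at-0 fault injected on the internal AND node. Any input vector for which the fault-free and faulty circuits disagree is a test for that fault; production ATG tools reason about the circuit structure rather than using the exhaustive simulation shown here.

#include <stdio.h>

/* Hypothetical circuit y = (a & b) | c. When fault_sa0 is nonzero,
 * the internal AND node is forced to zero to model the presumed
 * single stuck-at-0 fault. */
static int circuit(int a, int b, int c, int fault_sa0)
{
    int node = a & b;           /* internal node under suspicion  */
    if (fault_sa0)
        node = 0;               /* inject stuck-at-0 on that node */
    return node | c;
}

int main(void)
{
    /* Exhaustively compare good and faulty behavior; any vector that
     * produces different outputs is a valid test for the fault. */
    for (int v = 0; v < 8; v++) {
        int a = (v >> 2) & 1, b = (v >> 1) & 1, c = v & 1;
        if (circuit(a, b, c, 0) != circuit(a, b, c, 1))
            printf("a=%d b=%d c=%d detects the stuck-at-0 fault\n",
                   a, b, c);
    }
    return 0;                   /* prints only a=1 b=1 c=0 */
}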
Early in the evolution of IC technology, such tests were developed manually. For circuits with any significant complexity, however, this process proved extremely time-consuming. As a result, many companies invested in tools to automate this test-generation process and a number of sophisticated tools of this genre are now commercially available.
Although these automatic test-generation (ATG) tools are sophisticated and powerful, the problem itself grows rapidly with circuit complexity. Two standard measures of ATG computational complexity relate to the controllability and observability of a circuit's internal nodes. As circuits became more complex, the ATG algorithms became hopelessly compute-intensive.
The situation can be improved by increasing the controllability and observability of internal nodes. This has led to widespread adoption of scan-based DFT strategies, where the functional design is enhanced to allow test data to be inserted into and extracted from internal nodes during a test mode of operation.
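A minimal software model of the scan idea, assuming eight internal flip-flops chained into one shift register in test mode and a placeholder next-state function standing in for the real combinational logic, is sketched below. A test consists of shifting a stimulus in, applying one functional clock to capture the response, and shifting the response out for comparison.

#include <stdint.h>
#include <string.h>

#define N_FLOPS 8                       /* internal state bits in the chain */

typedef struct {
    uint8_t q[N_FLOPS];                 /* one bit per flip-flop */
} scan_chain_t;

/* Test mode: shift one bit in at the head; the tail bit shifts out. */
static uint8_t scan_shift(scan_chain_t *sc, uint8_t scan_in)
{
    uint8_t scan_out = sc->q[N_FLOPS - 1];
    memmove(&sc->q[1], &sc->q[0], N_FLOPS - 1);
    sc->q[0] = scan_in & 1;
    return scan_out;
}

/* Functional mode: one normal clock; placeholder logic computes the
 * next state from the current state. */
static void functional_clock(scan_chain_t *sc)
{
    uint8_t next[N_FLOPS];
    for (int i = 0; i < N_FLOPS; i++)
        next[i] = sc->q[i] ^ sc->q[(i + 1) % N_FLOPS];
    memcpy(sc->q, next, N_FLOPS);
}

/* One scan test: control every internal node by shifting a stimulus
 * in, capture with a functional clock, then observe every node by
 * shifting the response out. */
static void apply_scan_test(scan_chain_t *sc,
                            const uint8_t stimulus[N_FLOPS],
                            uint8_t response[N_FLOPS])
{
    for (int i = 0; i < N_FLOPS; i++)
        (void)scan_shift(sc, stimulus[i]);
    functional_clock(sc);
    for (int i = 0; i < N_FLOPS; i++)
        response[i] = scan_shift(sc, 0);
}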
This approach requires that circuitry be added to implement the improved controllability and observability needed to make the test-generation problem tractable. Given the desire to cram as much functionality as possible into an IC, designers have very reluctantly ceded IC real estate for enhanced testability.
Many strategies have been developed to minimize the DFT overhead. For example, a partial scan technique is an option in several ATG tools. With this approach, the designer can trade DFT overhead for reduced fault coverage or increased test generation and execution time. However, this can be a time-consuming trial-and-error process if significantly lower overhead than full scan is the target.
In the past two years, the level of integration and complexity of ASICs has reached the point where the cost and problematic results of partial scan have led to widespread adoption of full scan as a preferred DFT technique. DFT overheads in the 20% range are now routinely accepted. With this breakthrough in acceptable DFT overhead, random-pattern BIST becomes an attractive mainstream IC-level DFT method.
As shown in Figure 2, random-pattern BIST is generally implemented by adding a linear feedback shift register (LFSR) to generate pseudorandom test patterns. These patterns are applied to a random-logic circuit block to be tested via multiplexed paths to the circuit inputs and storage elements, and the results are compressed via a multiple input signature register (MISR).
Instead of deriving test sequences targeted at a specific set of faults, such as the set of all possible nodal stuck-at faults, we simply apply a large set of pseudorandom test sequences to the circuit block and determine (by simulation) which faults in the set are “accidentally” detected. The advantages of this approach during test execution are obvious. Instead of delivering a large set of test stimuli to the circuit from an outside tester, we provide only power, ground and clock signals to the IC, and the circuit generates its own vectors at-speed.
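A minimal software model of this scheme, assuming a 16-bit maximal-length Fibonacci LFSR (polynomial x^16 + x^14 + x^13 + x^11 + 1) as the pattern generator, a placeholder function as the circuit under test, and a simple LFSR-based signature register standing in for the MISR, is sketched below. A real MISR folds each response bit into its own register stage; the word-wide fold shown here merely captures the signature-compression idea.

#include <stdint.h>
#include <stdio.h>

/* One step of a 16-bit maximal-length Fibonacci LFSR
 * (taps for x^16 + x^14 + x^13 + x^11 + 1). */
static uint16_t lfsr_next(uint16_t s)
{
    uint16_t bit = (uint16_t)((s ^ (s >> 2) ^ (s >> 3) ^ (s >> 5)) & 1u);
    return (uint16_t)((s >> 1) | (bit << 15));
}

/* Placeholder for the random-logic block under test. */
static uint16_t circuit_under_test(uint16_t in)
{
    return (uint16_t)((in & (in >> 1)) ^ (uint16_t)(in << 3));
}

int main(void)
{
    uint16_t pattern = 0xACE1u;         /* any nonzero LFSR seed */
    uint16_t signature = 0;

    /* Apply pseudorandom patterns and fold each response into the
     * signature register. */
    for (int i = 0; i < 10000; i++) {
        uint16_t response = circuit_under_test(pattern);
        signature = (uint16_t)(lfsr_next(signature) ^ response);
        pattern = lfsr_next(pattern);
    }

    /* The final signature is compared against the value obtained by
     * simulating the fault-free design. */
    printf("signature = 0x%04X\n", (unsigned)signature);
    return 0;
}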
There are some difficulties, the primary one being unpredictable results. Depending on the details of the circuit, it may exhibit a large set of random-pattern-resistant faults, resulting in low overall fault coverage.
One effective method to improve this situation adds test points to the circuit which, in the test mode, become additional primary inputs or outputs. But each test point introduces additional area and possibly performance overhead, so we want every test point to have maximum efficiency in improving fault coverage and minimum impact on overall performance.
Sophisticated algorithms and heuristics have been developed to guide the selection of test points to satisfy these requirements. These methods, plus techniques for quickly estimating fault coverage without time-consuming fault simulation, have allowed us to develop BIST solutions for typical random-logic circuit blocks with very high (>98%) fault coverage.
If higher coverage is needed, a scan mode of operation can be designed in to allow augmentation of BIST testing with deterministically computed test vectors targeted at random-pattern-resistant faults. In combination, extremely high fault coverage can usually be achieved with a very reasonable development effort.
Datapath
Some IC circuits, such as datapath circuits, generally consist of arithmetic elements, such as multipliers or adders, coupled with storage registers and control logic. They exhibit testability characteristics in between those of random logic and regular structures. Although these circuits may be treated as random logic, special techniques can take advantage of their semi-regular nature.
Where Are We Now? What Next?
The IC BIST techniques outlined in this article have been applied in hundreds of leading-edge IC designs, with the earliest applications dating to the late 1980s. Even now, IC BIST is not considered a mainstream technology. This is about to change dramatically, however, because the primary barrier to adoption, area overhead, is now outweighed by the lack of any viable alternative for very large and complex ICs.
Robust EDA tools to enable this switch are now appearing on the market and very significant investments are being made in the industry to further their efficacy and scope of application. Over the next couple of years, this swift and dramatic change will constitute a true “paradigm shift” for design and test.
In turn, the wholesale adoption of IC BIST methodology will drive a paradigm shift for system-level BIST. Given the pervasive use of IC BIST for custom ICs and FPGAs as well as ASICs and the continued evolution of standards for system-level test access, we will find more complex systems that employ a hierarchical test architecture and IC BIST reuse strategy (Figure 3).
With this efficient, rationalized strategy and architecture for test—from chip, to MCM, to circuit board, to subassembly, to system—the complexity of system diagnostic and recovery software will be dramatically reduced and the accuracy of diagnosis greatly increased. The end result is the capability to create and deploy exceedingly complex systems and networks that can be cost-effectively maintained at very high availability levels.
As dramatic as this scenario may seem, it is really only the beginning of a true paradigm shift in how we create highly reliable products. In one sense, both IC BIST and system BIST are really just a logical extension of DFT.
The real paradigm shift is from a computer-assisted “design + DFT” paradigm to a “design-by-computer with built-in quality” paradigm. This paradigm shift will have a truly dramatic impact on where and how value is added in creating electronic and photonic products. Look for this major change to show up in the mainstream of your design/test process within the next five years.
About the Author
Richard Campbell is founder and President of BIST Technologies, a consulting firm. Previously, he was Director of the Test and Reliability Center at AT&T Bell Laboratories. Mr. Campbell has a B.S.E.E. degree from the University of Maine and an M.S.E.E. degree from New York University. He is a member of IEEE. BIST Technologies, 3 Van Dyke Rd., Hopewell, NJ 08525, (609) 466-3957.
Copyright 1996 Nelson Publishing Inc.
March 1996