Use An Arb And A DSO To Find Elusive Bugs In Comm Software
Design engineers working on digital communications products often need to verify specifications stipulated in complex design criteria. Anyone who's ever designed a wireless modem, for example, knows it's not easy to test it for all the different reception failure modes before bringing it to market. With digital radios or other digital products required to work over long distances, it's difficult to simulate on the laboratory bench all the different scenarios that might occur in the real world. An arbitrary waveform generator (arb) and a digital storage oscilloscope (DSO) can be vital, then, to a company's R&D efforts.
An arb can output just about any signal pattern imaginable. So, virtually any form of signal "corruption" can be added to a starting signal that's known to be good. These modified signals can be saved and then used repeatedly at any time for product verification during the design phase.
Many digital communications products include software that monitors the received data, hunting for bit or packet errors while performing command and data processing. Some of these software systems are even capable of forward error correction. Test procedures can ensure that this software is, in fact, detecting bit and packet errors correctly. We'll cover five areas of error testing—signal-amplitude tests, asynchronous-data-timing errors, forced-bit errors in varied positions, noise testing, and command-spacing limits.
Our test setup consisted of an HP 54645D mixed-signal oscilloscope (MSO), an HP 33120A arbitrary waveform generator, and HP BenchLink software. Even though we chose the HP 54645D, just about any scope with a memory buffer large enough to store the 20 to 50 samples/bit contained in the analyzed data-packet structure will work.
The scope must have an IEEE-488 general-purpose interface bus (GPIB) or some other interface for easy uploading of data points to a computer. We chose the HP MSO because of its very large memory buffer (1 Msample/channel). And, it's available for under $5000. As a starting point, a scope with a 2- to 4-ksample buffer or more should be used.
The HP 33120A arb was selected for its features, including outputs of up to 40 Msamples/s with 12 bits of resolution. It also costs less than $2000. When simulating data packets over 2 Mbits/s or very large packets such as those used for computer local-area networks (LANs), an arb with faster sample rates or more memory might be required. These improvements, however, also will mean a dramatic jump in the arb's price.
Hewlett-Packard's BenchLink software is a dedicated Windows program meant only for use with HP scopes and the 33120 arb. It makes the cut, paste, and edit process extremely simple, but it doesn't lend itself to more advanced instrument control. Advanced automated instrument controls are best left to programs such as HP-Vee and LabView. While these programs are much more powerful than BenchLink, they require programming to use. Allowing for some programming time, though, both HP-Vee and LabView could do this job equally well.
The device under test (DUT) was LPA Design's FlashWizard II radio transceiver. This remote-control photographic-equipment system is used at major sporting events, such as professional basketball games. It can remotely synchronize up to 32 cameras to a single flash of light at a shutter speed of 1/250 of a second, using on/off-keyed (OOK) signaling at 68 kbits/s to send and receive commands.
The first step is capturing a "clean" data packet, typical of the command codes the radio normally receives, on the scope. To do this, we used HP's deep-memory MSO connected to the received-signal-strength-indicator (RSSI) signal in the FlashWizard receiver. The transmitter was located only a few feet away on the bench to achieve a good signal-to-noise ratio (SNR). Using a GPIB interface and the BenchLink software, this packet waveform was copied to the computer. The capture was completed using 2000 sample points to represent the packet. The packet contained under 100 bits of information, equating to more than 20 analog sample points per bit.
It's important to have a sufficient number of samples per bit to prevent the arb from adding too much distortion to the packet. As a guideline, about 20 to 50 samples should be used for each bit in the packet.
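The guideline above reduces to simple arithmetic, sketched here in Python. The function names are illustrative only. The 2000-point capture, the 100-bit upper bound on packet length, and the 68-kbit/s signaling rate come from the article; everything else is an assumption made for the example.

```python
def samples_per_bit(total_samples: int, bits_in_packet: int) -> float:
    """Average number of waveform points available for each bit."""
    return total_samples / bits_in_packet


def required_sample_rate(bit_rate_bps: int, spb: int) -> int:
    """Playback (or capture) rate needed for a given samples-per-bit target."""
    return bit_rate_bps * spb


# 2000 captured points over a packet of at most 100 bits
print(samples_per_bit(2000, 100))            # 20.0

# Sample rates implied by the 20-to-50 samples/bit guideline at 68 kbits/s
print(required_sample_rate(68_000, 20))      # 1360000
print(required_sample_rate(68_000, 50))      # 3400000
```

Because the packet held fewer than 100 bits, 20 samples/bit is a lower bound for the capture described. Both computed rates sit well below the 40-Msample/s output rate quoted for the arb.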
With a "clean" error-free packet now on the computer, we copied this waveform to the Windows clipboard. Then, we transferred it to the BenchLink/arb program for permanent safekeeping and future reuse (Fig. 1).
This error-free waveform was transferred into the arb using the GPIB bus and BenchLink/arb. The playback rate and amplitude were adjusted to match the original waveform, with the signal verified on the MSO. It's best to use the arb in the single-shot burst mode for this operation.
The next step was verifying that the error-free packet worked on the receiver, the DUT. This was done by connecting the arb to the DUT's baseband detector input (the RSSI line in Figure 2) and disconnecting the radio receiver subsection so it wouldn't conflict with the arb's simulated signal. Once error-free reception was confirmed, fault testing followed.
Signal amplitude often depends heavily on distance, battery levels, and the transmission medium. On the signal's receiving end, the circuit used to decode (quantize) it back to digital logic levels may be sensitive to amplitude variations. In this particular application, a voltage comparator on the DUT digitized the signal back to logic levels. Using the arb's front-panel control knob, we ramped the amplitude up and down to find the upper and lower signal-level limits of operation for the DUT.
The basic front-panel controls and one-line alphanumeric displays on low-cost arbs aren't well suited to entering waveforms by hand. But they are quite user-friendly for operation once the waveform is in the arb's memory.
The FlashWizard utilizes a CPU with a crystal as its clock/timing reference. We wanted to know how far off frequency this crystal could be before errors would result. The frequency-adjustment mode on the arb's front panel was used to skew the packet's playback rate and introduce timing errors. Next, the frequency was adjusted up and down until the upper and lower timing limits that yielded proper reception were found.
This test proved to be very useful. The playback rate could be raised by only 0.03% before errors appeared, yet it could be lowered by almost 1% with the DUT still returning an "OK" response. Consequently, there was very little margin for error in one direction, but plenty in the other. We concluded the receiver software's asynchronous bit-sampling rate might not be correct relative to the transmitter. A closer investigation of the software timing loops indicated that this hunch was accurate. A quick software adjustment inside the DUT enabled it to tolerate a ±0.5% timing error. This was a good example of software that might work on the bench, but once in production, could lead to marginal yields due to the cumulative error in circuit parts.
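The lopsided limits point to a tolerance window that is roughly the right width but centered in the wrong place. The short Python sketch below works through that arithmetic. It assumes the software fix simply recentered the existing window, which the article doesn't state outright, and the function name is illustrative.

```python
def recentered_window(max_speedup_pct: float, max_slowdown_pct: float):
    """Return (center, half_width) of a timing-tolerance window, in percent.

    max_speedup_pct  -- largest tolerated playback-rate increase
    max_slowdown_pct -- largest tolerated playback-rate decrease (positive number)
    """
    upper = max_speedup_pct
    lower = -max_slowdown_pct
    center = (upper + lower) / 2       # where the window currently sits
    half_width = (upper - lower) / 2   # margin available once recentered
    return center, half_width


# Measured limits: +0.03% and almost -1% (1.0 is used here as an upper bound)
center, half_width = recentered_window(0.03, 1.0)
print(f"window center: {center:+.3f}%")      # window center: -0.485%
print(f"half width:    ±{half_width:.3f}%")  # half width:    ±0.515%
```

Under that assumption, the receiver's bit timing was off by about half a percent, and moving it to the center of the window gives a symmetric margin close to the ±0.5% reported after the fix.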
The next task was making sure that the receiver's software was detecting bad data packets. To do this, the clean original waveform was modified with BenchLink's editing tools. This entailed visually locating the position of a bit in the packet, "penciling" it in from 0 to 1, and then downloading this altered waveform to the arb and retesting it (Fig. 3). Any desired timing position can be located and altered if the time per sample is known.
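The same edit can be made programmatically once the waveform is available as a list of sample values. The Python sketch below is not BenchLink's method. The helper names, the normalized 0.0 and 1.0 levels, and the three-bit test waveform are all assumptions chosen for illustration.

```python
def sample_index_at(time_s: float, time_per_sample_s: float) -> int:
    """Convert a time position in the packet to the nearest sample index."""
    return round(time_s / time_per_sample_s)


def bit_sample_range(bit_index: int, samples_per_bit: int, first_bit_offset: int = 0):
    """Return the (start, stop) sample indices covering one bit."""
    start = first_bit_offset + bit_index * samples_per_bit
    return start, start + samples_per_bit


def force_bit(waveform, bit_index, samples_per_bit, level, first_bit_offset=0):
    """Return a copy of the waveform with one bit overwritten at a fixed level."""
    start, stop = bit_sample_range(bit_index, samples_per_bit, first_bit_offset)
    if bit_index < 0 or stop > len(waveform):
        raise ValueError("bit lies outside the captured waveform")
    corrupted = list(waveform)                 # leave the clean capture untouched
    corrupted[start:stop] = [level] * samples_per_bit
    return corrupted


# Hypothetical three-bit packet (0, 1, 0) at 20 samples/bit
clean = [0.0] * 20 + [1.0] * 20 + [0.0] * 20
bad = force_bit(clean, bit_index=2, samples_per_bit=20, level=1.0)
```

Returning a copy rather than editing in place keeps the clean packet available for retesting after each forced error.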
The MSO's logic-channel inputs let us view the processor's response to this bad bit. The data packet used for the test was a command code for saving a setting in nonvolatile memory. Probing the memory chip showed that no data was written to it after the command came in, confirming that the error was being detected.
Continuing to alter bits in other parts of the packet and repeating the test showed that the software seemed to detect the errors regardless of bit-error position. Surprisingly, though, after we reloaded the original clean signal and tested it once again, the software didn't go back to working correctly.
A round of debugging with the arb and MSO found the problem. Once any bit error occurred, the software was locking out all packets because individual error flags weren't cleared once they were set. This is a clear example of the value of forced error testing, since the radio always worked when sitting on the lab bench in "clean" signal conditions. A good rule of thumb is to create a library of all the command packets used by a product, as well as several modified waveforms of each packet. Whenever any new revisions of software or hardware are made, tests can be run using all these waveforms before any release is made to production.
When designing the communications link in this application, some standard tables were referenced to see which SNRs would yield a reliable bit-error rate. In reality, we needed more signal than we originally thought was necessary to get the results required. The received signal shown on the scope looked clean enough to use, yet it wasn't clean enough for the radio.
To combat this problem, the MSO was set up to trigger off a CPU output pin that would be set via software every time a packet error occurred. With the RF front end reconnected to the circuit and the arb removed, several waveforms were captured that failed even at high signal-to-noise levels. These were copied to the BenchLink/arb, and the arb test setup was reconnected.
The arb was placed in continuous-playback mode, and it was evident that failures were still resulting during every replay cycle. Adjustments to the detector circuit were visible in real time on the MSO due to the continuous-playback mode. During the zero-bits (carrier-off) phase, the noise present on the waveform was causing random ones to appear in the detector output. The addition of a diode enabled the circuit to dynamically adjust its detection threshold and eliminated the problem. Figures 4a and 4b show "before" and "after" results.
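The effect of a threshold that follows the signal can be illustrated in software. The Python sketch below is only an analogy: the article doesn't describe the diode circuit in detail, and the peak-tracking rule, the `fraction` and `decay` parameters, and the sample values are all assumptions made for the example.

```python
def slice_fixed(samples, threshold):
    """Comparator with a fixed threshold: 1 when the sample exceeds it."""
    return [1 if s > threshold else 0 for s in samples]


def slice_adaptive(samples, fraction=0.5, decay=0.99):
    """Comparator whose threshold is a fraction of a slowly decaying peak."""
    peak = 0.0
    bits = []
    for s in samples:
        peak = max(s, peak * decay)            # follow the strongest recent level
        bits.append(1 if s > fraction * peak else 0)
    return bits


# Hypothetical detector output: carrier-on near 1.0, noise up to 0.3 during zeros
samples = [1.0, 1.0, 0.3, 0.1, 0.3, 1.0, 0.2, 0.3]

print(slice_fixed(samples, 0.2))   # [1, 1, 1, 0, 1, 1, 0, 1]
print(slice_adaptive(samples))     # [1, 1, 0, 0, 0, 1, 0, 0]
```

With the threshold fixed too close to the noise floor, noise during carrier-off periods decodes as random ones. Tying the threshold to the recent peak keeps it above that noise in this example. Note that this simple tracker needs a carrier-on sample before its threshold becomes meaningful.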
As with most digital radio systems, the objective here is to optimize data throughput within a given bandwidth and CPU speed. When a command packet is received, it takes the CPU some time to act on it and do something with the attached data. If the transmitter were to send another command too quickly, the receiver might miss the subsequent packet. But if the transmitter waits too long between transmissions, throughput suffers.
Once again, we used the arb editor to cut and paste command packets together. We played around with the packet spacing and combinations without having to change the software and program a new EPROM each time. Figure 5 shows a combination of packets "pasted" together in BenchLink.
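For readers who would rather script this step than cut and paste by hand, the following Python sketch builds a combined waveform from individual packets with an adjustable idle gap. It is an illustration only: the function names, the 0.0 idle level, and the tiny example packets are assumptions, not part of BenchLink.

```python
def gap_samples(gap_s: float, sample_rate_hz: float) -> int:
    """Number of idle samples needed to represent a gap of gap_s seconds."""
    return round(gap_s * sample_rate_hz)


def join_packets(packets, gap_s, sample_rate_hz, idle_level=0.0):
    """Concatenate packet waveforms, inserting an idle gap between each pair."""
    gap = [idle_level] * gap_samples(gap_s, sample_rate_hz)
    combined = []
    for i, packet in enumerate(packets):
        if i > 0:
            combined.extend(gap)               # spacing under test
        combined.extend(packet)
    return combined


# Two hypothetical (very short) packets, spaced 3 microseconds apart at 1 Msample/s
first = [1.0, 0.0]
second = [1.0, 1.0]
combined = join_packets([first, second], gap_s=3e-6, sample_rate_hz=1e6)
```

Sweeping `gap_s` and replaying each result gives the shortest spacing the receiver tolerates. The combined length must still fit within the arb's waveform memory.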
Another Bug Uncovered
While performing this test, we discovered another software bug. When the arb was left in continuous replay of these command-packet combinations, the software wasn't always processing the second packet correctly. The error only happened occasionally, and it was quite puzzling. After connecting a few more CPU lines to the MSO's digital input channels, we found the problem. An interrupt was being serviced periodically just before the second packet, and it was temporarily locking out reception until the end of the interrupt.
A brainstorming session resulted in a wide variety of different error modes that could be introduced into the received signal. Now, whenever there's a need to verify new software revisions to ensure that bugs haven't been introduced, we simply run through the library of arb tests as a first step. HP's 33120A contains several nonvolatile memory buffers for storing data-packet waveforms. And, it can be used in the field without a computer and GPIB interface.