Case Study: The Broken Link (JTAG & Boundary Scan)

How JTAG and Boundary Scan Test the "Untestable" Board

1. Introduction: The "It-Works-Alone" Problem

Our DFT journey so far has been inside the chip. We've used Scan, MBIST, and At-Speed tests to gain 99.9% confidence that the *silicon die itself* is perfect. This "perfect" chip is then packaged, for example in a Ball Grid Array (BGA), and soldered to a Printed Circuit Board (PCB) alongside other chips such as DRAM or a Wi-Fi module.

The system is assembled, powered on, and... it's dead.

We have a new, critical problem:

  • The CPU chip is perfect.
  • The DRAM chip is perfect.
  • But the connection between them is broken.

A single solder ball under the BGA package didn't melt correctly, or a tiny trace on the PCB has a hairline crack. This is a board-level manufacturing defect. How do we find it?

2. The Design Under Test (DUT): A System-on-Board

Our DUT is no longer a single chip, but a *system*:

  1. Our Chip: A CPU with a 32-bit data bus.
  2. Another Chip: A DRAM device on the other end of that bus.
  3. The Board: A PCB that connects them with 32 tiny traces.

The "defect" is a single "open" on the DATA[17] solder ball.

3. The "Old" (Failed) Solution: The Bed-of-Nails Tester

In the 1980s, you would test this with a "bed-of-nails" fixture. This was a giant press with thousands of tiny, spring-loaded pins ("pogos") that would physically touch test pads on the PCB.

  • Problem 1: No Access. Our DATA[17] trace is on an *inner layer* of the PCB, and the pin is a *solder ball* hidden under the chip. There is no physical way for a "nail" to touch it.
  • Problem 2: Density. Modern boards have thousands of connections. A bed-of-nails tester would be astronomically complex and expensive.
  • Conclusion: Physical probing cannot reach this net at all, and it scales poorly to dense modern boards.

4. The DFT Solution: IEEE 1149.1 (JTAG) Boundary Scan

The solution, standardized as IEEE 1149.1, is to build the "bed of nails" *inside the chip itself*. This is Boundary Scan.

Here's what we add:

  1. The TAP Controller: This is the "brain." It's a small state machine controlled through 4 mandatory pins: TDI (Test Data In), TDO (Test Data Out), TCK (Test Clock), and TMS (Test Mode Select). An optional fifth pin, TRST* (Test Reset), resets the state machine asynchronously. This interface is the Test Access Port (TAP), commonly known as JTAG.
  2. Boundary Scan Cells (BSC): We add one special flop-and-MUX "cell" *right next to* every single I/O pin of the chip.
  3. The Boundary Scan Register (BSR): We chain all these individual BSCs together into one, long scan chain that loops from TDI to TDO.
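The TAP controller's behavior can be sketched as a lookup table. The 16 state names and transitions below follow the IEEE 1149.1 state diagram; the `TAP_NEXT` table and the `walk` helper are illustrative names, not part of any tool.

```python
# Next-state table for the 16-state TAP controller.
# Each entry maps a state to (next state if TMS=0, next state if TMS=1),
# evaluated on every rising edge of TCK.
TAP_NEXT = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle":    ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan":   ("Capture-DR",    "Select-IR-Scan"),
    "Capture-DR":       ("Shift-DR",      "Exit1-DR"),
    "Shift-DR":         ("Shift-DR",      "Exit1-DR"),
    "Exit1-DR":         ("Pause-DR",      "Update-DR"),
    "Pause-DR":         ("Pause-DR",      "Exit2-DR"),
    "Exit2-DR":         ("Shift-DR",      "Update-DR"),
    "Update-DR":        ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan":   ("Capture-IR",    "Test-Logic-Reset"),
    "Capture-IR":       ("Shift-IR",      "Exit1-IR"),
    "Shift-IR":         ("Shift-IR",      "Exit1-IR"),
    "Exit1-IR":         ("Pause-IR",      "Update-IR"),
    "Pause-IR":         ("Pause-IR",      "Exit2-IR"),
    "Exit2-IR":         ("Shift-IR",      "Update-IR"),
    "Update-IR":        ("Run-Test/Idle", "Select-DR-Scan"),
}


def walk(state, tms_bits):
    """Apply one TMS value per TCK edge and return the final state."""
    for tms in tms_bits:
        state = TAP_NEXT[state][tms]
    return state


# From reset, TMS = 0,1,0,0 reaches Shift-DR; TMS = 0,1,1,0,0 reaches Shift-IR.
print(walk("Test-Logic-Reset", [0, 1, 0, 0]))
print(walk("Test-Logic-Reset", [0, 1, 1, 0, 0]))
```

One property worth noting: holding TMS high for five TCK cycles returns the controller to Test-Logic-Reset from any state, which is why a tester can always resynchronize even without the optional reset pin.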

This BSR chain creates a "virtual" wall that isolates the chip's *internal logic* from its *external pins*. We can now control the pins *directly* from the JTAG port, like a puppet master.
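To make the "flop-and-MUX" idea concrete, here is a minimal behavioral sketch of one boundary scan cell and a chain of them. It assumes a simplified cell with one shift flop, one update latch, and one output MUX; the class and function names are hypothetical, and real cells vary by pin type.

```python
from dataclasses import dataclass


@dataclass
class BoundaryCell:
    shift_ff: int = 0      # stage that sits in the TDI -> TDO shift path
    update_latch: int = 0  # holds the value driven to the pin in test mode

    def capture(self, parallel_in):
        """Sample the value currently on the pin (or from the core)."""
        self.shift_ff = parallel_in

    def shift(self, scan_in):
        """Move one position along the chain; return the bit passed onward."""
        scan_out = self.shift_ff
        self.shift_ff = scan_in
        return scan_out

    def update(self):
        """Copy the shifted-in value to the latch that feeds the output MUX."""
        self.update_latch = self.shift_ff

    def drive(self, core_value, test_mode):
        """Output MUX: the core drives the pin normally, the latch in test mode."""
        return self.update_latch if test_mode else core_value


def shift_chain(cells, tdi_bits):
    """Clock bits through the chain; cells[0] is nearest TDI. Returns TDO bits."""
    tdo_bits = []
    for bit in tdi_bits:
        for cell in cells:
            bit = cell.shift(bit)  # each cell hands its old value to the next
        tdo_bits.append(bit)
    return tdo_bits


cells = [BoundaryCell() for _ in range(4)]
shift_chain(cells, [1, 0, 1, 1])
for cell in cells:
    cell.update()

# In test mode the pins follow the latches and ignore the core's value.
print([cell.drive(core_value=0, test_mode=True) for cell in cells])
```

The first bit shifted in ends up in the cell farthest from TDI, so a tester has to order its pattern accordingly.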

5. The Test Procedure: `EXTEST` (External Test)

Here is how we find our broken DATA[17] solder ball.

  1. Chain the Board: The ATE (or board tester) connects to the JTAG ports of *all* JTAG-compliant chips on the board, daisy-chaining them: ATE -> CPU-TDI -> CPU-TDO -> DRAM-TDI -> DRAM-TDO -> ATE.
  2. Enter EXTEST: TMS and TCK are wired in parallel to every chip, so all TAP controllers step through their states together. The tester shifts the EXTEST opcode into each chip's instruction register through the daisy chain.
  3. Isolate: This instruction configures all the BSC MUXes to *disconnect* the chips' internal logic from the pins. The CPU's core is now isolated, and the pins are controlled *only* by the boundary scan flops.
  4. Scan-In: The tester shifts a test pattern (e.g., ...0101...) into the *entire* board's chain. The bits for the CPU's *output* pins are loaded.
  5. Update: The tester "updates" the BSRs. The CPU's boundary scan cells now *drive* this ...0101... pattern *out* of the CPU's pins and onto the PCB traces.
  6. Capture: The tester steps the TAP controllers into the Capture-DR state. On the rising TCK edge in that state, the DRAM's boundary scan cells (which are in "input" mode) *capture* the values arriving on its pins.
  7. Scan-Out: The tester shifts the *entire* chain's contents out and reads the data that was captured by the DRAM.
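The scan-in, update, capture, and scan-out steps can be walked through in a small simulation. This is a sketch under stated assumptions, not a model of real tester software: the chain holds only 32 CPU output cells followed by 32 DRAM input cells, the open DATA[17] input is assumed to read as 1, and all names (`shift`, `extest_pass`, `FLOAT_VALUE`) are hypothetical.

```python
WIDTH = 32
OPEN_NETS = {17}   # the case study's defect: DATA[17] is open
FLOAT_VALUE = 1    # assumption: the floating DRAM input happens to read as 1


def shift(chain, tdi_bits):
    """Clock bits in at TDI (index 0); return the bits that fall out at TDO."""
    tdo_bits = []
    for bit in tdi_bits:
        tdo_bits.append(chain.pop())  # cell nearest TDO shifts out
        chain.insert(0, bit)          # new bit enters the cell nearest TDI
    return tdo_bits


def extest_pass(pattern):
    """Return what the DRAM cells capture when the CPU cells drive `pattern`."""
    # Chain order: TDI -> CPU cells DATA[0..31] -> DRAM cells DATA[0..31] -> TDO.
    chain = [0] * (2 * WIDTH)

    # Scan-In: the first bit shifted in ends up nearest TDO, so the DRAM
    # filler bits go first and the CPU bits go last, in reverse order.
    shift(chain, [0] * WIDTH + list(reversed(pattern)))

    # Update: the CPU cells drive their contents onto the board nets.
    driven = chain[:WIDTH]

    # Board: an open net never delivers the driven value.
    arriving = [FLOAT_VALUE if net in OPEN_NETS else driven[net]
                for net in range(WIDTH)]

    # Capture: the DRAM cells sample their pins.
    chain[WIDTH:] = arriving

    # Scan-Out: the DRAM cells are nearest TDO, so they emerge first,
    # DATA[31] leading; reverse them back into DATA[0..31] order.
    tdo_bits = shift(chain, [0] * (2 * WIDTH))
    return list(reversed(tdo_bits[:WIDTH]))


pattern = [(i + 1) % 2 for i in range(WIDTH)]  # alternating; DATA[17] driven to 0
captured = extest_pass(pattern)
mismatches = [net for net in range(WIDTH) if captured[net] != pattern[net]]
print(mismatches)
```

With the assumed float value, the only mismatching net is DATA[17]. The simulation skips details a real flow has to handle, such as output-enable cells and the instruction-register scan.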

The "Moment of Truth":

  • Expected (at DRAM): ...0101... on its data bus.
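  • Actual (at DRAM): ...0111... if the floating DATA[17] input reads high (or ...0101... if it happens to read low, which masks the fault for this pattern). A capture flop records only 0 or 1, never Z, so the tester also applies the complementary pattern ...1010.... The "open" solder ball broke the connection, so DATA[17] cannot follow both patterns and at least one capture is *wrong*.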
  • Diagnosis: The tester compares the expected and actual patterns and instantly pinpoints the fault: "Failure between CPU pin G12 (DATA[17]) and DRAM pin B4 (DATA[17])."
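The comparison step can be sketched as follows. The example assumes the open input floats low, so the first pattern passes by luck and only the complementary pattern exposes the fault. The pin names for DATA[17] come from the case study; every other name (`diagnose`, `with_open`, `PINS`) is hypothetical.

```python
WIDTH = 32
PINS = {17: ("G12", "B4")}  # (CPU pin, DRAM pin) for DATA[17]; other nets omitted


def diagnose(expected_runs, captured_runs):
    """Report every net whose captured value differs in at least one run."""
    failing = set()
    for expected, captured in zip(expected_runs, captured_runs):
        failing |= {net for net in range(WIDTH)
                    if expected[net] != captured[net]}
    reports = []
    for net in sorted(failing):
        cpu_pin, dram_pin = PINS.get(net, ("?", "?"))
        reports.append(
            f"Failure between CPU pin {cpu_pin} (DATA[{net}]) "
            f"and DRAM pin {dram_pin} (DATA[{net}])"
        )
    return reports


def with_open(bits, net=17, float_value=0):
    """Model the capture: the open net reads its float value, not the driven bit."""
    captured = list(bits)
    captured[net] = float_value
    return captured


pattern = [(i + 1) % 2 for i in range(WIDTH)]  # DATA[17] driven to 0
complement = [1 - b for b in pattern]          # DATA[17] driven to 1

captured_runs = [with_open(pattern), with_open(complement)]
reports = diagnose([pattern, complement], captured_runs)
print(reports)
```

Here the first run is indistinguishable from a good board, which is why a single pattern is not enough to declare a net healthy.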

The board can now be sent for precise repair (re-soldering or "reballing" the CPU).

6. Final Analysis: The Payoff

Of the strategies compared here, Boundary Scan is the *only* one that can solve this board-level test problem.

| Test Strategy | Find Internal Faults? | Find Board-Level Faults? | Required Access | Cost/Complexity |
|---|---|---|---|---|
| No DFT | No | No | N/A | Total Failure |
| Bed of Nails | No | Yes (on old chips) | Needs 100% physical access | Astronomically High |
| Internal Scan | Yes | No | 4-pin JTAG port | Low |
| Boundary Scan | Yes (via INTEST) | Yes (via EXTEST) | 4-pin JTAG port | Low |

The Payoff is Diagnostics:

  • Without Boundary Scan, a failing board is a "black box." We have no idea why it's failing. The entire $1000 board is thrown in the trash.
  • With Boundary Scan, we get a "Google Maps" for the failure, pointing to the exact broken trace or solder joint. The $1000 board is repaired for $5.
  • Bonus: The JTAG TAP is the "front door" we use to access *all other* DFT (Internal Scan, MBIST, OCCs, etc.).

7. Post-Silicon Validation: The Moment of Truth

The first chip arrives. How do we test our *Boundary Scan* logic?

  1. Check the TAP: The first thing an ATE does is try to talk to the TAP controller. It issues a command to read the chip's IDCODE. If the chip responds with its correct 32-bit ID (e.g., 0xDEADBEEF), we know the JTAG pins and the core TAP logic are working.
  2. Run `BYPASS`: We issue the BYPASS instruction. This shrinks the chip's scan chain to a *single flop*. We scan 1 bit in and check that it comes out 1 clock later. This confirms the bypass path.
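  3. Test the BSR: We shift a 0101... pattern through the full Boundary Scan Register (BSR) in one continuous Shift-DR pass, typically under SAMPLE/PRELOAD so the pins are not disturbed, and check that it emerges at TDO after exactly as many clocks as the chain has cells. Because the pattern never leaves the shift path, this confirms the BSR chain *itself* isn't broken, independent of the capture and update logic.
  4. Run `INTEST`: Where the chip implements this optional instruction, we can use INTEST (Internal Test) to have the BSR drive values *into* our own chip's logic and capture the result. This confirms the BSR's connection to the internal logic.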
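For the IDCODE check in step 1, the 32-bit value splits into fixed fields under IEEE 1149.1: a marker bit that is always 1, an 11-bit manufacturer identity, a 16-bit part number, and a 4-bit version. The sketch below decodes the placeholder value used above; the function name is hypothetical.

```python
def decode_idcode(idcode):
    """Split a 32-bit IDCODE into its IEEE 1149.1 fields."""
    return {
        "marker": idcode & 0x1,                 # bit 0, must read as 1
        "manufacturer": (idcode >> 1) & 0x7FF,  # bits 11:1, JEDEC-derived ID
        "part_number": (idcode >> 12) & 0xFFFF, # bits 27:12
        "version": (idcode >> 28) & 0xF,        # bits 31:28
    }


fields = decode_idcode(0xDEADBEEF)
print({name: hex(value) for name, value in fields.items()})
# marker 0x1, manufacturer 0x777, part_number 0xeadb, version 0xd
```

A marker bit of 0 is a useful early warning: it usually means the tester is reading a bypass register or a stuck line rather than a real identification register.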

Only after all this is confirmed can we trust EXTEST to find board-level faults.
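The BYPASS check from step 2 is simple enough to model in a few lines. Under IEEE 1149.1 the bypass register is a single flop that loads 0 in Capture-DR, so whatever is shifted in reappears at TDO exactly one TCK later, preceded by that 0. The function name below is hypothetical.

```python
def bypass_shift(tdi_bits):
    """Model one chip in BYPASS: a single flop between TDI and TDO."""
    bypass_ff = 0  # value loaded during Capture-DR
    tdo_bits = []
    for bit in tdi_bits:
        tdo_bits.append(bypass_ff)  # TDO shows the flop's previous contents
        bypass_ff = bit             # the flop takes the new TDI bit
    return tdo_bits


print(bypass_shift([1, 0, 0, 0]))  # the 1 appears one clock late: [0, 1, 0, 0]
```

The same one-clock delay per chip lets a board tester count how many devices sit in a chain when all of them are placed in BYPASS.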

8. Conclusion

This case study shows that DFT isn't just about testing the *inside* of the chip; it's about testing the chip's *place in the world*.

  • Why Boundary Scan? To test the connections *between* chips on a board, which are physically impossible to probe.
  • How? By creating a "virtual," scannable I/O ring around the chip's logic, controlled by the JTAG TAP.
  • The Result: We can find and diagnose board-level "open" and "short" defects with pinpoint accuracy, saving millions in manufacturing and repair costs.
