A Test/ Repair loop can become a bottleneck for the whole Value Stream when the test First Pass Yield is lower than planned.

A drop in test FPY is normally caused by problems upstream in the Value Stream. This Test/ Repair loop is often absent in Value Stream Maps in spite of the potential to become the bottleneck for the total process. 

You can download this example file:  TestRepair.xlsm

This Excel file simulates a Test/ Repair loop such as:

Ideal Situation

The ideal case is when FPY = 100%. In this case no repair is necessary and the required test capacity is just 100, which corresponds to the Value Stream throughput.

You must close all open Excel files before you open this one and you should enable Macros.

To simulate 1 hour operation just press F9. Keep pressing to see the evolution.

To reset (put all Work-In-Process to zero) press Ctrl and r keys.

You can only write in the yellow cells.

You need to work out the minimum required Test and Repair capacities in order to deliver 100 units each hour (balanced loop).

Test FPY Drop

Let's assume 1% of the units are failing test: Test FPY = 99%:

The immediate result is that the output of this loop drops to 99: It has become the bottleneck of the Value Stream.

Since we are not repairing the faulty units they are accumulating in front of the Repair station. 

How much repair capacity do we need in this case?

All repaired items need to be retested so they will add to the new items entering the loop. What test capacity do we need then?

We can calculate this without the need for simulation:

In this case when we test 100 units only 99 come out OK (we have multiplied 100 by FPY which is 0.99). 

If we need an output of 100 items OK how many do we need to test? 

Answer: We divide by the FPY:  100/ 0.99 = 101.1 units per hour

And the repair capacity will be: 101.1 - 100 = 1.1 units per hour

 Repair Time Variation

Test time typically doesn't have much variation, specially if it is automatic. Repair time, on the other hand, tends to have high variability. Some automatic testers may provide specific repair instructions but very often repair requires an investigation which might take a long time. 

Let us assume that in our example an average repair capacity of 1.1 items/ hour has a standard deviation of 0.5:

The frequency distribution of this repair capacity will be: 

And the result will be:

We see that occasionally output will drop and a queue will develop waiting for repair. This means we will need additional average repair capacity to compensate for this variation:

Now the "waiting for repair" queue has moved to "waiting for test" but output is maintained in 100. 


The Test/ Repair loops should normally be included in the Value Stream Map because they are critical steps in the process which could become the bottleneck of the Value Stream. See an example in:


A drop in Test FPY should trigger immediate actions to analyse and correct the source of the defects.

In the mean time we will need additional Test and Repair capacity to maintain the VSM throughput.

The Repair operation may require high skills in short supply so enough training should be provided to insure it doesn't become the bottleneck for the total process. 

By developing a robust Repair process we can reduce the average repair time and its standard deviation reducing this risk.