Real Time Statistical Process Control enables the operators running the process to know when action is required, as described in
In this second article we will analyse some of the Out-Of-Control situations we can detect and the corresponding corrective actions required.
You can download an Excel file with these examples:
Individuals and Moving Range SPC charts
In sheet A we have collected process yield data every hour during several shifts.
Since we collect a single value each hour we are using the pair:
- chart for individuals
- moving range chart
The individuals chart has the actual values and the MR chart the absolute values of the change between two consecutive values.
In both charts we automatically apply the Western Electric rules for Out-Of-Control symptoms and the alert will appear in the corresponding column: N - W for the individuals chart and Y - AH for the MR chart.
When any of these alerts appear the corresponding point in the chart will be colored:
- In red: upward symptom
- In green: downward symptom
Looking at the charts above, the individuals chart shows green dots from 10:00 to 16:00 and red dots between 1:00 and 3:00 during the night. We can notice a jump from 16:00 to 17:00 which might indicate the moment when the process has changed but this jump is not significant enough to declare it as a process change.
It is not until 1:00 that we have enough evidence to declare that the process has changed.
In this case, since we are measuring yield, an increase in yield is an improvement.
So we can say that process yield has improved sometime between 16:00 and 1:00. With this information we should now investigate what happened during this period of time that caused this improvement.
How much have we improved? If we compute the average between 6:00 and 16:00 we get an average yield of 93. And from 17:00 to 11:00 the next day 118. So we can estimate an average yield increase of 25.
Process Improvement Symptoms
Let's now analyse the alerts in our sheet A:
The -4 values in the individuals chart indicate 4 out of 5 consecutive values 1 sigma below the center line: values significantly below the average. The value -9 at 14:00 means that the last 9 values were below the center line: same interpretation.
The values 9 starting at 1:00 mean that from 17:00 to 3:00 all values have been above the center line: yield has significantly increased.
Finally, value 1 in moving range indicates that a significant change has occurred in the yield at 5:00. Looking at the individuals chart we see it corresponds to an increase.
Let's interpret the charts in sheet B:
We notice that there is an upward trend in yield as shown by the green dots up to 6:00 in day 11 followed by red dots starting at 12:00 until the end.
The alert in the MR chart on 4:00 corresponds to a significant one time decrease in yield so we can interpret it as a false alarm.
The interpretation of this situation is that we have a continuous process improvement starting around 11:00 on day 11 maybe due to some ongoing improvement actions.
Let's now look at the alerts:
They confirm a significant yield increase starting around 12:00 of day 11.
We now analyse sheet C:
An increase in yield variation is visible in the individuals chart but no alarms are visible.
The MR chart, on the other hand, is meant to detect this variation, and so it does.
Variation is significantly below the center line, as shown by the green dots before 20:00 and a significant increase is shown by the red dots starting at 3:00 on day 11.
If we only measure averages, as we often do, this problem would remain undetected.
Variation is always a bad thing for our process, so we can say our process has got worse starting around the end of day 10.
Yield variation will eventually cause inefficiencies, late deliveries and accumulation of WIP so this problem needs to be addressed before these side effects start to appear.
Day of the Week Effect
Let's now analyse the example in sheet D:
Here we are collecting daily production data and we notice that our average production is 400 and our charts are not detecting any significant changes in the process.
On the other hand we can see a repetitive cycle of 7 days which, of course, corresponds to a week.
If we look in our calendar the days of lowest production we realize they correspond to Sundays and the next lowest to Saturdays.
This is telling us that maybe there is significant seasonal trend within each week.
In order to check that we have added column AT to our spreadsheet where we calculate the day of the week corresponding to each date with this formula:
Where A3 holds the date and "2" is to start counting the week on Monday.
We can now do an ANOVA of our Production Vs Day of the Week with Minitab:
We can now confirm that there are significant differences among the different days of the week: Wednesdays and Fridays we have the highest productions. On Thursdays, for some reason, it is significantly lower and, of course, it is lowest on Saturdays and Sundays.
Shift Change Effect
We have been suspecting that we have a drop in production at the start of a shift, so we have collected data of hourly production in sheet E to check this effect. These are the resulting charts:
In this case, just like in the previous one, there is no indication of Out-Of-Control but we can see peaks in the MR chart which correspond to the start of each shift.
To see how significant are these differences we add a column with the hour of the shift: 6:00, 14:00 and 22:00 are hour 1 of the shift.
Now we do an Anova in Minitab of Production Vs Shift Time:
We notice that production in the last hour of the shift is significantly higher than the average and in the first hour it is significantly lower.
We would need to analyse the root cause of this effect which represents an inefficiency in our process.
One possible cause may be to try to ship as much product as possible at the end of the shift in order to meet the target. And, to do that, operators are moved to the end of the line. The side effect is that we leave an empty line for the next shift so some operators will have no work until the line fills up.
Real Time SPC is a useful tool for operators in order to control their process by making the required adjustments as soon as needed but without over-reacting.
Control charts are useless if analysis and corrective actions are not done in real time.
Operators need to be trained to do this analysis and to know what corrective action needs to be done and when.
Statistical Process Control is used to detect when a significant change has taken place in a process.
All processes have variation as observed in any of their metrics but not all variation is significant from the statistical point of view.
When we want to control a process by adjusting some process parameter we can make two mistakes:
- Over-react by adjusting when we shouldn't
- Under-react by failing to adjust when we should
A case of over-reaction is illustrated by the following example:
Someone is shooting at a target and based on the deviation of the impacts he adjusts the gun site after each shot.
The end result will be an increase of the dispersion of the impacts, therefore the adjustments will make the process worse.
The correct way is, of course, to fire 5 or 6 shots without adjustments and then decide if adjustment is required based on the center of the impacts.
This is an example of under-reaction:
If you drop a live frog in boiling water it will immediately jump out to save its life.
But if you put it in a pot of cold water and heat it the frog will eventually pass out without any reaction.
Many companies have fallen into this trap:
- A big disaster generates a quick and effective reaction and the company recovers
- A slow degradation of their KPIs such as customer sat pass undetected until it is too late
Statistical Process Control
To avoid these two mistakes we can use SPC to control a metric to find out if the observed changes are SIGNIFICANT from the statistical point of view.
We will illustrate this with the use of an example: control your own weight to check if your diet is leading to your weight target or not.
You can download this example Excel file and replace the data in columns A and B with your own data:
You can decide the frequency of your data collection. In this example I have collected it daily.
It is important to collect it at the same moment (more or less) each day.
You can either have your Excel file in your smartphone or in the cloud (Google Drive or Microsoft's One Drive)
I am assuming our target is to reduce weight, therefore a downward trend will put the numbers in green and an upward trend will put them in red.
SPC uses some rules, developed by the Western Electric company, to detect symptoms of SIGNIFICANT trends . They are shown below with a screen of Minitab SPC for individuals:
The center line is the average of all values and the standard deviation is also estimated by all the values.
In our Excel file upward trend symptoms are in red and downward in green.
The Excel file applies rules 1, 2, 3, 5, and 6 of this list.
Weight data interpretation
On 2/4/2018 weight 78.8 is in red and we see the explanation with a number 2 in column N which means that two values (1/4/2018 and 2/4/2018) with 78.8 Kg are 2 σ above the average. This is passed data. What really interests us is the data we have just collected.
The numbers start getting green on 8/4 with 78 kg and the reason is that out of the 5 last numbers the last 4 are 1 σ below the average. This is interpreted as a SIGNIFICANT downward trend. This trend continues the following days.
Statistically significant trend doesn't mean that this reduction is acceptable but at least we are moving in the right direction.
Results analysis with Minitab
In our Excel file there is no graphic: we just want to alert of significant trends
Looking at this same data in Minitab we can see an SPC chart:
The red dots indicate trends. looking at the graph we notice the first two are upward and the last three downward so the overall trend is downward.
There is a common misconception about the meaning of process stability
For instance some might call the process characterized by the following data unstable:
But if we look at our SPC:
It doesn't give us any alert of unstability
This is a STABLE process consisting simply of throwing dice. Since we have always thrown dice the same way SPC is telling us that this is a stable process: no SIGNIFICANT change has been detected.
Stability is not always a good thing. In the case of our weight control stability would mean that there is no improvement.
We want stability when the process is OK. If we improve this will show with SPC alerts in the right direction.
Use for other processes
This Excel file can be used to control a variable other than weight at work. For instance:
In this case we are doing an hourly control of our process yield. In this case Yield increase is good so Red is Good.
At 20:00 we have enough evidence of a Yield increase although the change seems to have taken place at 16:00.
SPC enables the right detection of process change.
High process intrinsic variation can lead to over-reaction (adjust when you shouldn't) or under-reaction (not reacting when you should)
Part 2: Real-Time SPC Analysis
Optimize both Yield and Cost with Minitab Design of Experiments
Design of Experiments is a powerful technique to optimize process outputs in complex processes where multiple factors affect the outputs. When we want to optimize more than one output the problem becomes more complex.
With this example we want to follow the steps, using Minitab, to optimize both process yield and cost by acting on 3 critical factors which have been identified by previous experiments.
These are the factors and levels we want to experiment with:
We will start with a full factorial experiment with 2 central points:
The resulting design is:
We are going to use a simulator in Excel to perform the experiments which you can download: Yield & Cost Simulator
We will just copy columns C5 - C7 from Minitab to the corresponding factor columns in the simulator:
We now copy the results of the experiments (columns Yield and Cost) back to the Minitab sheet in C8 - C9:
Now we start to analyze these results of our factorial experiment with Minitab:
These are the results for Yield:
This is telling us the factorial model is not valid: R-sq(adj) is very low (9,55%). This is the proportion of variability explained by this factorial mathematical model.
If we look at the p-values none of the factors or their interactions are significant.
We suspect curvature in spite of its p-value being slightly above 0,05.
To confirm curvature we can look at the graphs of main effects both for Yield and Cost:
This, clearly, shows curvature both for Yield and Cost because the Center points are totally out of line with the Corner points.
To further confirm curvature we repeat the factorial analysis including central points:
By including the central points in the model R-sq(adj) has improved a lot (97,97%) so this confirms we need quadratic terms in the formula but these can not be calculated with a factorial experiment. We need a Response Surface experiment to model this process.
Response Surface Experiment
We design a Response Surface experiment with the default values for 3 factors
But be aware that we must define our factor values as Axial points to avoid physically impossible values such as a negative pH.
The resulting response surface design is:
To run the experiments we copy columns C5-C7 to our simulator:
And back the results of the experiments (Yield and Cost) to the new Minitab worksheet:
We now perform the analysis of the Response Surface Design:
With the following results for Yield:
R-sq(adj) value of 96,85% confirms the validity of this model. Also all the p-values below 0.05. Only the Temperature*Time interaction and Temperature^2 are not significant. Before we remove these from the model we must make sure they do not significantly affect the other output variable we try to optimize:
Now we confirm that both Temperature^2 and Temperature*Time do not,significantly, affect Yield or Cost.
Therefore we will remove both from the model:
The resulting reduced model for Yield is then:
This is the formula to estimate Yield as a function of Temperature, Time and pH.
The reduced model for Cost is:
We can now see graphically the impact of the 3 factors in Yield with Contour Plots:
And also for Cost:
We can, graphically, see the optimal areas to maximize Yield and to minimize Cost. Unfortunately these areas do not coincide.
Now we will let Minitab calculate the optimal factor values to achieve our overall optimization:
We have decided to assign double weight to Yield Vs Cost. The result is:
This gives us the optimal factor settings to obtain a predicted Yield of 156 and a Cost of 97.
The following interactive graph lets us touch up the factors and see the results of both outputs:
The optimal factor values have been obtained by maximizing the Composite Desirability curves which are based on our condition that Yield is twice as important as Cost.
If we wanted to maximize Yield at any Cost we would obtain:
This improves Yield but it also increases cost.
With this example you can experience the power of Minitab to design and analyze experiments in order to optimize several outputs by calculating the optimal factor settings
In spite of the high-tech of our shop floors you can still see pencil and paper data collection being used.
Data collection in the shop floor is a non-value-add operation which is still required to control the process.
This collection is often automatically done by the manufacturing equipment with no operator intervention.
Unfortunately, detecting aspect defects is often difficult to automate even with AOI (Automatic Optical Inspection).
When an operator is required to inspect we need a method to enter the results into the system.
Line incidents, change of critical process parameters, etc. also need to be reported.
The method still widely used in this 21st century is:
- The operator hand writes the defect data on a form
- Forms are collected at the end of the shift
- A clerk types the forms contents into the system at a later stage (next day or later)
- The engineer analyses the results and takes corrective actions
This reporting process creates a number of problems:
- It is time consuming for the operator
- It adds a non-value-add operation: transcribing the data
- The operator’s handwriting is not necessarily good (this criteria is not used for operator selection)
- Reporting errors may be introduced both by the operator and the clerk doing the transcription
- Some of these errors may be impossible to recover when they are detected on transcription
- Real time process control is impossible: the engineer detects a problem at least 24h after it happened.
- Maybe they have been producing bad product all this time.
The alternative to the pencil & paper data collection is, of course, directly reporting defects in a terminal the moment they happen.
You can provide terminals in the inspection and test stations to do that.
Sometimes it is difficult to provide individual terminals to all the operators who need to report. Sharing terminals and having to walk to the nearest terminal is time consuming so it may not be acceptable.
The reporting operation should not be time consuming and error prone. If we ask operators to type 12 digit numbers or time stamps the chances of error are very high.
So we need a reporting system which is both simple and error free.
Large number typing can be replaced by a bar code reader whenever possible. We can also use the data validation alternatives provided by Excel:
In this way the reporting errors may be detected the moment they are entered and corrected on the spot.
Pull down menus may be used for each field to avoid typing and insure all use the same wording.
The current time stamp can be automatically provided with a formula: =now()
Reporting on the cloud
Having a terminal for each operator who needs reporting may not be possible, in which case reporting on the cloud may be an alternative.
The data base can be created in a Google Drive spreadsheet:
Reporting may be done with any tablet or smart phone via WIFI
BYOD (Bring Your Own Device) alternative
Operators may have difficulties with readable hand writing or reporting in a terminal but they are normally experts in typing into their own smart phones. This existing skill can prove to be very useful for real-time on-line reporting.
The database may be in Internet or the company Intranet.
The smart phones used may be the operator’s own device or one supplied by the company.
It, obviously, requires appropriate firewalls to avoid entering virus.
Even a smart phone with no telephone line may be used to access via WIFI (an old out-of-use smart phone might do the job).
To avoid corrupting the database each workplace may have a dedicated input worksheet in the cloud and then data is automatically transferred to a central worksheet. Operators need write access to their input worksheet and only read access to the central worksheet and reports.
Real-time feedback to the workplace
Real-time data collection enables real-time feedback to the operators in order to take corrective actions while it is still possible.
This Andon display provides real-time feedback:
- PTH component insertion DPMO (red means above 1500 target)
- PTH component solder DPMO (green: below 190 target)
- Processed cards/hour.
SPC (Statistical Process Control) Charts is another alternative to provide feedback to the operators in real-time.
When this feedback is in real-time it is possible to relate changes in the output to what just happened in the line: change in process parameters, line incidents, etc.
This makes it possible to get to the root cause of problems and often solve them on the spot.
Pencil and paper data collection makes it impossible to provide this real-time feedback: all we have is after-the-fact compound data. In this case it will be difficult to find the root cause of the problem and therefore the problem will often remain unresolved. All we can do in this situation is guess what happened and try to find a justification for management.
With this simple value stream to produce products A and B we want to illustrate the way to maximize profit taking into account the system constraints.
· Each product is made by assembling two subassemblies, one of them common to both products.
· We have three special purpose workstations with a corresponding operator trained for the specific job.
· The line operates one shift with 2400 productive minutes per week.
· The fixed costs, which include the salaries of the three operators, are 8000 € per week.
· Product A is sold for 90€ and B for 100€.
· The maximum weekly market demand is 200 units for A and 100 for B
· Total direct costs (materials) for product A is 50€ and for B 40€ as shown above.
· We want to know how many products A and how many B we should produce to maximize profit.
First of all let’s look at the system constraints:
- Time available per week for each of the three operators is 2400 minutes
- Weekly market constraints: 200 A and 100 B.
We want to maximize overall profit so we calculate the margin of each product:
Since we can’t satisfy all the market demand we must decide which product should be our priority. If we ask Finance they will tell us B is the best, since it has the highest margin.
Let’s see how many resources we will need to produce all the market demand for B and how many A’s we will be able to produce with the remaining resources:
With the remaining manufacturing capacity we would be able to produce 140 A’s and assemble 380 A’s but we can only test 40 A’s. It makes no sense to manufacture more A’s than we can test. The excess assembly capacity will be wasted because we can only assemble the units which have been tested. This means that in fact we will only be able to produce 100 B’s and 40 A’s. This gives us a total loss per week of 400€.
Maybe the Finance criteria of giving priority to the product with the highest margin was not so good after all.
We have already noticed that the bottleneck in this line is Test: it is limiting the effective capacity of the whole line.
The use of bottleneck time is therefore a critical factor we should consider in optimizing the line.
We need to ask ourselves how much bottleneck time is consumed to obtain a 40€ margin in product A, and the same for B.
This means that each minute of test time (bottleneck) if used to test product A gives us a margin of 4€ while B only gives us 3€.
So when we take into consideration the existing constraints A is more profitable. Let’s give priority to A and use the remaining capacity to produce B:
So the bottleneck plays an important role in finding the optimal solution: it defines the effective capacity of the whole line, therefore we need to optimize the use of bottleneck time.
We can clearly see that due to the existing constraints this line is very inefficient. Let’s look at the capacity utilization of each operation:
The only operation with 100% utilization is test. This is what we would expect since it is the bottleneck. We see that final assembly has very low utilization (less than 50%).
We can see that the only way to improve further the productivity is to reduce test time by making it more efficient or to transfer some work (if possible) to the assembly operation.
Let’s now look at the problem graphically:
The red area is the forbidden area due to the constraints: maximum A, maximum B and the constraint of the 2400 minutes available at the bottleneck (test): each A consumes 10 minutes of test and each B 20 minutes.
We now look at the point of maximum profit: It will be the furthest point from the origin within the allowed area (green) which is the point of maximum A (200 units) and 20 units of B.
Solution with Solver
Now let’s see how to solve the problem with Microsoft Excel Solver. First we need to set all the data in a spread sheet:
Download file: Solver1.xlsx
In matrix B3:D4 we have the unit times for each operation in products A and B.
Column E has material costs and F product selling price. In column G (margin) we need a formula which just subtracts F – E.
In column H (Produced) we want Solver to place the results we are looking for: number of A’s and B’s to produce.
Column I has the market constraints for A and B.
Row 5 has the formulas above to calculate time consumed on each operation based on the products produced. These values we will compare with row 6 (available times for each operation)
Finally B10 has the shown formula to calculate the profit. This is the cell we will ask Solver to maximize.
Now we go to Solver to setup the parameters as shown:
We want Profit (B10) to maximize changing cells H3:H4 where Solver will place the solution we are looking for.
Now the constraints:
- B5:D5 ≤ B6:D6 Used times ≤ Available times
- H3:H4 ≤ I3:I4 Don’t produce above market demand
- H3:H4 Must be integers and not negative
Press resolve and we will get the results:
Now imagine that an engineer has found a way to reduce test time of product A from a total of 10 minutes to 9 minutes by adding some work to the assembly operation which would involve an increase from the previous time of 5 minutes to 7.
This suggestion would be difficult to justify with Finance, since we are increasing the overall operator time of product A one minute.
But let us look at our Excel to see what Solver has to say:
We have been able to produce 10 more B’s increasing the profit from 1200€ to 1800€.
This is telling us that maybe the way we compute product profit may not help when we have idle time in some of the operations.
Container distribution example
Now let’s see another example of Theory of Constraints solution with Solver:
In this case we want to minimize the cost of distribution of goods containers from warehouses X, Y, Z to shops A, B, C, D, E, F:
Download file: Solver2.xlsx
Matrix C4:H6 contains the transport costs of one container from each warehouse to each shop.
Matrix C9:H11 will contain the number of containers from each warehouse to each shop we are looking for.
The blue cells have the formulas shown.
The Solver parameters will be:
Notice we must meet all the shops requirements: C12:H12 = C13:H13
And the resulting solution:
This is the optimal solution which meets all the constraints.
- All Value Streams are subject to multiple constraints such as:
- Market demand
- The optimal operating point very much depends on the existing constraints which may vary along time.
- Microsoft Excel Solver is useful to optimize a Value Stream which is subject to multiple constraints.