Optimize both Yield and Cost with Minitab Design of Experiments
Design of Experiments is a powerful technique to optimize process outputs in complex processes where multiple factors affect the outputs. When we want to optimize more than one output the problem becomes more complex.
With this example we want to follow the steps, using Minitab, to optimize both process yield and cost by acting on 3 critical factors which have been identified by previous experiments.
These are the factors and levels we want to experiment with:
We will start with a full factorial experiment with 2 central points:
The resulting design is:
We are going to use a simulator in Excel to perform the experiments which you can download: Yield & Cost Simulator
We will just copy columns C5 - C7 from Minitab to the corresponding factor columns in the simulator:
We now copy the results of the experiments (columns Yield and Cost) back to the Minitab sheet in C8 - C9:
Now we start to analyze these results of our factorial experiment with Minitab:
These are the results for Yield:
This is telling us the factorial model is not valid: R-sq(adj) is very low (9,55%). This is the proportion of variability explained by this factorial mathematical model.
If we look at the p-values none of the factors or their interactions are significant.
We suspect curvature in spite of its p-value being slightly above 0,05.
To confirm curvature we can look at the graphs of main effects both for Yield and Cost:
This, clearly, shows curvature both for Yield and Cost because the Center points are totally out of line with the Corner points.
To further confirm curvature we repeat the factorial analysis including central points:
By including the central points in the model R-sq(adj) has improved a lot (97,97%) so this confirms we need quadratic terms in the formula but these can not be calculated with a factorial experiment. We need a Response Surface experiment to model this process.
Response Surface Experiment
We design a Response Surface experiment with the default values for 3 factors
But be aware that we must define our factor values as Axial points to avoid physically impossible values such as a negative pH.
The resulting response surface design is:
To run the experiments we copy columns C5-C7 to our simulator:
And back the results of the experiments (Yield and Cost) to the new Minitab worksheet:
We now perform the analysis of the Response Surface Design:
With the following results for Yield:
R-sq(adj) value of 96,85% confirms the validity of this model. Also all the p-values below 0.05. Only the Temperature*Time interaction and Temperature^2 are not significant. Before we remove these from the model we must make sure they do not significantly affect the other output variable we try to optimize:
Now we confirm that both Temperature^2 and Temperature*Time do not,significantly, affect Yield or Cost.
Therefore we will remove both from the model:
The resulting reduced model for Yield is then:
This is the formula to estimate Yield as a function of Temperature, Time and pH.
The reduced model for Cost is:
We can now see graphically the impact of the 3 factors in Yield with Contour Plots:
And also for Cost:
We can, graphically, see the optimal areas to maximize Yield and to minimize Cost. Unfortunately these areas do not coincide.
Now we will let Minitab calculate the optimal factor values to achieve our overall optimization:
We have decided to assign double weight to Yield Vs Cost. The result is:
This gives us the optimal factor settings to obtain a predicted Yield of 156 and a Cost of 97.
The following interactive graph lets us touch up the factors and see the results of both outputs:
The optimal factor values have been obtained by maximizing the Composite Desirability curves which are based on our condition that Yield is twice as important as Cost.
If we wanted to maximize Yield at any Cost we would obtain:
This improves Yield but it also increases cost.
With this example you can experience the power of Minitab to design and analyze experiments in order to optimize several outputs by calculating the optimal factor settings
With this simple value stream to produce products A and B we want to illustrate the way to maximize profit taking into account the system constraints.
· Each product is made by assembling two subassemblies, one of them common to both products.
· We have three special purpose workstations with a corresponding operator trained for the specific job.
· The line operates one shift with 2400 productive minutes per week.
· The fixed costs, which include the salaries of the three operators, are 8000 € per week.
· Product A is sold for 90€ and B for 100€.
· The maximum weekly market demand is 200 units for A and 100 for B
· Total direct costs (materials) for product A is 50€ and for B 40€ as shown above.
· We want to know how many products A and how many B we should produce to maximize profit.
First of all let’s look at the system constraints:
- Time available per week for each of the three operators is 2400 minutes
- Weekly market constraints: 200 A and 100 B.
We want to maximize overall profit so we calculate the margin of each product:
Since we can’t satisfy all the market demand we must decide which product should be our priority. If we ask Finance they will tell us B is the best, since it has the highest margin.
Let’s see how many resources we will need to produce all the market demand for B and how many A’s we will be able to produce with the remaining resources:
With the remaining manufacturing capacity we would be able to produce 140 A’s and assemble 380 A’s but we can only test 40 A’s. It makes no sense to manufacture more A’s than we can test. The excess assembly capacity will be wasted because we can only assemble the units which have been tested. This means that in fact we will only be able to produce 100 B’s and 40 A’s. This gives us a total loss per week of 400€.
Maybe the Finance criteria of giving priority to the product with the highest margin was not so good after all.
We have already noticed that the bottleneck in this line is Test: it is limiting the effective capacity of the whole line.
The use of bottleneck time is therefore a critical factor we should consider in optimizing the line.
We need to ask ourselves how much bottleneck time is consumed to obtain a 40€ margin in product A, and the same for B.
This means that each minute of test time (bottleneck) if used to test product A gives us a margin of 4€ while B only gives us 3€.
So when we take into consideration the existing constraints A is more profitable. Let’s give priority to A and use the remaining capacity to produce B:
So the bottleneck plays an important role in finding the optimal solution: it defines the effective capacity of the whole line, therefore we need to optimize the use of bottleneck time.
We can clearly see that due to the existing constraints this line is very inefficient. Let’s look at the capacity utilization of each operation:
The only operation with 100% utilization is test. This is what we would expect since it is the bottleneck. We see that final assembly has very low utilization (less than 50%).
We can see that the only way to improve further the productivity is to reduce test time by making it more efficient or to transfer some work (if possible) to the assembly operation.
Let’s now look at the problem graphically:
The red area is the forbidden area due to the constraints: maximum A, maximum B and the constraint of the 2400 minutes available at the bottleneck (test): each A consumes 10 minutes of test and each B 20 minutes.
We now look at the point of maximum profit: It will be the furthest point from the origin within the allowed area (green) which is the point of maximum A (200 units) and 20 units of B.
Solution with Solver
Now let’s see how to solve the problem with Microsoft Excel Solver. First we need to set all the data in a spread sheet:
Download file: Solver1.xlsx
In matrix B3:D4 we have the unit times for each operation in products A and B.
Column E has material costs and F product selling price. In column G (margin) we need a formula which just subtracts F – E.
In column H (Produced) we want Solver to place the results we are looking for: number of A’s and B’s to produce.
Column I has the market constraints for A and B.
Row 5 has the formulas above to calculate time consumed on each operation based on the products produced. These values we will compare with row 6 (available times for each operation)
Finally B10 has the shown formula to calculate the profit. This is the cell we will ask Solver to maximize.
Now we go to Solver to setup the parameters as shown:
We want Profit (B10) to maximize changing cells H3:H4 where Solver will place the solution we are looking for.
Now the constraints:
- B5:D5 ≤ B6:D6 Used times ≤ Available times
- H3:H4 ≤ I3:I4 Don’t produce above market demand
- H3:H4 Must be integers and not negative
Press resolve and we will get the results:
Now imagine that an engineer has found a way to reduce test time of product A from a total of 10 minutes to 9 minutes by adding some work to the assembly operation which would involve an increase from the previous time of 5 minutes to 7.
This suggestion would be difficult to justify with Finance, since we are increasing the overall operator time of product A one minute.
But let us look at our Excel to see what Solver has to say:
We have been able to produce 10 more B’s increasing the profit from 1200€ to 1800€.
This is telling us that maybe the way we compute product profit may not help when we have idle time in some of the operations.
Container distribution example
Now let’s see another example of Theory of Constraints solution with Solver:
In this case we want to minimize the cost of distribution of goods containers from warehouses X, Y, Z to shops A, B, C, D, E, F:
Download file: Solver2.xlsx
Matrix C4:H6 contains the transport costs of one container from each warehouse to each shop.
Matrix C9:H11 will contain the number of containers from each warehouse to each shop we are looking for.
The blue cells have the formulas shown.
The Solver parameters will be:
Notice we must meet all the shops requirements: C12:H12 = C13:H13
And the resulting solution:
This is the optimal solution which meets all the constraints.
- All Value Streams are subject to multiple constraints such as:
- Market demand
- The optimal operating point very much depends on the existing constraints which may vary along time.
- Microsoft Excel Solver is useful to optimize a Value Stream which is subject to multiple constraints.
In part 1: Excel Value Stream Map we saw how a complex process could be Value-Stream-Mapped using Excel. We saw the importance of mapping feedback loops such as test-repair loops because they often become the bottleneck for the complete process. In this situation not all items flow along all branches so we need to calculate the % flowing along each branch. This enabled us to calculate the cycle for each workstation based on the staffing. Finally we were able to calculate the ideal staffing to balance the line: insure no workstation cycle is above the average inter-arrival time (takt time). This we did using Solver.
Download this VSM simulator example: Value Stream Map and Modeling
A value stream map is not complete without Work-In-Process (WIP) between process steps. The problem with WIP data is that it typically varies widely along time so a snap shot of the moment we visit the line may be very misleading. In our VSM Simulator WIP before each workstation (column Q) is calculated each hour:
New WIP = Previous WIP + Items processed by previous step – Items processed by this step
Items processed (column R) are calculated each hour based on Average time (K), Time Standard deviation (L) and Staffing (M) taking into account the constraint of enough WIP in the previous step. The simulator is assuming that process times follow a normal distribution.
Process times: Normal distribution with averages (K) and standard deviations (L)
On the other hand Inter-arrival time (Takt) follows an exponential distribution:
Inter arrival time distribution: Exponential with average K3
Lead Time in column T, for each process step, is the average time it takes for an item to be processed in that step including the waiting time due to the WIP in front of it.
Capacity utilization in column U is the average % time the workstation has been operating since the start of simulation based on its theoretical capacity.
WIP (Q) and Capacity Utilization (U) are shown on bar charts on the right:
Elapsed hours simulated are in B2. Products received for repair, products repaired and products returned without repair are shown in column A:
Total values for columns K to U are shown in row 1:
K1 represents the average Value Add Time per product
M1 is the total staffing in the line
P1 is the maximum value of all cycle times: it corresponds to the line bottleneck capacity. P1 should be ≤ K3 (Takt)
Q1 is the total WIP in the line.
Received (A3) = WIP (Q1) + Repaired (A13) + Returned without repair (A8)
T1 is the Total Lead Time (hours): The average lead time seen by the customer (not including the customer contribution)
U1 is the overall line capacity utilization.
The Value add to total time rate can be calculated:
Value Add Time / TOTAL Time = K1 / (T1 x 60)
To run the simulator:
press F9 to simulate 1 hour. Keep pressing to run several hours. The counter in B2 shows the elapsed simulator hours.
To start simulation with an empty line: press Ctrl key + r. This will reset all line WIPs to zero.
In the tab Evolución you can see how all Processes and WIP values evolve along each hour in the main steps as well as the evolution of total Lead Time:
During the first hours, when the line is empty, there is no waiting time along the process so the total lead time is basically equal to the value add time of one hour. As queues start building up as shown in the WIP chart, waiting time starts to increase in all workstations and that causes the total lead time to increase to a value of 12 hours, where it seems to stabilize. This means that if we take a snap shot of the line in hour 7 basically all lead time is value add. On the other hand if we look at the line in hour 22, when stability seems to be reached, lead time is 12 times value add time: NVA/ VA = 11. Non value add time is made up of all the waiting taking place in front of each workstation.
We can see that, although the average inter-arrival time is 5 min (12 items per hour), 3 arrivals of more than 30 items each have collapsed receiving causing an accumulation of 80 items at reception which has eventually been absorbed by the line.
This is a typical case where most of the variability comes from the customer inter arrival times as seen in the process chart.
Tracking this evolution you can discover unexpected bottlenecks caused by variability that may need to be reinforced with additional staff. We can also estimate the total lead time we can commit to customers. In this case if the process is stable that would be 12 hours minimum.
These are some of the effects we can detect in our simulation:
Accumulation of WIP in front of a workstation due to insufficient staffing
Low capacity utilization due to excess staffing
Accumulation of WIP in front of the repair station due to low first-pass-yield in test (N11)
Accumulation of WIP in reception (Q4) due to arrivals above plan
High overall lead time due to excess total WIP in the line
Highly unbalanced line as shown by large differences in capacity utilization
Proportion of products under warranty different to plan: it would require line re-balancing
A static Value Stream Map is a good start to understand how the process behaves
Feedback loops such as test-repair loops are an essential part of the process so they should be in the VSM
A snap shot of the WIP situation along the line may not be representative of the normal operation
If the WIP situation is not typical the NVA calculation will not be correct
The VSM simulator provides a deeper understanding of the process behavior and it enables what-if studies to optimize the process
Simulation helps us understand some of the failure mechanisms caused by variability so we can act on the root cause to make real improvements
The quality level seen by the customer of our process is one of the most critical process metrics. This quality level is the result of all the defects generated by the different process steps as well as those coming from the parts suppliers. In our Value Stream Map we focus on process flow but this flow is very much affected by quality. It is important to find out where in the process each type of defect is generated and where in the process it will be detected to make sure it doesn’t find its way to the customer.
In this model you can download DPUdpmo.xlsx we are simulating a simplified map of an Electronic Assembly and Test Process.
Defect contributors are defective components coming from suppliers and defects from the 4 process steps:
- SMT (Surface Mount Technology) automatic component placement
- Reflow solder of these components
- Manual PTH (Pin Thru Hole) component insertion
- Wave soldering of PTH components.
Defects are detected on:
- Several visual inspections along the process
- ICT (In-Circuit Test) at the end of the assembly process
- Final Functional test.
“Defects per unit” (DPU) is the metric that interests the customer but it’s not very practical as a process metric because we are typically producing products with very different levels of complexity: some boards may have just 5 components and others 500.
“Defects Per Million Opportunities” (DPMO) is a useful process metric that is independent of the product complexity. This metric will allow us, for instance, to know if our soldering process is improving or getting worse with independence of the complexity of the products we are soldering.
In the following example: if we have one defective component in 3 boards the average DPU = 1/3 = 0.33
When we measure in DPMO we must define the OFEs (Opportunities For Error) of the process we are dealing with:
- Component placement process: OFE = Number of components placed
- Soldering process: OFE = Number of solder joints soldered
Therefore if we say our component placement DPMO = 100 this means that out of every million components placed 100 will be missing or misplaced.
If our wave solder DPMO =200 this means that out of every million solder joints 200 will be defective.
In our Electronic Assembly example we have 500 SMT components and 100 PTH therefore these are the corresponding OFEs. Total components used por board are therefore 600.
The 500 SMT components have a total of 1500 solder joints, therefore these are the reflow soder OFEs.
The 100 PTH componntes have a total of 300 solder joints (wave solder OFEs).
Process DPMOs define the quality of our different process steps and the OFEs are specific for each product type.
On each step we can estimate the expected defects per unit:
DPU = DPMO x OFE / 1 000 000
Since the types of defects contributed by the 5 different sources are independent, the total number of defects produced will be sum of each defect contribution. Total DPU is the sum of all DPUs.
Process Yield is the proportion of defect free units. If defects are independent DPU follows a Poisson distribution so Yield and DPU are related by:
Yield = exp(-DPU)
For instance an average of 1 defect per unit will still produce exp(-1) = 0.37 that is 37% defect free units (there will be units with more than one defect)
Total DPU is therefore the sum of DPUs while the total Yield is the product of yields.
Now let’s look at defect detection:
Control points (visual inspections and tests) are defined in the Control Plan for each of the product types. In this example we have several visual inspections, one ICT and one Functional Test performed in this order.
Visual inspection is performed as close as possible to the step generating the defect and the defects detected are typically corrected on the spot. The problem is that it is very inefficient: it can only catch 60% of the defects. In this example this means that out of the total 0.32 DPUs only 0.192 will be caught and corrected, therefore 0.128 will escape.
ICT uses a bed of nails to contact many points in the board and is able to test the functionality of some components. If the test coverage is 80%, 0.102 DPU will be caught in ICT and corrected and therefore 0.026 will escape.
Finally the functional test has a high test coverage but the test is not able to pinpoint where the defect is, therefore faulty boards need to be analysed and repaired by an expert through a lengthy process.
The proportion of boards that will need to be reworked as a result of visual inspection is 17%. This will require additional resources and skills. In the case of ICT and functional test 12% of boards will require rework and this may require a higher level of skills and it will take longer to identify and correct these defects.
At the end of this process we have 0.001 DPU going to the customer which means 1280 faulty boards out of every million.
This is obviously unacceptable so we can start looking at the main contributors to defects and the main escapes of our detection and correction.
We can look at different improvement alternatives with this simulator and estimate the effect on the final ppms going to the customer.
PTH insertion, being a manual operation, is the main contributor to defects. We can look at ways to add some Poka-Yokes (defect proof devices) or look at ways to automate. Suppose we are able the reduce the PTH insertion DPMO from 1500 to 500 the result would be a reduction of output to 1120 ppm.
If we can further improve visual inspection coverage by automation to 80% them it would go down to 560 ppm.
If we redesign the board to eliminate all PTH components and convert them to SMT we would further reduce it to 384 ppm.
In this simulator Process DPMOs come from measuring our current processes. Theoretically they should be Product independent but in actual fact they are not: some designs produce more manufacturing defects than others and the process parameters for some products may not be optimized.
This simulation helps us focus on the main contributors of defects and defect escapes from our tests. In this way we can identify who has to do what in order to improve the process.
Value Stream Maps are typically designed to flow horizontally left to right. When a process is complex it becomes difficult to view the whole process: it may require several pages and this makes difficult to see the total picture.
Making the VSM vertically allows us to visualize complex processes in a single page.
Another common practice is to map the process by sticking post-its on a wall or large wrapping paper. This has the advantage of simplicity but it creates some inconveniences. It becomes impractical to save all the work done to continue working on it later on. It is also difficult to communicate to others: photos are difficult to read. If you ask other people, not present in the VSM exercise, to give their input they can not modify the work you have done.
You can draw your VSM directly using Google Spread Sheets in Google Drive and give write access to your document to all participants in the team and it will take shorter to build the VSM than if you have to write and stick the post-its. You can even have team members participating from remote locations.
After the current VSM is concluded we give write access to all process participants including those who were not present in its elaboration, in this way they can make modifications to make sure the VSM reflects what is really going on in Gemba. Google Drive keeps track of who has made what modifications.
We also give read only access to the VSM in the cloud to other people and plot an A1 size copy to stick on the department or line wall for everyone to see.
We keep backup copies of the different stages in the modifications to make sure we can go back to a previous version.
I find it very useful to indicate who does what in the VSM instead of making another separate document. In order to do this we include participant swim lanes on the left which are easily filled while building the VSM.
In the example above you can notice that the switchboard participates in the process on several occasions.
More complex VSMs with feedback loops can be done with Excel such as:
Download this VSM example: Value Stream Map and Modeling
Doing a VSM in Excel has several advantages:
- The team involved in the VSM exercise can be located in remote locations and can all share a spread sheet in the cloud with communication via Hangouts or Skype.
- To insure your VSM represents the current reality (hidden factory) rather than a theoretical process flow you need to validate your VSM with all other participants who were not in the room: line operators, suppliers, maintenance, etc. You can not start improving until you are certain of what the current reality is.
- All the work done is not lost at the end of the meeting and it can be updated remotely at a later stage by all process participants
- You can easily keep track of the different versions of the VSM as you implement your improvements and you can even go back to a previous version if something goes wrong
- You can go to Gemba with your VSM on your tablet or smart phone and directly enter process times, WIP, etc. on your VSM in the cloud.
- Excel allows the addition of the process parameters for each process step by adding as many columns as required on the right:
· Process time
· Setup time
· Cycle time
· Work-In-Process waiting
Complex processes require often feedback loops: repair failing items and test them again. These loops are an essential part of the VSM because they often become the bottleneck of the total process.
When the process doesn’t have a single flow but there are several branches you need to estimate the proportion of items that will flow along each branch and this you can do by collecting data along a period of time.
Process times have variability so you will need to estimate not only their average values but also their standard deviations.
In this Repair process the average time it takes to receive a product is 2 min and its standard deviation 0.5 min.
20% of the products are under warranty so 80% will follow the “N” branch.
Average inter arrival time of products coming from the customer for repair is 5 min. This is the takt time the process has to comply with.
Not all products go through all process steps, therefore we need to calculate % throughput for each step based on the % of products flowing along each branch.
In order to balance the line we need the right staffing on each process step to insure enough capacity to handle the product arrivals. This means that Cycle time ≤ Inter arrival time (takt).
Cycle time = Average time x % Throughput / Staffing
We can calculate the staffing (Full Time Equivalent) required for each step M4 – M13 with Solver
To do this we calculate the total staff required in cell M1 just adding all the yellow cells in column M. This cell is what we want to minimize as long as we meet the requirements that all cycle times P4:P13 are below the inter arrival time in K3.
We are adding the constraint M4:M13 ≥ 0.1. The proposed staffing by Solver in the yellow cells M4:M13 is:
Which gives a total staff of 13.6 (M1). This is a theoretical minimum which assumes all staff has all skills, which is not the case.
In cells D1 to H1 we obtain the compound for each process participant (some perform several operations) obtained from the column M.
If Receive and Coordinator can be combined we may have to round up the sum of the two from the 1.6 requirement to 2.0
Invoicing we round up from 1.8 to 2. Test from 2.9 to 3 and repair from 7.4 to 8.
This manual rounding will lead us to this situation:
Where the total staff has increased from the theoretical 13.6 to 15 after considering the different skills required in the different steps.
This increase in the staffing has an effect on lowering the cycle of the steps affected below the takt time of 5 minutes.
Looking at the % Throughput column we notice that some steps only process 4% of the products to be repaired while test and repair have to process 123%. The reason for this is that only 80% pass the test OK the first time. This means 20% have to be tested a second time. Same with burn-in, where 10% fail and have to be retested.
So far we have not taken into account variability but we will do so by adding some additional columns to our VSM converting it into a simulator in our next article: How to convert your VSM into a simulator to experience variation