Analyze

The goal of this chapter is to present you with a general feeling for what Six Sigma is about: its focus, structure, and emphasis on data. Like ISO 9001:2000 and CMMI, Six Sigma is an in-depth process improvement program. The intention of this book is to give you a summary of each of these three leading standards so that you can begin to assess which one might be right for you, or—better yet—what parts of each might help you reach your quality goals. And that brings us to the A in DMAIC.

The A in DMAIC stands for Analyze: analysis of the data you have collected. This is a big subject, and this chapter cannot explore it to the depth that might be warranted. But the central idea here, and the central activity, is to analyze the data you've collected in order to determine the root causes of defects or poor performance, and then to establish an empirical basis for improving the process.

The key is to identify root causes of process variation, or instability, not just the symptoms. The symptoms are almost always pretty easy to spot, and they often appear to be easy to fix. A jammed printer is a good example. The paper is all crumpled up around the roller. So we take out the sheet, and we're ready to go again. We fixed the symptom, but chances are, the problem is still there: a dirty roller, a misaligned sheet feeder.

With Six Sigma, the story is always in the data. That's the story of how your systems are really performing. And with Six Sigma, the solution is always in the data, too.

Data analysis can be simple or it can be complex. In traditional Six Sigma projects, quite a few complex statistical and quantitative analyses can be used. We take a very brief look at some of these in the next section. But the techniques don't always have to be complex.

For example, you might run a process and then collect data on process performance values. Say you have a process to create a Configuration Management Plan for software projects. You gather the time it takes your configuration analysts to create plans for 12 projects over a period of four months. You look at the data. The value set might look like this: (3 hrs, 4 hrs, 2 hrs, 7 hrs, 3.5 hrs, 2.9 hrs, 3 hrs, 2.5 hrs, 4.5 hrs, 5 hrs, 4 hrs, 4.5 hrs). You total those values up and get 45.9 hrs. You then divide that by 12 and get about 3.8 hrs (3.825, rounded).

That's the average amount of time it takes your configuration analysts to create a Configuration Management Plan for one of your software projects.
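The arithmetic above can be reproduced in a few lines of Python. This is only a minimal sketch; the variable names are illustrative and the values are the twelve plan-creation times from the example.

```python
# Hours spent creating each of the 12 Configuration Management Plans
# (the value set from the example above).
plan_hours = [3, 4, 2, 7, 3.5, 2.9, 3, 2.5, 4.5, 5, 4, 4.5]

# Simple average: total the values, then divide by how many there are.
total_hours = sum(plan_hours)
average_hours = total_hours / len(plan_hours)

print(f"Total:   {total_hours:.1f} hrs")
print(f"Average: {average_hours:.1f} hrs per plan")
```

Run as written, this reports a total of 45.9 hrs and an average of 3.8 hrs per plan, matching the hand calculation.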

If that number seems high to you, you might look a little deeper. You might look at your plan template to see whether it calls for too much information or is somehow confusing. You might check whether the analysts have been properly trained in how to fill out the template. This data can point you to several potential improvements.

The point is that you did a simple average analysis to get this insight.

The kinds of analyses used on project data will naturally vary from organization to organization. The choice is influenced by factors such as the type of problems being investigated, the kinds of data collected, the capabilities of your team, and the kinds of solutions you are looking for.

Six Sigma typically employs formulations for such indicators as:

These are great tools and techniques, but even though Six Sigma has a deep foundation in these capabilities, you don't have to feel that you are honor-bound to adopt these for your Six Sigma projects. Use the analytical techniques—sophisticated or simple—that best help you understand your data in meaningful ways.

Figure 7-5 illustrates the analyze phase of DMAIC.

Figure 7-5. In the analyze phase of DMAIC, the objective is to draw the performance out of the data. This is where most of the well-known Six Sigma techniques come into play. Histograms, measures of central tendency, control chart derivation, process capability indexes, and process sigmas can all be used in this phase.

I'll repeat my basic premise about our look at Six Sigma here: this book is not intended to be a complete tome on the statistical and analytic techniques applicable to Six Sigma projects. The intention is to give you a pretty good feel for the structure and focus of Six Sigma. If it seems that this program may be helpful to your organization, then you can move forward to deeper investigation. In this section, however, we'll take a very brief look at a typical statistical path a team might follow when analyzing data on a Six Sigma project.

Once you have plotted your data points as a histogram you can examine the shape of the data. Figures 7-6, 7-7, and 7-8 are three sample histograms with valid shapes.

The histogram in Figure 7-6 shows math, reading, and writing scores for school children at varying school grades. Notice that as a child advances in school, scores go up. This is to be expected, and so the shape of this data is said to be valid.

Figure 7-7 shows what appears to be an opposite example.

The histogram in Figure 7-7 has the opposite shape, and so we might think that it is invalid. But this is a histogram of fatigue factors. It shows three people's abilities to lift weights across five exercise sets. The graph shows that the people lift less weight as the number of attempts increases. This is also to be expected, and so the shape of this data is said to be valid.

The shape in Figure 7-8 is also a valid shape. It is the shape of random data. It is the traditional bell curve, with a central tendency and values falling away about equally on either side.

Valid data shapes are defined as statistical realities. They are not based on simple "expected shapes." But it is safe to say that all data has a valid shape and that it is important to ascertain the shape of your data before you move into further analyses.
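To make the idea of examining a data shape a little more concrete, here is a minimal sketch that bins the twelve plan-creation times from earlier in this section and prints a rough text histogram. The one-hour bin width is an arbitrary choice made for illustration, not something Six Sigma prescribes, and a real project would normally work with far more data points.

```python
from collections import Counter
from math import floor

# The twelve plan-creation times (in hours) from the earlier example.
plan_hours = [3, 4, 2, 7, 3.5, 2.9, 3, 2.5, 4.5, 5, 4, 4.5]

BIN_WIDTH = 1  # hours per bin; an assumed value chosen for illustration

# Assign each value to the bin that starts at the nearest lower multiple
# of BIN_WIDTH, then count how many values land in each bin.
bin_counts = Counter(floor(h / BIN_WIDTH) * BIN_WIDTH for h in plan_hours)

# Print one row per bin, including empty bins, so gaps stay visible.
for start in range(min(bin_counts), max(bin_counts) + 1, BIN_WIDTH):
    count = bin_counts.get(start, 0)
    print(f"{start}-{start + BIN_WIDTH} hrs | {'#' * count}")
```

Even with this small sample, the output shows most values clustered between 2 and 5 hours, with a single plan out at 7 hours that might deserve a closer look.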

Now you might want to determine the process sigma. This is done by determining the process yield. Run your process and count the defects (or generate %NC as described earlier). The process yield is calculated by subtracting the total number of defects from the total number of opportunities, dividing by the total number of opportunities, and finally multiplying the result by 100.

Here's an example.

You run a process that produces 18 defects.

You know from the process analysis that there were 12,500 opportunities for defects—chances where defects could have crept in. So you subtract 18 from 12,500 and get 12,482.

You then divide 12,482 by 12,500 to get 0.99856. You multiply 0.99856 by 100 to get 99.856. That is your process yield, expressed as a percentage.
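The yield calculation just described can be sketched in Python as follows. The function names are hypothetical, chosen only for this illustration. The second function computes defects per million opportunities, the third column of a typical sigma conversion table, from the same two inputs.

```python
def process_yield(defects: int, opportunities: int) -> float:
    """Return process yield as a percentage of defect-free opportunities."""
    if opportunities <= 0:
        raise ValueError("opportunities must be a positive number")
    return (opportunities - defects) / opportunities * 100


def defects_per_million(defects: int, opportunities: int) -> float:
    """Return defects per million opportunities (DPMO)."""
    if opportunities <= 0:
        raise ValueError("opportunities must be a positive number")
    return defects / opportunities * 1_000_000


# The worked example: 18 defects across 12,500 opportunities.
yield_pct = process_yield(18, 12_500)
dpmo = defects_per_million(18, 12_500)

print(f"Process yield: {yield_pct:.3f}%")
print(f"DPMO:          {dpmo:.0f}")
```

For the example values this gives a yield of 99.856 percent and 1,440 defects per million opportunities.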

The final step is to use the process yield and look up the value on a sigma conversion table, such as the following:

Yield %     Sigma    Defects Per Million Opportunities
99.9997     6.00     3.4
99.9995     5.92     5
99.9992     5.81     8
99.9990     5.76     10
99.9980     5.61     20
99.9970     5.51     30
99.9960     5.44     40
99.9930     5.31     70
99.9900     5.22     100
99.9850     5.12     150
99.9770     5.00     230
99.9670     4.91     330
99.9520     4.80     480
99.9320     4.70     680
99.9040     4.60     960
99.8650     4.50     1350
99.8140     4.40     1860
99.7450     4.30     2550
99.6540     4.20     3460
99.5340     4.10     4660
99.3790     4.00     6210
99.1810     3.90     8190
98.9300     3.80     10700
98.6100     3.70     13900
98.2200     3.60     17800
97.7300     3.50     22700
97.1300     3.40     28700
96.4100     3.30     35900
95.5400     3.20     44600
94.5200     3.10     54800
93.3200     3.00     66800
91.9200     2.90     80800
90.3200     2.80     96800
88.5000     2.70     115000
86.5000     2.60     135000
84.2000     2.50     158000
81.6000     2.40     184000
78.8000     2.30     212000
75.8000     2.20     242000
72.6000     2.10     274000
69.2000     2.00     308000
65.6000     1.90     344000
61.8000     1.80     382000
58.0000     1.70     420000
54.0000     1.60     460000
50.0000     1.50     500000
46.0000     1.40     540000
43.0000     1.32     570000
39.0000     1.22     610000
35.0000     1.11     650000
31.0000     1.00     690000
28.0000     0.92     720000
25.0000     0.83     750000
22.0000     0.73     780000
19.0000     0.62     810000
16.0000     0.51     840000
14.0000     0.42     860000
12.0000     0.33     880000
10.0000     0.22     900000
8.0000      0.09     920000

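Based on the table, your process is operating at roughly 4.5 sigma: a yield of 99.856 falls just below the 99.8650 row (4.50 sigma) and well above the 99.8140 row (4.40 sigma). Congratulations.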
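If you would rather compute the sigma value than look it up, the sketch below shows one way to do it. It assumes the table follows the common convention of taking the standard normal quantile of the yield and adding a 1.5-sigma shift. That convention is an assumption here, although the table's pairing of 99.9997 percent with 6.00 sigma is consistent with it. The function name and the SIGMA_SHIFT constant are illustrative.

```python
from statistics import NormalDist

# Assumed convention: conversion tables of this kind usually add a
# 1.5-sigma shift to the short-term normal quantile.
SIGMA_SHIFT = 1.5


def process_sigma(yield_percent: float) -> float:
    """Approximate the process sigma for a yield given as a percentage."""
    if not 0 < yield_percent < 100:
        raise ValueError("yield_percent must be strictly between 0 and 100")
    # Quantile of the standard normal distribution for this yield.
    z = NormalDist().inv_cdf(yield_percent / 100)
    return z + SIGMA_SHIFT


sigma = process_sigma(99.856)
print(f"Process sigma: {sigma:.2f}")
```

For the example yield of 99.856 this comes out to about 4.48, in line with the 4.5 read from the table. A lookup table and a formula like this will differ slightly because the table lists only selected yields.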

The analysis phase can employ a good number of statistical tools and analytical techniques. Some of the common ones include: