Suresh .V. Menon

Principal Consultant

Digital Stream Consulting

Quantum Dynamics Of Six Sigma

Six Sigma is denoted by the Greek letter σ (sigma), which stands for standard deviation. The father of Six Sigma is Bill Smith, who coined the term Six Sigma and implemented it at Motorola in the 1980s.

A company operating at the six sigma level produces only 3.4 defects per million opportunities. For example, an airline operating at the six sigma level loses only 3.4 bags per million passengers handled.

Note: Statistical software tools used in Six Sigma include Minitab 17 and SigmaXL; Quality Companion is a Six Sigma project management software.

Below is shown the Six Sigma table, which explains the meaning of the various sigma levels.

Sigma Level | Defect Rate | Yield Percentage
2 σ | 308,770 dpmo (defects per million opportunities) | 69.10000 %
3 σ | 66,811 dpmo | 93.33000 %
4 σ | 6,210 dpmo | 99.38000 %
5 σ | 233 dpmo | 99.97700 %
6 σ | 3.4 dpmo | 99.99966 %

Six Sigma is implemented in five phases: Define, Measure, Analyze, Improve, and Control. We will discuss each phase in brief, along with the various methods used in Six Sigma.

Six Sigma identifies several key roles for its successful implementation.

  • Executive Leadership includes the CEO and other members of top management. They are responsible for setting up a vision for Six Sigma implementation. They also empower the other role holders with the freedom and resources to explore new ideas for breakthrough improvements by transcending departmental barriers and overcoming inherent resistance to change.
  • Champions take responsibility for Six Sigma implementation across the organization in an integrated manner. The Executive Leadership draws them from upper management. Champions also act as mentors to Black Belts.
  • Master Black Belts, identified by Champions, act as in-house coaches on Six Sigma. They devote 100% of their time to Six Sigma. They assist Champions and guide Black Belts and Green Belts. Apart from statistical tasks, they spend their time on ensuring consistent application of Six Sigma across various functions and departments.
  • Black Belts operate under Master Black Belts to apply the Six Sigma methodology to specific projects. They devote 100% of their time to Six Sigma, primarily focusing on Six Sigma project execution and leadership of special tasks, whereas Champions and Master Black Belts focus on identifying projects and functions for Six Sigma.
  • Green Belts are the employees who take up Six Sigma implementation along with their other job responsibilities, operating under the guidance of Black Belts.
  • Mikel Harry coined the terms Green Belt and Black Belt as part of the pillars of Six Sigma.

Define

The objectives within the Define phase, the first phase in the DMAIC framework of Six Sigma, are:

Define the Project Charter

  • Define scope, objectives, and schedule
  • Define the process (top-level) and its stakeholders
  • Select team members
  • Obtain Authorization from Sponsor

Assemble and train the team.             

Project Charter: the charter documents the why, how, who, and when of a project, and includes the following elements:

  • Problem Statement
  • Project objective or purpose, including the business need addressed
  • Scope
  • Deliverables
  • Sponsor and stakeholder groups
  • Team members
  • Project schedule (using GANTT or PERT as an attachment)
  • Other resources required

Work Breakdown Structure

This is a process for defining the final and intermediate products of a project and their relationships. Defining project tasks is typically complex; it is accomplished by a series of decompositions followed by a series of aggregations. It is also called a top-down approach and can be used in the Define phase of the Six Sigma framework.

Now we will get into the formulas of Six Sigma, which are shown in the table below.

Central tendency is defined as the tendency of the values of a random variable to cluster around the mean, mode, or median.

Central Tendency | Population | Sample
Mean/Average | µ = ∑Xi / N | X̄ = ∑Xi / n
Median | Middle value of the sorted data | Middle value of the sorted data
Mode | Most frequently occurring value in the data | Most frequently occurring value in the data

Dispersion | Population | Sample
Variance | σ² = ∑(Xi − µ)² / N | s² = ∑(Xi − X̄)² / (n − 1)
Standard Deviation | σ = √(∑(Xi − µ)² / N) | s = √(∑(Xi − X̄)² / (n − 1))
Range | Max − Min | Max − Min

The mean is the average: for example, if you have taken 10 sample pistons randomly from the factory and measured their diameters, the average is the sum of the diameters of the 10 pistons divided by 10, where 10 is the number of observations. Summation in statistics is denoted by ∑. In the table above, X and Xi are the measured piston diameters, and µ and X̄ are the averages.

The mode is the most frequently observed measurement of piston diameter: if 2 of the 10 sampled pistons have a diameter of 6.3, then 6.3 is the mode of the sample. The median is the midpoint of the diameter observations when they are arranged in sorted order.

From the piston example we find that the mean, median, and mode do not by themselves depict the variation in the diameter of the pistons manufactured by the factory, but the standard deviation helps us quantify how much the manufactured diameters vary relative to the customer's upper and lower specification limits.
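These measures can be computed directly with Python's standard statistics module; the diameter values below are made-up sample data for illustration, not measurements from the text:

```python
import statistics

# Hypothetical diameters of 10 pistons sampled from the factory
diameters = [6.3, 6.3, 6.2, 6.4, 6.5, 6.1, 6.3, 6.2, 6.4, 6.3]

mean = statistics.mean(diameters)      # sum of the 10 diameters / 10
median = statistics.median(diameters)  # midpoint of the sorted values
mode = statistics.mode(diameters)      # most frequently observed value
stdev = statistics.stdev(diameters)    # sample standard deviation (n - 1 divisor)

print(mean, median, mode, round(stdev, 4))
```

Note that statistics.stdev uses the sample (n − 1) divisor from the table above; statistics.pstdev would give the population version.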

The most important equation of Six Sigma is Y = f(x), where Y is the effect and the x's are the causes; if you remove the causes, you remove the effect (the defect). For example, a headache is the effect and the causes are stress, eye strain, and fever; if you remove these causes, the headache is automatically removed. In Six Sigma this is implemented using the fishbone (Ishikawa) diagram, invented by Dr. Kaoru Ishikawa.

Measure Phase: In the Measure phase we collect all the data related to the voice of the customer and analyze it using the statistical formulas given in the table above. Capability analysis is done in the Measure phase. Process capability is calculated using the formula Cp = (USL − LSL) / (6 × standard deviation), where Cp is the process capability index, USL is the Upper Specification Limit, and LSL is the Lower Specification Limit.

The process capability measure indicates one of the following:

  1. Process is fully capable
  2. Process could fail at any time
  3. Process is not capable.

When the process spread is well within the customer specification, the process is considered fully capable; this means the Cp is more than 2. In this case the process standard deviation is so small that 6 times the standard deviation, with reference to the mean, is within the customer specification.

Example: The specified limits for the diameter of car tires are 15.6 for the upper limit and 15 for the lower limit, with a process mean of 15.3 and a standard deviation of 0.09. Find Cp and Cr. What can we say about the process capability?

Cp = (USL − LSL) / (6 × standard deviation) = (15.6 − 15) / (6 × 0.09) = 0.6 / 0.54 = 1.111

Cp= 1.111

Cr = 1/ 1.111 = 0.9

Since Cp is greater than 1 (and therefore Cr is less than 1), we can conclude that the process is potentially capable.
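The Cp and Cr arithmetic above is easy to script; a minimal sketch (the function name is ours for illustration, not a standard API):

```python
def process_capability(usl, lsl, sigma):
    """Cp = (USL - LSL) / (6 * sigma); Cr is its reciprocal."""
    cp = (usl - lsl) / (6 * sigma)
    return cp, 1 / cp

# Tire example from the text: USL = 15.6, LSL = 15, sigma = 0.09
cp, cr = process_capability(15.6, 15.0, 0.09)
print(round(cp, 3), round(cr, 3))  # Cp ≈ 1.111, Cr ≈ 0.9
```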

Analyze Phase:

In this phase we analyze all the data collected in the Measure phase and find the causes of variation. The Analyze phase uses various tests: parametric tests, where the mean and standard deviation of the sample are known, and nonparametric tests, where the data is categorical, for example rated as Excellent, Good, Bad, etc.

Parametric Hypothesis Tests: A hypothesis is a value judgment made about a circumstance, a statement made about a population. Based on experience, an engineer can, for instance, assume that the amount of carbon monoxide emitted by a certain engine is twice the maximum allowed legally. However, his assertion can only be ascertained by conducting a test to compare the carbon monoxide generated by the engine with the legal requirement.

If the data used to make the comparison are parametric data, that is, data from which the mean and the standard deviation can be derived, and the populations from which the data are taken are normally distributed with equal variances, then a standard-error-based hypothesis test such as the t-test can be used to test the validity of the hypothesis made about the population. There are at least 3 steps to follow when conducting a hypothesis test.

1. Null Hypothesis: The first step consists of stating the null hypothesis, which is the hypothesis being tested. In the case of the engineer making a statement about the level of carbon monoxide generated by the engine, the null hypothesis is:

H0: the level of carbon monoxide generated by the engine is twice as great as the legally required amount. The Null hypothesis is denoted by H0

2. Alternate Hypothesis: The alternate (or alternative) hypothesis is the opposite of the null hypothesis. It is assumed valid when the null hypothesis is rejected after testing. In the case of the engineer testing the carbon monoxide, the alternative hypothesis would be:

H1: The level of carbon monoxide generated by the engine is not twice as great as the legally required amount.

3. Testing the hypothesis: The objective of the test is to generate a sample test statistic that can be used to reject, or fail to reject, the null hypothesis. The test statistic is derived from the Z formula if the sample size is greater than 30:

Z = (X̄ − µ) / (σ / √n)

If the sample size is less than 30, then the t-test is used:

t = (X̄ − µ) / (s / √n), where X̄ is the sample mean, µ is the population mean, and s is the sample standard deviation.

1-Sample t-Test (Mean vs. Target): this test is used to compare the mean of a process with a target value, such as an ideal goal, to determine whether they differ. It is often used to determine whether a process is off center.

1-Sample Standard Deviation: this test is used to compare the standard deviation of the process with a target value, such as a benchmark, to determine whether they differ. It is often used to evaluate how consistent a process is.

2-Sample t (Comparing 2 Means): two sets of different items are measured, each under a different condition; the measurements of one sample are independent of the measurements of the other sample.

Paired t: the same set of items is measured under 2 different conditions; therefore the 2 measurements of the same item are dependent, or related, to each other.

2-Sample Standard Deviation: this test is used when comparing 2 standard deviations.

Standard Deviations test: this test is used when comparing more than 2 standard deviations.

Nonparametric hypothesis tests are conducted when data is categorical, that is, when the mean and standard deviation are not known. Examples are the Chi-Square test, Mann-Whitney U test, Kruskal-Wallis test, and Mood's Median test.

Chi-Square Test

Chi-Square goodness-of-fit test:

Fouta Electronics and Touba Inc. are computer manufacturers that use the same third-party call centre to handle their customer services. Touba Inc. conducted a survey to evaluate how satisfied its customers were with the service they receive from the call centre; the results of the survey are given in the table below.

Table A

Categories | Rating %
Excellent | 10
Very good | 45
Good | 15
Fair | 5
Poor | 10
Very poor | 15

After seeing the results of the survey, Fouta Electronics decided to find out whether they applied to its own customers, so it interviewed 80 randomly selected customers and obtained the results shown in the table below.

Categories | Rating (absolute value)
Excellent | 8
Very good | 37
Good | 11
Fair | 7
Poor | 9
Very poor | 8

To analyze the results, the quality engineer at Fouta Electronics conducts a hypothesis test. However, in this case, because he is faced with categorical data, he cannot use a t-test, since a t-test relies on the mean and standard deviation, which we cannot obtain from either table: we cannot deduce a mean satisfaction or a standard deviation of satisfaction. Therefore another type of test is needed to conduct the hypothesis testing. The test that applies to this situation is the Chi-square test, which is a nonparametric test.

Step 1: State the hypothesis

The null hypothesis will be

H0: The results of the Touba Inc. survey are the same as the results of the Fouta Electronics survey.

And the alternate hypothesis will be

H1: The results of the Touba Inc. survey are not the same as the results of the Fouta Electronics survey.

Step 2: Test statistic to be used: the test statistic for this hypothesis test is the calculated

χ² = ∑ (fo − fe)² / fe, where fe represents the expected frequencies and fo represents the observed frequencies.

Step 3: Calculating the χ² test statistic:

We will use the percentages from Table A above; since a sample of 80 customers was surveyed, the expected frequencies are as shown in the table below.

Table A

Categories | Rating % | Expected frequencies fe
Excellent | 10 | 0.10 × 80 = 8
Very good | 45 | 0.45 × 80 = 36
Good | 15 | 0.15 × 80 = 12
Fair | 5 | 0.05 × 80 = 4
Poor | 10 | 0.10 × 80 = 8
Very poor | 15 | 0.15 × 80 = 12
Total | 100 | 80

We can summarize the observed frequencies and the expected frequencies in the table given below:

Categories | Observed frequencies fo | Expected frequencies fe | (fo − fe)² / fe
Excellent | 8 | 8 | 0
Very good | 37 | 36 | 0.028
Good | 11 | 12 | 0.083
Fair | 7 | 4 | 2.25
Poor | 9 | 8 | 0.125
Very poor | 8 | 12 | 1.33
Total | 80 | 80 | 3.816

Therefore we conclude that χ² = ∑ (fo − fe)² / fe = 3.816.

Now that we have the calculated χ², we can find the critical χ² from the table. The critical χ² is based on the degrees of freedom and the confidence level. Since the number of categories is 6, the degrees of freedom df = 6 − 1 = 5. If the confidence level is set at 95%, α = 0.05; therefore the critical χ² is 11.070, which we get from the Chi-Square table.

Since the critical χ²(0.05, 5) = 11.070 is much greater than the calculated χ² = 3.816, we fail to reject the null hypothesis, and we conclude that the surveys done by Touba Inc. and Fouta Electronics gave statistically similar results.
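The χ² computation can be checked with a short script in pure Python. (The unrounded statistic comes out near 3.819; the 3.816 in the text reflects rounding of the per-cell terms.)

```python
# Observed counts from the Fouta survey; expected counts from Touba's percentages
observed = [8, 37, 11, 7, 9, 8]
rating_pct = [10, 45, 15, 5, 10, 15]
expected = [p / 100 * 80 for p in rating_pct]  # 8, 36, 12, 4, 8, 12

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_sq, 3))  # ≈ 3.819
print(chi_sq < 11.070)   # compare with the critical chi-square at df = 5, alpha = 0.05
```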

In Minitab, go to Assistant, click on Hypothesis Tests, and select Chi-Square Goodness-of-Fit Test under More Than Two Samples; take the sample data "umbrella" from Help.

Chi-Square using tools: in SigmaXL, select the Chi-Square Goodness-of-Fit test and enter the values shown in the table above; you will get the same results, as the Excel sheet is macro-enabled.

Mood’s Median Test

This test is used to determine whether there is sufficient evidence to conclude that samples are drawn from populations with different medians.

H0: The populations from which the samples are drawn have the same median.

Ha: At least one of the populations has a different median.

The procedure is best illustrated with an example.

Example: Three machines produce plastic parts with a critical location dimension. Samples are collected from each machine, with each sample size at least 10, and the dimension is measured on each part. Does Mood's median test permit rejection, at the 90% significance level, of the hypothesis that the median dimensions of the three machines are the same?

The Data:

Machine #1: 6.48, 6.42, 6.47, 6.49, 6.47, 6.48, 6.46, 6.45, 6.46 (N = 10)

Machine #2: 6.42, 6.46, 6.41, 6.47, 6.45, 6.46, 6.42, 6.46, 6.46, 6.48, 6.47, 6.43, 6.48, 6.47 (N = 14)

Machine #3: 6.46, 6.48, 6.45, 6.41, 6.47, 6.44, 6.47, 6.42, 6.43, 6.47, 6.41 (N = 11)

Procedure:

  1. Find the overall median of all 35 readings; the overall median = 6.46.
  2. Construct a table with one column showing the number of readings above the overall median and another showing the number of readings below the overall median for each machine. Half of the readings that are equal to the median should be counted in the "above" column and half in the "below" column.

Observed Values

Machine | Number above the overall median | Number below the overall median
Machine #1 | 7 | 3
Machine #2 | 7 | 7
Machine #3 | 4.5 | 6.5

If the three populations have the same median, each should have half its readings in the "above" column and half in the "below" column. Therefore the expected value for the number of readings would be half the sample size in each case, so the expected values would be as shown in the table.

Expected Values

Machine | Number above overall median | Number below overall median
Machine #1 | 5 | 5
Machine #2 | 7 | 7
Machine #3 | 5.5 | 5.5

The next step is to calculate (observed − expected)² / expected for each cell in the table.

The sum of these values over all the cells is 1.96; this is the χ² test statistic, with degrees of freedom k − 2, where k is the number of machines. Since the critical value χ²(1, 0.95) = 3.84 is greater than 1.96, the data do not justify rejecting the null hypothesis.

(Observed − Expected)² / Expected

Machine | Number above overall median | Number below overall median
Machine #1 | 2²/5 = 0.8 | 2²/5 = 0.8
Machine #2 | 0²/7 = 0 | 0²/7 = 0
Machine #3 | 1²/5.5 = 0.18 | 1²/5.5 = 0.18

To find the p-value for 1.96 using MS Excel, choose the statistical function CHIDIST and enter 1.96 for χ² and 1 for the degrees of freedom. You get the value 0.161513.
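The same p-value can be reproduced in pure Python: for 1 degree of freedom, the chi-square survival function reduces to erfc(√(x/2)), which is available in the standard math module.

```python
import math

def chi2_sf_df1(x):
    """P(chi-square with 1 df > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2))

p_value = chi2_sf_df1(1.96)
print(round(p_value, 4))  # ≈ 0.1615, matching Excel's CHIDIST(1.96, 1)
```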

Mann-Whitney Test (Non-Parametric Test)

The Mann-Whitney test is best explained through an example.

Example: An operations manager wants to compare the number of inventory discrepancies found in two operating shifts. The inventory discrepancies are not normally distributed. The manager takes a sample of discrepancies found over 7 days for the first shift and 5 days for the second shift, and tabulates the data as shown below.

First Shift | Second Shift
15 | 17
24 | 23
19 | 10
9 | 11
12 | 18
13 |
16 |

We can make several observations from this table. First, the sample sizes are small and we only have 2 samples, so the first thing that comes to mind would be the t-test; however, the t-test assumes that the populations from which the samples are taken are normally distributed, which is not the case here. Therefore the t-test cannot be used; instead, the Mann-Whitney U test should be used. The Mann-Whitney U test assumes that the samples are independent.

Step 1: Define the hypotheses: just as in the case of the t-test, the Mann-Whitney U test is a hypothesis test; the null and alternate hypotheses are:

H0: The number of Discrepancies in the first shift is the same as the one in the second shift

H1: The number of Discrepancies in the first shift is different from the ones in the second shift

The result of the test will lead to the rejection of the null hypothesis or a failure to reject the null hypothesis.

Step 2: Analyze the data: the first step in the analysis consists of naming the groups; in our case, they are already named First Shift and Second Shift. The next step consists of combining the two columns into one and sorting the observations in ascending order, ranked from 1 to n. Each observation is paired with the name of the group to which it originally belonged.

The table below illustrates the fact:

Observation | Group | Rank
9 | First Shift | 1
10 | Second Shift | 2
11 | Second Shift | 3
12 | First Shift | 4
13 | First Shift | 5
15 | First Shift | 6
16 | First Shift | 7
17 | Second Shift | 8
18 | Second Shift | 9
19 | First Shift | 10
23 | Second Shift | 11
24 | First Shift | 12

We will call v1 the sum of the ranks of the observations of the first group (First Shift) and v2 the sum of the ranks of the observations of the group Second Shift:

                        V1 = 1 + 4 + 5 + 6 + 7 + 10 + 12 = 45

                        V2 = 2 + 3 + 8 + 9 + 11 = 33

Step 3: Determine the values of the U statistic using the formulas below:

                                    U1 = n1·n2 + n1(n1 + 1)/2 − v1

                                    U2 = n1·n2 + n2(n2 + 1)/2 − v2

If either or both of the sample sizes are greater than 10, then U is approximately normally distributed and we could use the Z transformation, but in our case both sample sizes are less than 10, therefore:

U1 = 7 × 5 + 7(7 + 1)/2 − 45 = 35 + 28 − 45 = 18
U2 = 7 × 5 + 5(5 + 1)/2 − 33 = 35 + 15 − 33 = 17

Since the calculated test statistic should be the smaller of the two, we use U = 17, with n1 = 7 and n2 = 5. From the Mann-Whitney table we find that the critical value is 5; since 17 is greater than 5, we fail to reject the null hypothesis and conclude that the number of discrepancies in the first shift is the same as in the second shift.
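The rank sums and U statistics can be verified with a short script (there are no tied values in this data set, so plain integer ranking suffices):

```python
# Discrepancy counts from the example
first_shift = [15, 24, 19, 9, 12, 13, 16]
second_shift = [17, 23, 10, 11, 18]

# Pool the observations, tagging each with its group, and sort ascending
pooled = sorted([(x, 1) for x in first_shift] + [(x, 2) for x in second_shift])
v1 = sum(rank for rank, (_, grp) in enumerate(pooled, start=1) if grp == 1)
v2 = sum(rank for rank, (_, grp) in enumerate(pooled, start=1) if grp == 2)

n1, n2 = len(first_shift), len(second_shift)
u1 = n1 * n2 + n1 * (n1 + 1) // 2 - v1
u2 = n1 * n2 + n2 * (n2 + 1) // 2 - v2
print(v1, v2, u1, u2)  # 45 33 18 17
```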

ANOVA

If, for instance, 3 sample means A, B, and C are being compared, using the t-test is cumbersome; analysis of variance (ANOVA) can be used instead of multiple t-tests.

ANOVA is a Hypothesis test used when more than 2 means are being compared.

If K Samples are being tested the null hypothesis will be in the form given below

H0: µ1 = µ2 = … = µk

And the alternate hypothesis will be

H1: At least one sample mean is different from the others

If the data you are analyzing are not normal, you have to make them normal using the Box-Cox transformation before applying ANOVA; the Box-Cox transformation can be done using the statistical software Minitab.
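As a sketch of what ANOVA computes, the F statistic for k samples is the between-group mean square divided by the within-group mean square; the three small samples below are made up purely for illustration:

```python
def f_statistic(groups):
    """One-way ANOVA F: between-group mean square / within-group mean square."""
    k = len(groups)                     # number of samples being compared
    n = sum(len(g) for g in groups)     # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Three hypothetical samples A, B, C
f = f_statistic([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f)  # 3.0
```

The resulting F would then be compared with the critical F value for (k − 1, n − k) degrees of freedom to reject or fail to reject H0.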

Improve Phase: In the Improve phase we focus on optimization of the process. After the causes are found in the Analyze phase, we use design of experiments to remove the junk factors that do not contribute to the smooth working of the process; that is, in the equation Y = f(X) we select only the X's that contribute to the optimal working of the process.

Let us consider the example of an experimenter who is trying to optimize the production of organic foods. After screening to determine the factors that are significant for his experiment he narrows the main factors that affect the production of fruits to “light” and “water”. He wants to optimize the time that it takes to produce the fruits. He defines optimum as the minimum time necessary to yield comestible fruits.

To conduct his experiment he runs several tests combining the two factors (water and light) at different levels. To minimize the cost of experiments he decides to use only 2 levels of the factors: high and low. In this case we will have two factors and two levels therefore the number of runs will be 2^2=4. After conducting observations he obtains the results tabulated in the table below.

Factors | Response
Water high, Light high | 10 days
Water high, Light low | 20 days
Water low, Light high | 15 days
Water low, Light low | 25 days
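From this 2² design, each factor's main effect can be estimated as the average response at its high level minus the average response at its low level:

```python
# Responses (days to comestible fruit) from the 2^2 experiment, keyed (water, light)
response = {
    ("high", "high"): 10,
    ("high", "low"): 20,
    ("low", "high"): 15,
    ("low", "low"): 25,
}

# Main effect = average response at the high level - average at the low level
water_effect = (response[("high", "high")] + response[("high", "low")]) / 2 \
             - (response[("low", "high")] + response[("low", "low")]) / 2
light_effect = (response[("high", "high")] + response[("low", "high")]) / 2 \
             - (response[("high", "low")] + response[("low", "low")]) / 2
print(water_effect, light_effect)  # -5.0 -10.0
```

Both effects are negative: raising water saves about 5 days and raising light about 10, which matches the table, where the water-high, light-high run gives the minimum of 10 days.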

Control Phase: In the Control phase we document all the activities done in the previous phases and, using control charts, monitor and control the process to check that it does not go out of control. Control charts are tools, available in the Minitab software, used to keep a check on variation. All the documentation is archived in a safe place for future reference.

Conclusion: From this paper we come to understand that selection of a Six Sigma project is critical, because we have to know the long-term gains of executing these projects and the activities done in each phase. The basic building block is the Define phase, where the problem statement is captured. In the Measure phase, data is collected systematically against this problem statement; the data is then analyzed in the Analyze phase by performing various hypothesis tests, and the process is optimized in the Improve phase by removing the junk factors, that is, in the equation y = f(x1, x2, x3, …) we remove the causes x1, x2, etc. by the method of design of experiments and factorial methods. Finally, we can sustain and maintain our process at the optimum by using control charts in the Control phase.