IV. Six Sigma Analyze (15 Questions)
A. Exploratory data analysis
1. Multi-vari studies
Create and interpret multi-vari studies to interpret the difference between
positional, cyclical, and temporal variation; apply sampling plans to
investigate the largest sources of variation. (Create)
2. Simple linear correlation and regression
Interpret the correlation coefficient and determine its statistical
significance (p-value); recognize the difference between correlation and
causation. Interpret the linear regression equation and determine its
statistical significance (p-value). Use regression models for estimation and
prediction. (Evaluate)
Exploratory data analysis
Multi-vari analysis - a chart used to analyze variation and to investigate
the stability or consistency of a process. It identifies where and where not
to investigate, and its principal advantage is that it breaks variation down
into components so that improvements can be targeted. It normally contains
all (or most) of the readings taken.
· Positional - variation within a piece
· Cyclical - variation from piece to piece
· Temporal - variation caused by time-related changes
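As a rough illustration of the three components above, the sketch below summarizes each one with a simple range. The readings are hypothetical, and using ranges of averages as the measure of spread is an assumption made for brevity; a real multi-vari study would plot the individual readings.

```python
from statistics import mean

# Hypothetical readings (not from the source): readings[time][piece] holds
# the measurements taken at several positions within that piece.
readings = {
    "08:00": [[10.1, 10.3, 10.2], [10.4, 10.6, 10.5]],
    "12:00": [[11.0, 11.2, 11.1], [11.3, 11.5, 11.4]],
    "16:00": [[12.0, 12.2, 12.1], [12.2, 12.4, 12.3]],
}


def spread(values):
    """Range (max minus min) of a collection of numbers."""
    values = list(values)
    return max(values) - min(values)


def multi_vari_summary(data):
    """Summarize positional, cyclical, and temporal variation with ranges."""
    # Positional: average range of the readings within each piece.
    positional = mean(
        spread(piece) for pieces in data.values() for piece in pieces
    )
    # Cyclical: average range of the piece means within each time period.
    cyclical = mean(
        spread(mean(piece) for piece in pieces) for pieces in data.values()
    )
    # Temporal: range of the overall means across time periods.
    temporal = spread(
        mean(mean(piece) for piece in pieces) for pieces in data.values()
    )
    return {"positional": positional, "cyclical": cyclical, "temporal": temporal}


summary = multi_vari_summary(readings)
largest = max(summary, key=summary.get)  # component to investigate first
print(summary, "largest:", largest)
```

The component with the largest spread is the natural place to start investigating; the others can be set aside for now.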
Simple linear correlation and regression - a method that enables you to
determine the relationship between a continuous process output (Y) and one
factor (X). The relationship is typically expressed as a mathematical
equation such as Y = B0 + B1X, where B0 is the Y intercept (the value of Y
when X = 0) and B1 is the slope of the line.
· Best fit line - plot the points on a graph and place a line through the
majority of the points, or the best fit.
· Least squares - choose as the best fit line the line that minimizes the
sum of the squares of the deviations of the observed values from the line.
· Multiple linear regression - although not included in the BOK, it is an
extension of simple linear regression to more than one independent variable.
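The least squares idea above can be sketched in a few lines. This is a minimal example using the standard closed-form formulas for the slope and intercept; the function names and the data are made up for illustration.

```python
def least_squares_fit(xs, ys):
    """Return (intercept, slope) of the least squares line y = b0 + b1*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-products of deviations.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    b1 = sxy / sxx           # slope
    b0 = y_bar - b1 * x_bar  # intercept (value of y when x = 0)
    return b0, b1


def predict(b0, b1, x):
    """Estimate y for a given x from the fitted line."""
    return b0 + b1 * x


# Illustrative data only (not from the source).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = least_squares_fit(xs, ys)
y_at_6 = predict(b0, b1, 6)
print(f"Y = {b0:.2f} + {b1:.2f}X, predicted Y at X=6: {y_at_6:.2f}")
```

Predictions are most trustworthy inside the range of X values used to fit the line; predicting at X = 6 here is a mild extrapolation.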
Correlation coefficient - quantifies the degree of linear association between
two variables. It is typically denoted by r and takes a value between -1 and
+1. A positive value implies that the line slopes upward to the right, and a
negative value indicates that it slopes downward to the right. When r = 0,
the points give no evidence of a linear correlation; when r = 1 or r = -1,
all points fall on a straight line; any other value indicates the strength of
the linear relationship, which is stronger the closer r is to -1 or +1.
· Coefficient of determination - the square of the linear correlation
coefficient (r^2). It gives the proportion of the variability in Y that is
explained by the regression model.
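A short sketch of how r and its square might be computed from paired data, using the usual Pearson formula. The function name and the sample values are illustrative assumptions.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson linear correlation coefficient of two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sample has no defined correlation")
    return sxy / sqrt(sxx * syy)


# Illustrative data only (not from the source).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(xs, ys)
r_squared = r ** 2  # coefficient of determination

print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
```

Here r is close to +1, so the points lie near an upward-sloping line, and r^2 says that nearly all of the variability in Y is accounted for by the fitted line.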
p-value - The probability value (p-value) of a statistical hypothesis test
is the probability of getting a value of the test statistic as extreme as or
more extreme than that observed by chance alone, if the null hypothesis H0 is
true. It is not the probability that H0 is true, nor the probability of
wrongly rejecting a true H0; the latter is the significance level chosen for
the test. The p-value is equal to the significance level at which we would
only just reject the null hypothesis. It is compared with the desired
significance level of the test and, if it is smaller, the result is
significant. For example, if the null hypothesis is rejected at the 5%
significance level, this is reported as "p < 0.05". The smaller the p-value,
the more convincing the evidence against the null hypothesis. It therefore
indicates the strength of the evidence for rejecting H0, rather than simply
concluding "Reject H0" or "Do not reject H0".
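To make the definition concrete, the sketch below computes a p-value for a correlation coefficient with an exact permutation test. This is a swapped-in technique chosen because it needs only the standard library; statistical packages usually report a p-value based on the t-distribution instead. The data and function names are illustrative assumptions. The null hypothesis is that X and Y are not associated, so every reordering of the Y values is equally likely.

```python
from itertools import permutations
from math import sqrt


def pearson_r(xs, ys):
    """Pearson linear correlation coefficient of two equally long samples."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def permutation_p_value(xs, ys):
    """Two-sided exact permutation p-value for the correlation coefficient.

    Counts the fraction of all reorderings of ys whose |r| is at least as
    extreme as the observed |r|. Only practical for small samples, because
    the number of reorderings grows as n factorial.
    """
    observed = abs(pearson_r(xs, ys))
    total = 0
    extreme = 0
    for shuffled in permutations(ys):
        total += 1
        # Small tolerance so floating-point ties count as "as extreme as".
        if abs(pearson_r(xs, shuffled)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Illustrative data only (not from the source).
xs = [1, 2, 3, 4, 5, 6]
ys = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1]

alpha = 0.05  # desired significance level, chosen before looking at the data
p_value = permutation_p_value(xs, ys)
significant = p_value < alpha

print(f"p-value = {p_value:.4f}, significant at {alpha}: {significant}")
```

The comparison on the last lines mirrors the decision rule in the text: the result is called significant only when the p-value is smaller than the chosen significance level.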