C. Collecting and summarizing data

1. Types of data and measurement scales
Identify and classify continuous (variables) and discrete (attributes) data. Describe and define nominal, ordinal, interval, and ratio measurement scales. (Analyze)

2. Data collection methods
Define and apply methods for collecting data such as check sheets, coded data, etc. (Apply)

3. Techniques for assuring data accuracy and integrity
Define and apply techniques such as random sampling, stratified sampling, sample homogeneity, etc. (Apply)

4. Descriptive statistics
Define, compute, and interpret measures of dispersion and central tendency, and construct and interpret frequency distributions and cumulative frequency distributions. (Analyze)

5. Graphical methods
Depict relationships by constructing, applying and interpreting diagrams and charts such as stem-and-leaf plots, box-and-whisker plots, run charts, scatter diagrams, Pareto charts, etc. Depict distributions by constructing, applying and interpreting diagrams such as histograms, normal probability plots, etc. (Create)

Types of data

·          Attribute – discrete; the values can only be integers; counted data

·          Variable – continuous; the values can be any real number; measured data

·          Locational – simply answers the question “where.”

Measurement scales

·          Nominal – data consists of names or categories only. No ordering scheme is possible.

·          Ordinal – data is arranged in some order but differences between values cannot be determined or are meaningless.

·          Interval – data is arranged in order and differences between values are meaningful, but there is no inherent zero, so ratios are not meaningful.

·          Ratio – an extension of the interval level that includes an inherent zero starting point.
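
The difference between the interval and ratio scales can be seen with a small arithmetic sketch. Temperature in degrees Celsius is used here as the interval scale and kelvin as the ratio scale; the two readings are made-up values chosen for illustration.

```python
# Interval scale (degrees Celsius) versus ratio scale (kelvin).
# The two readings below are made-up illustrative values.
celsius_a, celsius_b = 10.0, 20.0

# Differences are meaningful on an interval scale.
difference_c = celsius_b - celsius_a

# Ratios are not: the Celsius zero point is arbitrary,
# so 20 C is not "twice as hot" as 10 C even though 20 / 10 == 2.
naive_ratio = celsius_b / celsius_a

# Kelvin has an inherent zero, so the ratio of the same two
# temperatures is meaningful on that scale.
kelvin_a = celsius_a + 273.15
kelvin_b = celsius_b + 273.15
true_ratio = kelvin_b / kelvin_a

print(f"difference: {difference_c} C")
print(f"naive Celsius ratio: {naive_ratio}")
print(f"kelvin ratio: {true_ratio:.3f}")
```

The Celsius ratio comes out as 2, while the kelvin ratio of the same two temperatures is only about 1.035, which is why ratios should only be interpreted on a ratio scale.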

Data collection methods

·          Check sheets – used to tally attribute data. Not suited for variable data.

·          Measles chart – a check sheet drawn on a picture of the item, with marks showing where defects occur (locational data).

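·          Coded data – measurements recorded in a simplified form, for example by subtracting a constant or multiplying by a factor, to make recording and calculation easier.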
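
A minimal sketch of two of the methods above. The defect names and diameters are hypothetical. The coding scheme (subtract an offset, then multiply by a scale factor) is an assumption about what "coded data" means here, and the helper names `encode` and `decode` are illustrative only.

```python
from collections import Counter

# Hypothetical inspection log: one entry per defect observed (attribute data).
observed_defects = ["scratch", "dent", "scratch", "misalignment", "scratch", "dent"]

# A check sheet is a tally of how often each category occurs.
check_sheet = Counter(observed_defects)
for defect, count in check_sheet.most_common():
    print(f"{defect:<14}{'|' * count}  ({count})")


def encode(value, offset, scale):
    """Code a measurement as an integer: subtract an offset, then scale."""
    return round((value - offset) * scale)


def decode(coded_value, offset, scale):
    """Recover the original measurement from its coded value."""
    return coded_value / scale + offset


# Hypothetical shaft diameters in inches, recorded as deviations
# from 0.500 in. in thousandths of an inch.
diameters = [0.502, 0.499, 0.503, 0.500]
coded = [encode(d, offset=0.500, scale=1000) for d in diameters]
print(coded)
```

Coding keeps the recorded numbers small and easy to write down by hand; `decode` reverses the transformation when the original units are needed.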

Techniques for assuring data accuracy and integrity

·          Random sampling – every item has an equal chance of being selected for the sample.

·          Stratified sampling – the population is divided into distinct groups (strata), and a random sample is drawn from each group.
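
Both sampling techniques can be sketched with Python's standard `random` module. The population of parts, the `shift` field, and the function names below are hypothetical and exist only for illustration.

```python
import random


def simple_random_sample(items, n, seed=None):
    """Draw n items without replacement; every item has an equal chance."""
    rng = random.Random(seed)
    return rng.sample(items, n)


def stratified_sample(items, stratum_of, n_per_stratum, seed=None):
    """Group items into strata, then draw a random sample from each stratum."""
    rng = random.Random(seed)
    strata = {}
    for item in items:
        strata.setdefault(stratum_of(item), []).append(item)
    return {
        name: rng.sample(members, min(n_per_stratum, len(members)))
        for name, members in strata.items()
    }


# Hypothetical population: 30 parts produced across three shifts.
shifts = ("day", "evening", "night")
parts = [{"id": i, "shift": shifts[i % 3]} for i in range(30)]

print(simple_random_sample(parts, 5, seed=1))
print(stratified_sample(parts, lambda part: part["shift"], 2, seed=1))
```

The stratified version guarantees that every shift is represented, which a simple random sample of the same size does not.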

Descriptive statistics

·          Dispersion

o         Range – the difference between the largest and smallest values.

o         Standard deviation – square root of the variance.

o         Variance – the average of the squared deviations from the mean (dividing by n − 1 for a sample); equal to the standard deviation squared.

o         Coefficient of Variation (COV) – the standard deviation divided by the mean.

·          Central tendency – measures of central tendency represent different ways of characterizing the central value of a collection of data.

o         Mean – sum total of all data values divided by the number of data points.

o         Mode – the most frequently occurring number in a data set.

o         Median – the middle value when data is arranged in order; with an even number of values, the average of the two middle values.

·          Probability density function

·          Frequency distributions

·          Cumulative distributions
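
The measures above can be computed with Python's standard `statistics` module. This is a sketch on a made-up data set. It uses the sample formulas (dividing by n − 1), which is an assumption, since the notes do not say whether sample or population statistics are intended.

```python
import statistics
from collections import Counter

# Made-up sample of measurements, for illustration only.
data = [4, 7, 7, 8, 9, 10, 11]

# Measures of dispersion
data_range = max(data) - min(data)
variance = statistics.variance(data)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(data)      # square root of the sample variance
mean = statistics.mean(data)
cov = std_dev / mean                  # coefficient of variation

# Measures of central tendency
median = statistics.median(data)
mode = statistics.mode(data)

# Frequency distribution: how many times each value occurs.
frequency = Counter(data)

# Cumulative frequency distribution: running total of the frequencies.
cumulative = {}
running_total = 0
for value in sorted(frequency):
    running_total += frequency[value]
    cumulative[value] = running_total

print(f"range={data_range}, variance={variance:.3f}, std dev={std_dev:.3f}, COV={cov:.3f}")
print(f"mean={mean}, median={median}, mode={mode}")
for value in sorted(frequency):
    print(f"{value:>3}  freq={frequency[value]}  cum={cumulative[value]}")
```

For population statistics, `statistics.pvariance` and `statistics.pstdev` divide by n instead of n − 1.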

Graphical methods

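·          Stem-and-leaf plots – effective for both variable and categorical data sets. Each value is split into a stem (leading digits) and a leaf (final digit), so the plot shows the shape of the distribution while preserving the individual values.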

·          Box-and-whisker plots – also known as box plots. They display a five-number summary of the data: minimum, first quartile, median, third quartile, and maximum.

·          Run charts – plot a performance measure of a process in time order over a specified period of time.

·          Scatter diagrams – correlation charts that represent the relationship between two different variables. A correlation coefficient can be calculated to measure the strength of the linear relationship (“goodness of fit”).

·          Pareto charts – show the “vital few” and the “trivial many.”

·          Histograms – frequency column graphs that display the relative frequency of continuous data values. They reveal the amount of variation in a process.

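·          Normal probability plots – the data is plotted against the values expected from a normal distribution; points that fall close to a straight line indicate that the data is approximately normal. In a normal (bell-shaped) distribution most data points lie near the average, and 99.7% of the data falls within 3 standard deviations of the mean.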
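
The numbers behind several of these charts can be computed without a plotting library. The sketch below uses only the Python standard library, and the helper names are hypothetical. Quartile conventions differ between textbooks and software; the "inclusive" method of `statistics.quantiles` is used here as one common choice.

```python
import statistics
from collections import Counter


def five_number_summary(values):
    """Minimum, first quartile, median, third quartile, maximum (box plot)."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)


def pareto_table(categories):
    """Categories in descending order of count, with cumulative percentage."""
    counts = Counter(categories).most_common()
    total = sum(count for _, count in counts)
    rows, running = [], 0
    for name, count in counts:
        running += count
        rows.append((name, count, 100 * running / total))
    return rows


def pearson_r(xs, ys):
    """Pearson correlation coefficient for the points of a scatter diagram."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Share of a normal distribution within 3 standard deviations of the mean.
standard_normal = statistics.NormalDist()
within_3_sigma = standard_normal.cdf(3) - standard_normal.cdf(-3)

# Made-up inputs, for illustration only.
print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9]))
print(pareto_table(["scratch", "scratch", "scratch", "dent"]))
print(pearson_r([1, 2, 3], [2, 4, 6]))
print(f"{within_3_sigma:.4f}")
```

In the Pareto table, the first few rows with the largest counts are the "vital few"; the cumulative percentage column shows how much of the total they account for.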