Six Sigma Glossary & Definitions

Alpha Risk – Risk of concluding that two characteristics are different when they are actually the same.

Alternative Hypothesis – Statement of change or difference. The statement is assumed correct if the null hypothesis is not supported by the data.

ANOVA – Analysis of Variance – Hypothesis test used to determine whether two or more sample means are significantly different.

Attribute Data – Sometimes used in place of the term discrete data. Attribute data is qualitative data that has two outcomes. Examples include pass/fail, tastes good/tastes bad and acceptable/not acceptable. Attribute data is used specifically in reference to Attribute Control Charts, which are limited to counting and plotting defects or defectives.

Attribute Gage R&R – A Gage Repeatability & Reproducibility (Gage R&R) study used to determine the measurement variability when attribute data is used.

Backbone Test – A main supporting tool or factor used in statistical analysis.

Bartlett’s Test – Type of hypothesis test that compares variances of two or more normally distributed populations.

Baseline – The “as-is” level of performance of the process being investigated.

Baseline Control Chart – A graph of the initial process performance over time that is used to detect whether common or special cause variation exists in a process.

Beta Risk – Risk of concluding that two characteristics are the same when they are actually different.

Between Variance – Between variance is based on the distances of the sample means from the overall mean. If the samples come from populations with equal means, one would not expect an undue amount of variation in the sample means.

Bias – Bias within a measurement system describes the difference between the observed average and a reference value of the measurements. Bias is related to accuracy.

Binomial Distribution – A probability distribution for the number of successes in a fixed number of independent trials, each of which has only two possible outcomes (e.g. heads/tails, yes/no).
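
As an illustration of the binomial distribution, the sketch below computes the probability of exactly k outcomes of one kind in n independent trials. The function name and the coin-flip numbers are hypothetical and are not part of the original glossary.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent two-outcome trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical example: exactly 5 heads in 10 flips of a fair coin.
prob_5_heads = binomial_pmf(5, 10, 0.5)

# The probabilities over all possible counts (0..10) should sum to 1.
total_probability = sum(binomial_pmf(k, 10, 0.5) for k in range(11))
```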

Box Plot – A type of plot which identifies variability and centering based on quartiles. It uses a rectangular box to represent the middle 50% of the data and whiskers to show the extent of the data. A Box Plot requires data from a continuous output and a discrete (typically attribute) input at one or more levels.

C Chart – A graphical display of number (counts) of defects in a subgroup (part or unit) with a constant sample size. A C Chart can be used to detect special cause variation.

Capability Analysis – A statistical measure which helps determine how well a process meets customers’ standards.

Capability Indices – Indices describing the overall effectiveness of a process in meeting specific criteria in both the short and long term.

C&E Analysis – Cause and Effect Analysis – The Cause and Effect Analysis (C&E Analysis) is an analysis in which either causes of an effect are brainstormed (the Cause and Effect Diagram), or multiple causes, which have already been identified, are prioritized against several criteria (the Cause and Effect Matrix). C&E Analyses are typically used when limited data is available.

C&E Diagram – Cause and Effect Diagram – The Cause and Effect Diagram (C&E Diagram) is a fishbone-shaped diagram that helps a team brainstorm potential root causes of a defect. The problem to be solved is placed at the head of the fish and the six thought-generating categories – Personnel, Machines, Materials, Methods, Measurements and Environment – are placed at the ends of the fish bones.

C&E Matrix – Cause and Effect Matrix – The Cause and Effect Matrix (C&E Matrix) is a tool that allows one to prioritize many items when there are many criteria they need to be prioritized against.

Chi-Square Distribution – A sampling distribution used to determine the confidence interval for standard deviation and to perform Chi-Square Hypothesis Tests: Goodness of Fit Test and Test for Association.

Chi-Square Hypothesis Test – Hypothesis test used to test the goodness-of-fit between a sample and a hypothesized distribution, or the association of two or more variables.

Classical Yield – The ratio of the number of units that ultimately pass through the entire process to the number of units that enter into the process. See also Final Yield, which equals output divided by input.

Clustering – A type of special cause variation within a Run Chart in which the data set exhibits fewer runs around the median than expected, resulting in a “clustering” of data points in just a small range of values or a shift in the average values.

Coefficient of Determination – The proportion of the variation in the dependent variable that is explained by the independent variable(s) in a Regression Analysis. Commonly written as R-squared.

Common Cause Variation – Process variability that is free of assignable cause. It is typically associated with short-term variability or subgroup variability. Also referred to as white noise or expected variation.

Confidence Interval – A range of numbers in which population parameters are likely to fall.

Confidence Level – The fixed probability of correctly accepting the null hypothesis.

Contingency Table – A table of observed frequencies used in Chi-Square Hypothesis Tests.

Continuous Data – Continuous data, sometimes referred to as variable data, is data that is measured on a continuum or a scale that can be meaningfully divided into finer and finer increments of precision. For example, length and weight can be measured to any desired level of precision.

Continuous Gage R&R – A Gage Repeatability & Reproducibility (Gage R&R) study used to determine the measurement variability when continuous data is used.

Control Chart – A process control tool, in which data is plotted and statistically analyzed in order to discern whether the process exhibits common and/or special cause variation.

Control Limit – Control Limits are typically set +/-3 standard deviations from the centerline to determine whether or not the process is in control.

Control Plan – A single document or set of documents that provides a point of reference among the Key Process Input Variables (KPIV’s), Key Process Output Variables (KPOV’s), specifications and instructions for the completed project. It documents the actions, including schedules and responsibilities that are needed to control the KPIV’s at their optimal settings.

Controllable Input – A process input that has adjustable settings, which can be modified by the project team during a project.

Correlated – Data identified as having a causal, complementary, parallel or reciprocal relationship.

Correlation – Statistical analysis that determines whether or not one variable can be used to predict another. Correlation does not necessarily prove causation.

Correlation Coefficient – A measure of the interdependence of two random variables that range in value from -1 to +1, indicating perfect negative correlation at -1, absence of correlation at zero and perfect positive correlation at +1.
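
A minimal sketch of how a correlation coefficient can be computed, using the Pearson formula (sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations). The function name and sample data are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: one perfectly increasing and one perfectly decreasing relationship.
xs = [1, 2, 3, 4, 5]
r_positive = pearson_r(xs, [2, 4, 6, 8, 10])  # perfect positive correlation
r_negative = pearson_r(xs, [10, 8, 6, 4, 2])  # perfect negative correlation
```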

Cp – A capability index that measures the potential capability of a process to meet expected specification limits, or tolerance levels, in the short term, assuming the process is ideally centered (regardless of where the process is actually centered).

Cpk – A capability index that measures the potential capability of a current process to meet expected specification limits, or tolerance levels, in the short term, using the current process average.
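
The glossary does not give formulas for Cp and Cpk. The sketch below uses the commonly cited forms, Cp = (USL - LSL) / (6 x sigma) and Cpk = min(USL - mean, mean - LSL) / (3 x sigma). As a simplifying assumption it estimates sigma with the overall sample standard deviation, whereas a short-term study would normally use a within-subgroup estimate. The function name, data and limits are hypothetical.

```python
from statistics import mean, stdev


def cp_cpk(data, lsl, usl):
    """Return (Cp, Cpk) for measurements against lower/upper specification limits."""
    mu = mean(data)
    sigma = stdev(data)  # assumption: overall sample standard deviation
    cp = (usl - lsl) / (6 * sigma)                # ignores where the process is centered
    cpk = min(usl - mu, mu - lsl) / (3 * sigma)   # penalizes an off-center process
    return cp, cpk


# Hypothetical measurements centered at 10.0.
measurements = [9.8, 10.0, 10.2, 10.0, 10.0]
cp_centered, cpk_centered = cp_cpk(measurements, lsl=9.4, usl=10.6)
cp_shifted, cpk_shifted = cp_cpk(measurements, lsl=9.7, usl=10.9)
```

When the process mean sits midway between the limits, Cp and Cpk coincide; with the shifted limits, Cp is unchanged while Cpk drops.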

Critical Input Variable – An input variable (X) that has been statistically proven to impact the process output (Y).

CTC – Critical to Cost – Reference to a product, service and/or transactional characteristic that significantly influences a customer in terms of cost.

CTD – Critical to Delivery – Reference to a product, service and/or transactional characteristic that significantly influences a customer in terms of delivery.

CTQ – Critical to Quality – Reference to a product, service and/or transactional characteristic that significantly influences a customer in terms of quality.

CTS – Critical to Satisfaction – Expression of customers’ vital needs and can include any of the CTC, CTQ and CTD requirements.

Curvilinear Relationship – A quadratic relationship evident between two variables, that approximates a curved line on a Scatter Plot. It indicates that one variable depends on the squared value of the other.

Customer Specification Limits – Customer-defined limits for acceptable outputs. When a measured value falls outside these limits, a defect has been created.

DPM – Defects per Million – Measure of total defects per million units.

DPMO – Defects per Million Opportunities – Equals the total number of defects per unit divided by the total number of opportunities for defects per unit multiplied by 1,000,000.

DPO – Defects per Opportunity – Measure of the total defects per unit divided by the opportunities per unit.

DPU – Defects per Unit – Calculated by dividing the total number of defects by the total number of units. Also used to calculate Rolled Throughput Yield.
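
The DPU, DPO and DPMO definitions above can be sketched as a short calculation. The function name and the counts used here are hypothetical.

```python
def defect_metrics(defects, units, opportunities_per_unit):
    """Return (DPU, DPO, DPMO) from raw counts."""
    dpu = defects / units                 # defects per unit
    dpo = dpu / opportunities_per_unit    # defects per opportunity
    dpmo = dpo * 1_000_000                # defects per million opportunities
    return dpu, dpo, dpmo


# Hypothetical example: 25 defects found in 500 units, 10 opportunities per unit.
dpu, dpo, dpmo = defect_metrics(defects=25, units=500, opportunities_per_unit=10)
```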

Degrees of Freedom – Degrees of freedom (df) of a test statistic equals the number of independent observations in a sample minus the number of population parameters that are estimated from the sample observation.

DFSS – Design for Six Sigma – Systematic methodology for incorporating the Voice of the Customer (VOC) at product design time to ensure Six Sigma-level performance.

DOE – Design of Experiments – Planned experiments that allow for the simultaneous statistical analysis of several PIVs to determine their effects on any measurable POV. Used to prove correlation and causation.

Deviation – The distance between a data point and the mean. Deviation measures and describes the variation in a set of data.

Discrete Data – Numeric data that is not capable of being meaningfully subdivided into more precise increments.

Discrimination – The ability of the measurement system to adequately detect the smallest tolerable changes within the process.

Distribution – A pattern or tendency depicted by randomly collected observations from a population.

DMAIC – Define, Measure, Analyze, Improve, Control – Six Sigma methodology applied to manufacturing or production processes.

Empirical Rule – A rule derived from observations that the probabilities of empirical (real world) distributions will approximate to 68%, 95% and 99.7% of the values within one, two and three standard deviations of the mean, even though the real-world distributions may not be perfectly normal.
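
The 68%, 95% and 99.7% figures can be checked against an exactly normal distribution. This sketch uses Python's standard-library NormalDist; the variable names are illustrative only.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Fraction of a normal distribution within k standard deviations of the mean.
coverage = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, fraction in coverage.items():
    print(f"within {k} standard deviation(s): {fraction:.4f}")
```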

FMEA – Failure Mode and Effects Analysis – Failure Mode and Effects Analysis is an analytical technique that allows a project team to ensure that, to the extent possible, potential failure modes and their associated causes/mechanisms have been evaluated and addressed to mitigate any risk to the customer.

Final Yield – Final Yield (YF) is a measure of the percentage of units that passed the final process test relative to the number of units that entered the process.

F-Ratio – The result of an F-Test. It indicates the measure of between-to-within variation.

Frequency – The ratio of the number of times an event occurs in a series of trials of a random experiment to the total number of trials in that experiment.

Frequency Distribution – A display indicating how often a particular observation or data value occurs and representing the distribution of data.

F-Test – Type of hypothesis test used to determine whether or not the within variance and between variance are the same.

Gage Control Plan – Documentation describing the strategy employed to ensure the reliability and adequacy of the measurement system in measuring input variables and monitoring output variables over the long term.

Hypothesis Test – A test in which the project team assumes an initial claim, the null hypothesis, to be true and then tests this claim against an alternative hypothesis using sample data.

I-MR Chart – A type of Control Chart for variable data that plots individual data and the moving range of the present and previous individuals.
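
A rough sketch of how I-MR chart limits are often computed. The constants 2.66 and 3.267 are the conventional control chart factors for moving ranges of two consecutive points; they are not stated in this glossary and are included as an assumption. The function name and data are hypothetical.

```python
def imr_limits(values):
    """Center lines and control limits for an Individuals and Moving Range chart."""
    # Moving range: absolute difference between each point and the previous one.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = sum(values) / len(values)
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return {
        "center": center,
        "ucl": center + 2.66 * mr_bar,   # upper control limit, individuals
        "lcl": center - 2.66 * mr_bar,   # lower control limit, individuals
        "mr_bar": mr_bar,
        "mr_ucl": 3.267 * mr_bar,        # upper control limit, moving range
    }


# Hypothetical individual measurements taken in time order.
limits = imr_limits([10, 12, 11, 13, 12])
```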

Interaction Plot – A type of plot that graphs the averages of the output variable for each level of a factor, with the level of a second factor held constant for all combinations of levels. Interaction Plots readily show the presence of interactions; parallel lines in an Interaction Plot indicate no interaction, while greater departure from the parallel state indicates a higher degree of interaction.

KPIV – Key Process Input Variable – An input (X) that has been determined to have a statistically significant and causal relationship to the KPOV.

KPOV – Key Process Output Variable – Any output from a process that satisfies CTS requirements.

Linearity – The ability of the measurement system to measure over its operating range with minimal bias.

LCL – Lower Control Limit – Limit in the Control Chart that is set below the centerline, typically at -3 standard deviations, in order to determine whether or not the process is in control.

LSL – Lower Specification Limit – Limit of a tolerance range specified by a CTS. A measured value that is below this limit is considered a defect.

Mean – A measure of the central tendency of a data set. It is the sum of a set of values divided by the number of summed values. Same as average.

MSA – Measurement Systems Analysis – A series of designed tests used to assess measurement system capability.

Measurement Variability – The net effect of all the sources of measurement error that cause an observed value to deviate from the true value of the characteristic that is being measured.

Median – Center or middle number in a group of rank-ordered numbers (e.g. 5 in the set 1, 4, 5, 9, 14).

Mode – The most frequently observed value in a dataset.

Multiple Linear Regression – A linear regression with two or more predictors. The estimation of the output variable from two or more continuous input variables using a linear relationship (straight line or plane) between the output and input variables.

Noise – The Process Input Variables (PIV’s) with settings that cannot be adjusted or controlled by the project team.

Normal Distribution – A distribution of data described by the mean and standard deviation. The curve displaying the distribution of data is shaped like a bell, with the area under the curve representing 100% of all possible observations.

Normal Probability Plot – A type of plot used to assess whether sample data is normally distributed. Normally distributed data falls approximately along a straight line on the plot.

Normalized Yield – Normalized Yield (YNORM) is the equalized (same) yield assigned for each process step; it is the geometric average of the step yields, i.e. the nth root of the Rolled Throughput Yield (RTY) for a process with n steps.

NP Chart – This tool tracks the number of defectives (products, parts or units that do not conform to specified standards) in each subgroup and detects the presence of special cause variation. It can only be used when the sample size is constant.

Null Hypothesis – A statement about population parameters, typically implying “no effect” or “no difference.” This statement is assumed true until sufficient evidence is presented otherwise.

OFAT – One-Factor-at-a-Time – One-Factor-at-a-Time (OFAT) is an experimental design setup, in which each factor is varied one at a time while the remaining factors are held constant. OFAT is done in order to estimate the effect of a single variable on selected fixed conditions of other variables.

Outlier – Data point that is markedly inconsistent with the rest of the data set.

Output – The result of a process or product and its measurable characteristics. Also called Process Output Variable (POV).

P Chart – A P Chart tracks the proportion of defectives (products, parts or units that do not conform to specified standards) in each subgroup and detects the presence of special cause variation. It can be used for sample sizes that are either constant or variable.
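
A minimal sketch of P Chart limits, assuming the usual three-standard-deviation limits around the overall proportion of defectives, with the limit width depending on each subgroup's sample size. The function name and counts are hypothetical.

```python
from math import sqrt


def p_chart_limits(defectives, sample_sizes):
    """Return the overall proportion and a (LCL, UCL) pair for each subgroup."""
    p_bar = sum(defectives) / sum(sample_sizes)
    limits = []
    for n in sample_sizes:
        half_width = 3 * sqrt(p_bar * (1 - p_bar) / n)
        # A proportion cannot fall below 0 or above 1, so clamp the limits.
        limits.append((max(0.0, p_bar - half_width), min(1.0, p_bar + half_width)))
    return p_bar, limits


# Hypothetical subgroups with variable sample sizes.
p_bar, subgroup_limits = p_chart_limits(defectives=[5, 8, 7], sample_sizes=[100, 100, 200])
```

Larger subgroups get tighter limits, which is why the chart can handle variable sample sizes.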

Paired Sample – A sample in which each sampling unit contains a pair of observations that are not independent of one another (e.g. sampling ages of husband and wife as a single sampling unit).

Paired t Test – A type of hypothesis test used when analyzing the difference between the means obtained from paired samples.
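
As a sketch of the calculation behind a Paired t Test, the test statistic can be computed from the differences within each pair: the mean difference divided by its standard error. The function name and the before/after data are hypothetical, and looking up the p-value from the t-distribution is left out.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(first, second):
    """Return (t statistic, degrees of freedom) for paired observations."""
    differences = [b - a for a, b in zip(first, second)]
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1


# Hypothetical paired measurements on the same four items.
before = [10, 12, 9, 11]
after = [12, 14, 10, 13]
t_stat, df = paired_t_statistic(before, after)
```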

Pareto Chart – Type of chart that compares the frequency and/or influence of various types of problems or causes of a problem. The horizontal axis represents the different categories, and the height of the bar represents the frequency of that category. Pareto Charts prioritize possible areas for improvement and require either discrete or continuous data.

Pareto Principle – Also referred to as Pareto Effect, the Pareto Principle states that 80% of the issues are the result of 20% of the causes.

Pearson Correlation – The most commonly used method of computing a correlation coefficient between variables that are linearly related.

Population – The entire collection or set of objects or individuals from which a sample is drawn for analysis. Population can also mean a set of values or characteristics of the members of the population, e.g. heights of people in the world.

Pp – A capability index that measures the potential capability of a process to meet expected specification limits, or tolerance levels, in the long term, assuming the process is ideally centered (regardless of where the process is actually centered).

Ppk – A capability index that measures the potential capability of a current process to meet expected specification limits, or tolerance levels, in the long term, using the current process average.

PIV – Process Input Variable – A Process Input Variable (PIV) is a characteristic of materials, equipment, information or any other resource that is needed to carry out a process. In other words, PIV’s are the X’s and potential X’s in the Y = f(x) equation.

POV – Process Output Variable – A Process Output Variable (POV) is a characteristic of a product or service that is created by the process and is passed onto the next process step or the customer. In other words, the POV is the Y in the Y = f(x) equation.

Process Specification Limits – Limits that reflect the customer specifications as allocated to the inputs and outputs of a process. Process Specification Limits apply either to Process Input Variables (PIV’s) or to Process Output Variables (POV’s). When a PIV falls outside these limits, the process may not function as designed. When a POV falls outside these limits, the process produces a defect.

Proportion – The percentage of a data set that has a specific characteristic of interest.

Proportion of Sample – The fraction of samples which exhibit a characteristic of interest.

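p-Value – The probability, assuming the null hypothesis is true, of obtaining a result at least as extreme as the one observed in the sample. The null hypothesis is rejected when the p-value is smaller than the significance level.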

QFD – Quality Function Deployment – Quality Function Deployment (QFD) is a systematic methodology to integrate the Voice of the Customer (VOC) into the design and delivery of goods and services.

Quartile – The four equal parts into which a rank-ordered data set can be divided.

Range – A measure of variation within a data set. It is calculated by subtracting the minimum value from the maximum value.

Regression Analysis – A statistical technique used to investigate relationships between the output variable and one or more input variables.

Repeatability – The extent to which repeated measurements, made on the same item under absolutely identical conditions, produce the same result.

Reproducibility – The extent to which repeated measurements, made on the same item under different conditions or by different people, produce the same result.

Resolution – Otherwise known as discrimination, the ability of the measurement system to adequately detect the smallest tolerable changes within the process.

Response Variable – An output variable that depends on input factors in a designed experiment. The response is the function of input variables.

Rolled Throughput Yield – Rolled Throughput Yield (YRT) is a yield metric that measures the probability that a unit of product will make it through a series of opportunities defect-free. It is calculated by multiplying the Throughput Yields at each opportunity for a defect.
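
The multiplication described above, together with the Normalized Yield derived from it, can be sketched as follows. The function names and step yields are hypothetical.

```python
from math import prod


def rolled_throughput_yield(step_yields):
    """Multiply the Throughput Yields of every step (or opportunity)."""
    return prod(step_yields)


def normalized_yield(step_yields):
    """Equal per-step yield that would give the same Rolled Throughput Yield."""
    return rolled_throughput_yield(step_yields) ** (1 / len(step_yields))


# Hypothetical three-step process.
yields = [0.9, 0.8, 0.95]
y_rt = rolled_throughput_yield(yields)
y_norm = normalized_yield(yields)
```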

Root Cause – A specific cause, usually a Process Input Variable (PIV) that has demonstrated a direct and significant influence on the Process Output Variable (POV).

Sample – A portion of the entire collection of data.

Sample Data – Observations made on items selected from a larger population.

Sample Size – The number of observations made or number of items selected from a larger population.

Sampling – Sampling is the process of selecting samples to estimate a characteristic of the population.

Scatter Plot – A type of plot used to study the relationship between two variables. It requires either continuous or discrete data. Each data point is plotted as a dot with a specific X and Y coordinate value.

Significance Level – The significance level (also called level of significance) of a hypothesis test is the maximum allowable probability of incorrectly rejecting the null hypothesis.

Simple Linear Regression – A technique in which a straight line is fitted to a set of data points to measure the effect of a single independent variable. The slope of the line is the measured impact of that variable.
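
A minimal least-squares sketch of Simple Linear Regression. It returns the slope, the intercept and the Coefficient of Determination (R-squared). The function name and data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x; returns (slope, intercept, R^2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (unexplained variation / total variation).
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical data lying exactly on the line y = 2x.
slope, intercept, r_squared = fit_line([1, 2, 3, 4], [2, 4, 6, 8])
```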

SIPOC – Suppliers, Inputs, Process, Outputs and Customer. High-level process map produced as a result of project selection and definition.

Skewed – Data that is asymmetrical about the mean and is not normally distributed.

Special Cause Variation – An instance or event that impacts the process variation only under special circumstances in which the circumstance can be clearly identified, or an anomaly that is not part of the normal everyday variation encountered in the process. Sometimes referred to as unexpected variation.

Stability – Stability is the amount of variability in the bias over time. It indicates the extent to which a measurement remains constant and predictable over time with respect to accuracy and precision.

Standard Deviation – A measure of variability which describes the spread in a set of data. It is approximately the average deviation of a single data point from the mean of that data set.

Standard Normal Distribution – A special case of the normal distribution in which the mean equals zero and standard deviation equals one.

Target I-MR Chart – A type of Control Chart for variable data that plots individual data as a difference from a target and the moving range of the present and previous individual differences.

Target Proportion Value – A ratio, usually pertaining to a population, that is tested using a sample proportion.

Target Value – A value, usually a population parameter such as mean, standard deviation or proportion, that is tested using a sample.

T-Distribution – A symmetrical sampling distribution used to determine the confidence interval for means and to perform hypothesis tests on means.

Test of Association – A type of Chi-Square Hypothesis Test that examines the hypothesis of association (non-independence) between attribute variables. This procedure is used to test if the probabilities of items or subjects being classified for one variable depend upon the classification of the other variables.

Test of Independence – A type of Chi-Square Hypothesis Test that examines the hypothesis of association (non-independence) between attribute variables. This procedure is used to test if the probabilities of items or subjects being classified for one variable depend upon the classification of the other variables.

Test of One Standard Deviation Against a Constant – A type of hypothesis test for standard deviation that tests the standard deviation of a population (also known as target standard deviation) using the standard deviation of a sample.

Test of Several Proportions – A type of hypothesis test for proportions that examines whether or not two or more sample proportions of certain characteristics are independent of each other.

Test of Three or More Variances – A type of hypothesis test for standard deviation that examines the variance between three or more samples.

Test of Two Variances – A type of hypothesis test for standard deviation that examines the variance between two samples.

Test Statistic – A quantity calculated from a sample of data. The purpose of the test statistic is to obtain a p-value in a hypothesis test.

Throughput Yield – Throughput Yield (YTP) is a measure of process performance at an opportunity. It is a measure of the probability of that opportunity being correct with no rework.

Tolerance – The difference between the Upper Specification Limit (USL) and the Lower Specification Limit (LSL).

Traditional Yield – The ratio of number of units that ultimately pass through the entire process to the number of units that enter into the process. Equivalent to Final Yield, which equals output divided by input.

Treatment Combination – A specific combination of factor levels for each factor being tested in a Design of Experiments (DOE).

Trend – Gradual shift of data points in one direction.

Two-Sided Test – Hypothesis test in which the null hypothesis is rejected because the compared values are not equal to one another, regardless of whether one is smaller (or larger) than the other. For example, the alternative hypothesis for a Two-Sided Test will merely say p1 is not equal to p2, without any consideration for which is smaller (or larger).

Type I Error – The result of wrongly rejecting the null hypothesis.

Type II Error – The result of not rejecting the null hypothesis when it should have been rejected.

U Chart – A graphical display of number of defects per subgroup (part or unit) sampled with a constant or variable sample size so as to detect special causes.

UCL – See Upper Control Limit.

Uniform Distribution – The continuous uniform distribution has a probability density function that is constant over an interval (say from “a” to “b”) and zero outside that interval. A discrete analogue is the throw of a fair die: the probability of getting any number from 1 to 6 is the same (1/6), but the probability of getting a number less than 1 or more than 6 is zero.

Upper Control Limit – The Upper Control Limit (UCL) is a limit in the Control Chart, set above the centerline, typically at +3 standard deviations, in order to determine if the process is in or out of control.

USL – Upper Specification Limit – The Upper Specification Limit (USL) is the upper limit of a tolerance range specified by a customer requirement. A measured value that is above this limit is considered a defect.

Variable Data – Sometimes used in place of the term continuous data. Variable data is quantitative data that can be divided to record values at different levels of magnitude. Examples include time, distance and weight. Note: This definition is specific to Control Charts, as the data type is referred to as continuous data in other phases of the DMAIC methodology.

Variance – The spread of a set of data points, measured as the average of the squared differences between the data points and their mean. Also, the square of the standard deviation.
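
The definition above (average of the squared differences from the mean) corresponds to the population variance. The sketch below shows it next to the sample variance, which divides by n - 1 instead of n. The data set is hypothetical.

```python
from statistics import pstdev, pvariance, variance

# Hypothetical data set with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

population_variance = pvariance(data)    # average squared deviation (divide by n)
population_std_dev = pstdev(data)        # square root of the population variance
sample_variance = variance(data)         # divide by n - 1 when data is a sample
```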

Variation – Used in reference to how data points are plotted with respect to a target value. Variation is commonly thought of as the measure of spread represented in a data set.

Variation Component Study – A designed experiment that allows one to partition sources of variability.

Versions – Iterations of a process map.

VOC – Voice of the Customer – Voice of the Customer (VOC) encompasses comments made by customers that describe how they feel about a company’s products, services and/or transactions. These comments typically have no associated measurable standards. The company’s responsibility is to communicate with its customers to identify these measurable standards, so that requirements can be met. These measurable standards become Customer Specification Limits.

VOP – Voice of the Process – Voice of the Process (VOP) encompasses the entire range of the output (Y) of a process when all of the X’s in the Y = f(x) equation have varied their full range.

X – Process Input Variable (PIV).

Y – Process Output Variable (POV).

Y = f(x) – The mathematical relationship that identifies the inputs (X’s) that need to be controlled and the levels they need to be set at to achieve the desired output (Y) to meet customer requirements.

Yield – The ratio of the number of units that meet certain criteria to the number of units that enter into the process. The criteria determine if a yield is Traditional Yield or another kind of yield.

YF – See Final Yield.

YNORM – See Normalized Yield.

YRT – See Rolled Throughput Yield.

YTP – See Throughput Yield.

Z-Distribution – A special case of the normal distribution in which the mean equals zero and the standard deviation equals one.

Z-Score – Standardized score for measuring process performance relative to customer requirements.

Z-Table – A statistical table of probabilities associated with the Z-distribution or standard normal distribution.

Z-Transform – An equation that transfers a set of data represented by a normal distribution into a standard normal distribution (mean equals zero and standard deviation equals one) by subtracting the original mean from the data set and dividing the result by the original standard deviation.
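
The Z-Transform can be sketched in a few lines: subtract the mean from every value and divide by the standard deviation. This example assumes the data set is treated as a whole population (hence pstdev); the function name and data are hypothetical.

```python
from statistics import mean, pstdev


def z_transform(values):
    """Standardize values to mean 0 and standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    return [(v - mu) / sigma for v in values]


# Hypothetical data with mean 5 and population standard deviation 2.
z_values = z_transform([2, 4, 4, 4, 5, 5, 7, 9])
```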

2 Sigma – 308,537 DPMO

3 Sigma – 66,807 DPMO

4 Sigma – 6,210 DPMO

5 Sigma – 233 DPMO

6 Sigma – 3.4 DPMO
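
The DPMO figures above are consistent with the conventional Six Sigma assumption of a 1.5 sigma long-term shift in the process mean. The glossary does not state this, so it is treated as an assumption in the sketch below. The function name is hypothetical.

```python
from statistics import NormalDist


def sigma_level_to_dpmo(sigma_level, shift=1.5):
    """One-sided tail area beyond (sigma_level - shift), per million opportunities."""
    # Assumption: the conventional 1.5 sigma long-term shift.
    return (1 - NormalDist().cdf(sigma_level - shift)) * 1_000_000


dpmo_by_level = {level: sigma_level_to_dpmo(level) for level in (2, 3, 4, 5, 6)}

for level, dpmo in dpmo_by_level.items():
    print(f"{level} Sigma: about {dpmo:,.1f} DPMO")
```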

About the Author