Essential Data Collection for Quality Improvement

According to Wikipedia, data collection is the process of gathering and measuring targeted variables in an established, systematic way, which then enables one to draw conclusions from the data. Data collection is a core component of research in all fields of study. While data collection methods vary by discipline, the emphasis on ensuring accurate and honest data remains the same.

The same applies to quality engineering, where data-based decision making is central, especially in quality improvement. The data collection process is also the most difficult and challenging part of any quality improvement study.
In a manufacturing process there are two types of numeric data:
  • Attribute data
  • Variable data

Attribute data
  • Cannot be divided or carry a decimal point; always a whole number
  • Needs a clear operational definition of go / no-go, reject / accept, pass / fail
  • Collected by counting good and defective parts after inspection, such as visual inspection or a go/no-go jig
  • Examples: yield rate, defect rate, count of rejects

Variable data
  • Generated from a calibrated measuring device and can be divided infinitely, e.g. distance, length, weight
  • Collected by measuring with a measurement device
  • Examples: dimensions (length, width, thickness), strength, weight
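
To make the difference concrete, here is a minimal Python sketch. The inspection results and length readings are made-up numbers for illustration only, not data from a real process. It shows how attribute data is reduced to counts and rates, while variable data is summarized by its mean and spread.

```python
from statistics import mean, stdev

# Attribute data: hypothetical pass/fail results from a go/no-go inspection
inspection_results = ["pass", "pass", "fail", "pass", "pass",
                      "pass", "fail", "pass", "pass", "pass"]

reject_count = inspection_results.count("fail")        # count of rejects
defect_rate = reject_count / len(inspection_results)   # fraction rejected
yield_rate = 1 - defect_rate                           # fraction accepted

# Variable data: hypothetical lengths (mm) read from a calibrated device
lengths_mm = [10.02, 9.98, 10.05, 9.97, 10.01, 10.03]

mean_length = mean(lengths_mm)   # centre of the measurements
spread = stdev(lengths_mm)       # sample standard deviation

print(f"Rejects: {reject_count}, defect rate: {defect_rate:.0%}, yield: {yield_rate:.0%}")
print(f"Mean length: {mean_length:.3f} mm, standard deviation: {spread:.3f} mm")
```

Note that the attribute data only tells us how many parts failed, while the variable data also tells us how far each part is from the target.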

In order to collect useful and meaningful data we must use a structured, step-by-step process:

  1. Identify what data needs to be collected to represent the quality characteristic of the product or subject you would like to study.
  2. Identify the source of the data: either collect it from the current production lot or use historical data.
  3. Describe and quantify the dimension, quantity, capacity, performance or other characteristic of the population or process, e.g. length, weight.
  4. Decide how the data will be quantified: how much (e.g. dimensions), how many (e.g. counts), how long (time), how good (performance), or a characteristic (e.g. color, location).
  5. Understand the assessment system. Conduct a GR&R study (refer to my previous article, "Importance of Performing Gage Repeatability and Reproducibility (GR&R) before actual quality data collection", part 1 and part 2).
  6. If the GR&R does NOT meet the spec, historical data cannot be used, and we must collect a new set of data after improving the GR&R of the assessment system. Data that comes from a poor measurement system does not reflect the true product quality and cannot be used!
  7. Determine the appropriate sample size if we are unable to collect data for every single part in production. The sample must represent the population, because we will make decisions about the population based on analysis of the sample data.
  8. The data collected needs to be turned into useful information that can be used to make decisions. Decide which tools to use to analyze the data. We can use a conventional Excel spreadsheet or statistical software such as SAS JMP, Minitab or StatSoft Statistica.
  9. Usually we record the data in Excel and analyze it in statistical software later on. If this is the case, we must set up the Excel table columns in the format the statistical software expects, so we can easily cut and paste the data into it.
  10. Finally, start the actual data collection process.
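
For step 7, one common textbook approach (an assumption here, not the only way to size a sample) is to pick the sample size from the margin of error we can accept at a chosen confidence level. The sketch below uses the usual normal-approximation formulas. The function names and input values are hypothetical and only for illustration.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Variable data: n = (z * sigma / margin)^2, rounded up.
    sigma is an assumed estimate of the process standard deviation."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)

def sample_size_for_proportion(p, margin, confidence=0.95):
    """Attribute data: n = z^2 * p * (1 - p) / margin^2, rounded up.
    p is an assumed defect rate; p = 0.5 is the most conservative guess."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

# Hypothetical case: length with sigma ~ 0.05 mm, mean wanted within +/- 0.01 mm
print(sample_size_for_mean(sigma=0.05, margin=0.01))      # prints 97

# Hypothetical case: defect rate unknown, wanted within +/- 5 percentage points
print(sample_size_for_proportion(p=0.5, margin=0.05))     # prints 385
```

These formulas assume random sampling from a stable process. If the sample is not representative, a larger sample size will not fix it.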

If we are dealing with a very big data set, it is helpful to use data mining software that can organize the data into structures and look for current patterns or trends in order to predict future trends. Please bear in mind that no matter what data we collect or how much of it, it is still sample data: in most instances we cannot collect data for the whole population, and it is impossible to measure future parts.

After we have collected the sample data (raw data), we need to process or summarize it using statistics to understand the distribution, mean and spread of the data, which is what matters in most cases, especially for manufacturing process data. We then turn the data into graphical form to communicate it to management. Examples of graphical formats are bar charts, pie charts, box plots, scatter plots, Pareto charts and trend charts. Step-by-step analysis of the sample data gives us insight into how the population could behave and enables us to make good decisions for the population, which is also the emphasis of the Six Sigma methodology.
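
As a rough illustration of this summarizing step, the Python sketch below computes the basic statistics for a small set of measurements and builds the table behind a Pareto chart. The weights and defect names are invented for the example. In practice the same numbers would come from Excel or statistical software.

```python
from collections import Counter
from statistics import mean, median, stdev

# Hypothetical sample of part weights in grams (variable data)
weights_g = [50.1, 49.8, 50.3, 50.0, 49.9, 50.2, 50.1, 49.7]

summary = {
    "n": len(weights_g),
    "mean": mean(weights_g),
    "median": median(weights_g),
    "stdev": stdev(weights_g),                  # sample standard deviation
    "range": max(weights_g) - min(weights_g),   # simplest measure of spread
}

# Hypothetical defect log from visual inspection (attribute data)
defects = ["scratch", "dent", "scratch", "burr",
           "scratch", "dent", "scratch", "stain"]

# Pareto table: categories sorted by count, with cumulative percentage
counts = Counter(defects).most_common()
total = sum(count for _, count in counts)
pareto = []
cumulative = 0
for category, count in counts:
    cumulative += count
    pareto.append((category, count, round(100 * cumulative / total, 1)))

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
for category, count, cum_pct in pareto:
    print(f"{category:8s} {count:2d} {cum_pct:6.1f}%")
```

The Pareto table puts the most frequent defect first, so management can see at a glance where improvement effort will pay off most.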

Reference: (


Continuous Improvement Program CIP - 6sigma Methodology