According to Wikipedia, data collection is the process of gathering and
measuring targeted variables in an established, systematic way, which then
enables one to draw conclusions from the data. Data collection is a core
component of research in all fields of study. While data collection methods
vary by discipline, the emphasis on ensuring accurate and honest data remains
the same.
The statement above applies to quality engineering, where data-based decision
making is central, especially in quality improvement. Data collection is also
the most difficult and challenging part of any quality improvement study.
In a manufacturing process there are two types of numeric data:
- Attribute data
- Variable data
| Data | Attribute | Variable |
|---|---|---|
| Also known as | Discrete | Continuous |
| Characteristics | Cannot be divided or have a decimal point; whole numbers only. Needs a clear operational definition of go / no go, reject / accept, pass / fail. | Generated from a calibrated measuring device and can be divided infinitely, e.g. distance, length, weight. |
| Method | Counting good and defective parts after inspection, such as visual inspection or a go/no-go jig. | Measuring with a measurement device. |
| Example | Yield rate, defect rate, count of rejects | Dimensions such as length, width, thickness; strength; weight; etc. |
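To make the difference between the two data types concrete, here is a minimal Python sketch of how each is typically summarized. The function names and the sample numbers are hypothetical and only for illustration: attribute data is reduced to rates computed from counts, while variable data is reduced to a mean and a standard deviation.

```python
from statistics import mean, stdev


def summarize_attribute(inspected: int, rejected: int) -> dict:
    """Summarize attribute (count) data from a go/no-go inspection."""
    if inspected <= 0 or not 0 <= rejected <= inspected:
        raise ValueError("need inspected > 0 and 0 <= rejected <= inspected")
    defect_rate = rejected / inspected
    return {"defect_rate": defect_rate, "yield_rate": 1 - defect_rate}


def summarize_variable(measurements: list) -> dict:
    """Summarize variable (measured) data by its mean and sample standard deviation."""
    if len(measurements) < 2:
        raise ValueError("need at least two measurements")
    return {"mean": mean(measurements), "stdev": stdev(measurements)}


if __name__ == "__main__":
    # Hypothetical numbers: 200 parts inspected, 8 rejected.
    print(summarize_attribute(inspected=200, rejected=8))
    # Hypothetical length readings in mm from a calibrated caliper.
    print(summarize_variable([10.0, 10.2, 9.8]))
```

Note that the attribute summary only ever needs whole-number counts as input, whereas the variable summary works on the measured values themselves.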
To collect useful and meaningful data, we must use a structured, step-by-step process:
- Identify what data needs to be collected to represent the quality characteristic of the product or subject you would like to study.
- Identify the source of the data: either collect it from the current production lot or use historical data.
- Data here means a description and quantification of the dimension, quantity, capacity, performance or other characteristic of a population or process, e.g. length, weight etc.
- The ways we quantify data: how much (e.g., dimensions); how many (e.g., counts); how long (time); how good (performance); characteristic (e.g., color, location) etc.
- Understand the assessment system. Conduct GR&R (refer to my previous article, Importance of Performing Gage Repeatability and Reproducibility (GR&R) before actual quality data collection, part 1 and part 2).
- If the GR&R does NOT meet the spec, historical data cannot be used and we must collect a new set of data after improving the GR&R of the assessment system. Data that comes from a poor measurement system does not reflect the true product quality and cannot be used!
- Determine the appropriate sample size if we are unable to collect data for every single part in production. The sample must represent the population, because we make decisions about the population based on analysis of the sample data.
- Collected data needs to be turned into useful information that can be used to make decisions. Decide which tools to use to analyze the collected data. We can use a conventional Excel spreadsheet or statistical software such as SAS JMP, Minitab or StatSoft Statistica.
- Usually we record the data in Excel format and analyze it with statistical software later. If this is the case, we must set up the Excel table columns in the format the statistical software expects, so we can easily cut and paste the data into it.
- Finally, start the actual data collection process.
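The sample size step above can be sketched in Python. This article does not prescribe a formula, so the sketch uses one common textbook choice, n = (z * sigma / E)^2, for estimating a process mean. It assumes variable data that is roughly normally distributed, a standard deviation sigma known from earlier data, and a margin of error E that you choose. The function name and the example values are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma: float, margin: float, confidence: float = 0.95) -> int:
    """Smallest n so that the confidence interval half-width for the mean is <= margin.

    Assumes a known standard deviation (sigma) and a roughly normal characteristic.
    """
    if sigma <= 0 or margin <= 0 or not 0 < confidence < 1:
        raise ValueError("sigma and margin must be > 0, confidence between 0 and 1")
    # Two-sided z value, e.g. about 1.96 for 95 % confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)


if __name__ == "__main__":
    # Hypothetical case: sigma = 0.5 mm, mean wanted to within +/- 0.1 mm.
    print(sample_size_for_mean(sigma=0.5, margin=0.1))
```

A tighter margin or a higher confidence level increases the required sample size, which is the trade-off to weigh against inspection cost. Attribute data (pass/fail counts) needs a different formula based on proportions.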
If we are dealing with a very large data set, it is helpful to use data mining
software, which can organize data into structures and look for current patterns
or trends to predict future trends. Please bear in mind that no matter what data
we collect or how much of it, it is still sample data: in most instances we
cannot collect data for the whole population, and it is impossible to measure
future parts.
After we have collected the sample data (raw data), we need to process or
summarize it using statistics to understand the distribution, mean and spread
of the data, which applies in most cases, especially for manufacturing process
data. We then turn the data into graphical form to communicate with management.
Examples of such graphics are bar charts, pie charts, box plots, scatter plots,
Pareto charts, trend charts, etc.
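As an illustration of turning raw data into a chart-ready summary, the sketch below builds the table behind a Pareto chart: defect categories sorted from most to least frequent, with a running cumulative percentage. The function name and the defect categories are hypothetical. The plotting itself is left to whichever spreadsheet or statistical package you use.

```python
from collections import Counter


def pareto_table(defects: list) -> list:
    """Return (category, count, cumulative %) rows sorted by descending count."""
    counts = Counter(defects)
    total = sum(counts.values())
    rows = []
    cumulative = 0
    for category, count in counts.most_common():
        cumulative += count
        rows.append((category, count, round(100 * cumulative / total, 1)))
    return rows


if __name__ == "__main__":
    # Hypothetical reject log: one entry per rejected part.
    log = ["scratch"] * 5 + ["dent"] * 3 + ["burr"] * 2
    for row in pareto_table(log):
        print(row)
```

Reading the cumulative column from the top shows how few categories account for most of the rejects, which is the point a Pareto chart is meant to communicate.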
The step-by-step analysis of the sample data gives us insight into how the
population could behave and enables us to make good decisions for the
population, which is also the emphasis of the Six Sigma methodology.