Last year I wrote an Introduction to Control Charts (Run Charts). I referenced a favorite book of mine, Understanding Variation: The Key to Managing Chaos, by Donald Wheeler of SPC Press, an expert in Statistical Process Control. This book is a quick read, and it’s a great introduction to control charts, written clearly using layman’s terms, with a number of very good examples to illustrate their use.
In fact, in my last engineering job before becoming a full-time Excel jockey, I was frustrated by many operational features in the manufacturing facility where I worked. As I reread Wheeler’s book, I could relate many observed behaviors in the plant with examples in the book. Most examples, of course, showed misuse of reporting as well as optimization of separate departments to the detriment of the business as a whole. My employer talked the talk of SPC and Six Sigma, but nobody with any authority understood statistics or processes, so the company walked a random walk through the jargon of process control.
Types of Control Chart
A control chart, or run chart, is essentially a time series that shows variation in a process output over a period of time. The time may be defined by an actual date or time, or by a number indicating how many times the process has been carried out. Control charts were developed in the 1920s by Walter Shewhart while he was working at Bell Labs. Shewhart was investigating ways to improve product reliability by reducing and controlling manufacturing variability, and the control chart was Shewhart’s means of distinguishing random variability inherent in a process from “special causes” extrinsic to the process.
Wheeler’s book demonstrates statistical process control using the simplest type of control chart. This is called an Individuals chart, so named because it is based on individual measured values. The moving range (difference between sequential measurements) is used to calculate control limits for the chart. Individuals charts are also called XMR (or XmR) charts, to denote the individual X values and the moving ranges.
Individuals (XMR) chart, consisting of X chart (top) and moving range chart (bottom)
from Introduction to Control Charts (Run Charts)
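As a rough sketch of how the limits on an Individuals chart can be computed from the moving ranges, the Python below uses the commonly published scaling constants 2.66 (for the X chart) and 3.267 (for the moving range chart). The function name `xmr_limits` and the scrap-rate numbers are made up for illustration; they do not come from Wheeler's book.

```python
from statistics import mean

def xmr_limits(values):
    """Return center lines and limits for an Individuals (XmR) chart."""
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    # Moving range: absolute difference between sequential measurements.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    x_bar = mean(values)
    mr_bar = mean(moving_ranges)
    return {
        "x_center": x_bar,
        "x_ucl": x_bar + 2.66 * mr_bar,   # upper natural process limit
        "x_lcl": x_bar - 2.66 * mr_bar,   # lower natural process limit
        "mr_center": mr_bar,
        "mr_ucl": 3.267 * mr_bar,         # upper limit for the moving ranges
    }

# Hypothetical daily scrap rates (percent), for illustration only.
scrap = [10, 12, 11, 13, 12]
limits = xmr_limits(scrap)
```

With real data you would want many more than five points before trusting the limits; the short list here only keeps the arithmetic easy to follow.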
The Individuals chart is used when individual values or rates are collected periodically. Examples of this type of data may be daily scrap rates in a production line, or a company’s monthly sales. Depending on the type of data being tracked, and on how it’s collected, there are actually several different types of control charts.
Other types of control charts are better suited to data that is collected in groups. While the Individuals chart plots the individual measured values and uses the moving range to provide statistical context, XbarR and XbarS charts are used when each sample consists of a batch of measurements. The average measurement from the group (Xbar) is plotted, while either the range (XbarR, for small batches) or the sample standard deviation (XbarS, for larger sample sizes) of each sample is used to calculate the control limits. I have seen one reference to the use of medians rather than means in these charts, but these variants must be rare.
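A minimal sketch of the XbarR calculation follows, assuming every subgroup holds exactly five measurements so that the standard published constants for n = 5 apply (A2 = 0.577, D3 = 0, D4 = 2.114). The function name and the sample data are hypothetical.

```python
from statistics import mean

# Published Shewhart constants for subgroups of size 5 (an assumed subgroup size).
SUBGROUP_SIZE = 5
A2, D3, D4 = 0.577, 0.0, 2.114

def xbar_r_limits(subgroups):
    """Return center lines and limits for the Xbar chart and the R chart."""
    if any(len(g) != SUBGROUP_SIZE for g in subgroups):
        raise ValueError("constants above are only valid for subgroups of 5")
    xbars = [mean(g) for g in subgroups]            # plotted on the Xbar chart
    ranges = [max(g) - min(g) for g in subgroups]   # plotted on the R chart
    grand_mean = mean(xbars)
    r_bar = mean(ranges)
    return {
        "xbar_center": grand_mean,
        "xbar_ucl": grand_mean + A2 * r_bar,
        "xbar_lcl": grand_mean - A2 * r_bar,
        "r_center": r_bar,
        "r_ucl": D4 * r_bar,
        "r_lcl": D3 * r_bar,
    }

# Hypothetical batches of five measurements each, for illustration only.
batches = [
    [10, 11, 9, 10, 10],
    [11, 12, 10, 11, 11],
    [9, 10, 8, 9, 9],
]
xbar_r = xbar_r_limits(batches)
```

Other subgroup sizes need different constants, which are tabulated in most SPC references.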
For count data (for example, the number of defective parts in a batch), several other control chart types have been developed. These are known as p, np, c, and u charts.
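To give a feel for how count-based limits differ, here is a hedged sketch of the textbook 3-sigma formulas for two of these charts: a p chart (fraction defective, assuming a constant sample size) and a c chart (count of defects per unit). The function names and the numbers are illustrative assumptions only.

```python
from math import sqrt

def p_chart_limits(defectives, sample_size):
    """(lcl, center, ucl) for the fraction defective, constant sample size."""
    p_bar = sum(defectives) / (len(defectives) * sample_size)
    half_width = 3 * sqrt(p_bar * (1 - p_bar) / sample_size)
    # A fraction cannot fall below 0 or rise above 1, so clip the limits.
    return max(0.0, p_bar - half_width), p_bar, min(1.0, p_bar + half_width)

def c_chart_limits(counts):
    """(lcl, center, ucl) for a count of defects per inspected unit."""
    c_bar = sum(counts) / len(counts)
    half_width = 3 * sqrt(c_bar)
    # A count cannot be negative, so clip the lower limit at zero.
    return max(0.0, c_bar - half_width), c_bar, c_bar + half_width

# Hypothetical data: defective parts found in four batches of 100 parts each.
p_lcl, p_center, p_ucl = p_chart_limits([4, 6, 5, 5], sample_size=100)

# Hypothetical data: defects counted on four inspected units.
c_lcl, c_center, c_ucl = c_chart_limits([3, 5, 4, 4])
```

The np and u charts follow the same pattern, with np tracking the number (rather than fraction) defective and u handling defect counts when the inspected area or sample size varies.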
Selecting a Control Chart
To help sort out the different types of control charts, Wheeler has laid out the requirements of each in the form of a flow diagram. The only version I have of Wheeler’s diagram is a second-hand distorted GIF file, so I’ve redrawn the flow chart in both vertical and horizontal orientations. This is a handy reference if you are building your own control charts, but an SPC software package will automatically select the appropriate control chart based on the data provided.
Vertical Flow Diagram for Selection of Proper Control Chart
Link to Image File – Link to PDF File
Horizontal Flow Diagram for Selection of Proper Control Chart
Link to Image File – Link to PDF File
In a series of posts I will show how to create each type of control chart in my favorite statistical process control software platform, and I will present real-world examples of each.
Interpreting Control Charts
I will also cover interpretation of control charts. A process is in control when its variability is described by the natural statistical variation defined by measurements taken over a period of time. When the process goes “out of control” (or “out of statistical control”), it indicates a possible change to the process, because the statistical rules determining the variability in the process have changed.
The natural process limits in a control chart are set 3 standard deviations above and below the mean, a range that contains about 99.7% of naturally occurring variation when the data are normally distributed. A measurement outside this range is a signal that the process may be out of control, and special causes for the variation should be sought.
The most obvious signal is a measurement that falls outside the upper or lower control limit drawn on the control chart. The probability of this occurring naturally is about 0.3%, rare enough to warrant examination of the process when it happens.
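The 0.3% figure can be checked with a couple of lines of Python, on the assumption that the natural variation follows a normal distribution (a simplification, since real processes only approximate it). The helper name `prob_outside` is just for illustration.

```python
from math import erf, sqrt

def prob_outside(k):
    """Probability that a normal variate falls more than k sigma from its mean."""
    # erf(k / sqrt(2)) is the probability of landing within +/- k sigma.
    return 1.0 - erf(k / sqrt(2.0))

p3 = prob_outside(3)
```

Here `p3` comes out at roughly 0.0027, or about one point in 370 falling outside the limits by chance alone.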
Other patterns in the data also merit special investigation. For example, too many consecutive measurements above or below the mean may indicate loss of statistical control of a process. Too many consecutive measurements that trend in one direction, or too many consecutive alternating measurements, may signal a process shift. A change in the process may be indicated by having too much, or not enough, variation within the ± 3 sigma limits.
Not enough variation sounds like a good thing, doesn’t it? Well, it probably is. Not all process change is necessarily bad. But if this change is not understood, then it cannot be reproduced, measured, and controlled. Thus, even a change that reduces variability is not a good change until it is understood.
A number of rules have been established that define patterns that may indicate change in the underlying process. I will cover these special rules in a post in this series on the control chart. Software packages that produce run charts have these rules built in, but it is good to understand their basis.
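As a preview of what such rules look like in code, the sketch below checks two of the patterns mentioned above: a long run of points on one side of the center line, and a long run of points trending in one direction. The thresholds of 8 and 6 points are commonly cited values and are an assumption here; different rule sets use slightly different numbers, and the function names are hypothetical.

```python
def longest_run_one_side(values, center):
    """Longest run of consecutive points strictly above or strictly below center."""
    longest = current = 0
    previous_side = 0
    for v in values:
        side = (v > center) - (v < center)  # +1 above, -1 below, 0 on the line
        if side != 0 and side == previous_side:
            current += 1
        elif side != 0:
            current = 1
        else:
            current = 0
        previous_side = side
        longest = max(longest, current)
    return longest

def longest_trend(values):
    """Number of points in the longest strictly rising or strictly falling run."""
    if not values:
        return 0
    longest = current = 1
    previous_direction = 0
    for a, b in zip(values, values[1:]):
        direction = (b > a) - (b < a)  # +1 rising, -1 falling, 0 flat
        if direction != 0 and direction == previous_direction:
            current += 1
        elif direction != 0:
            current = 2  # a new trend starts with two points
        else:
            current = 1
        previous_direction = direction
        longest = max(longest, current)
    return longest

def pattern_signals(values, center, run_threshold=8, trend_threshold=6):
    """Flag the two patterns; the default thresholds are assumed, not universal."""
    return {
        "run_one_side": longest_run_one_side(values, center) >= run_threshold,
        "trend": longest_trend(values) >= trend_threshold,
    }

# Hypothetical measurements with an assumed center line of 5.
data = [5, 6, 7, 8, 9, 10, 4, 3]
signals = pattern_signals(data, center=5)
```

In this made-up series the six rising points trip the trend check, while the run of five points above the center line is too short to trip the other one.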
Further Reading about Statistical Process Control
- ISO 9001 – Introduction to SPC
- Control Charts on Wikipedia
- Interpreting Control Charts
- Selecting the Right Control Chart
Jan Karel Pieterse says
Hi Jon,
Nice post, look forward to the next parts!
I’m a big DoE fan myself and had a crash-course on SPC a long time ago.
“My employer talked the talk of SPC and Six Sigma, but nobody with any authority understood statistics or processes, so the company walked a random walk through the jargon of process control.”
Sounds like my last job too :-(
Matt Healy says
Way back in the 1980s in a previous life as an engineer, one day the guy who did QA on incoming parts came to my boss. On the control chart he had noticed that a certain part whose diameter had in the past fluctuated randomly around the nominal value had recently begun showing a trend of small but steady increase in diameter from lot to lot. The latest batches WERE still well within specifications, but if the trend continued he would soon have to reject a batch.
We followed up with the vendor, who was able to identify and correct a process problem and stop the trend. What I found instructive about this episode was how, simply by keeping a history of his measurements, our QA guy was able to spot a trend before any out-of-tolerance parts were made, so we never had to reject any parts.
Jon Peltier says
An interesting factoid I came across last week is that machining tolerances are not well characterized by any of the usual control charts. There is typically a trend in data as tools wear, so a machined diameter would shift gradually as the part number within a lot increased. Machinists will do the XbarS analysis because it’s required, then they’ll do an analysis tailored to measurements which change systematically. Details about the analysis were behind a login screen, so I couldn’t find out what’s involved.
Matt Healy says
In my 1980s case, I believe the only statistical tools used were graph paper and a Mark One Eyeball.
Colin Banfield says
Interesting topic….Understanding Variation was an eye opener for me, although it did end with a few unanswered questions. For example, what are the control limit calculations based on? This question nagged me throughout the book, so I undertook an interesting journey to find out.
I found articles that explained that the control limits on the XmR chart (for example) are based on the fact that the XmR chart assumes a normal distribution, and for a normal distribution, more than 99% of the points fall within 3-sigma limits….Hmmm, OK, that appears to make sense….but how does that fit in with Dr. Wheeler’s calculation of the limits? I then came across another article that explained how to calculate the 3-sigma limits approximately. It turned out that the formulas used for the approximate calculation were the same used by Dr. Wheeler! So finally, in a roundabout way, I discovered that Dr. Wheeler’s control limits were 3-sigma limits using an approximate calculation. Whew! Well, why not calculate the 3-sigma limits directly? This is of course easy to do in Excel, but the approximate formulas were around during the days of hand calculations (the two don’t provide the same results but they’re probably close enough so that the difference can be ignored for using control charts in the real world).
But wait a minute! After I thought that I had all this stuff figured out, another question began to nag me. The XmR chart was shown to be useful in everyday situations, such as examining sales data over time. The question then was: how on earth can we assume that any of these everyday situations follow a normal distribution? This wasn’t making sense to me.
Luckily, I uncovered an article by Dr. Wheeler titled, “Shewhart’s Charts and the Probability Approach.” This article explained a lot. The first thing is that Shewhart’s use of the 3-sigma limits was only loosely based on the normal distribution. The second is that, in real life, you can’t model a process exactly and third, the use of normal distribution to explain the use of a particular control chart is wrong. Shewhart’s use of 3-sigma limits, while having a basis in probability, was selected because these limits work in practice to minimize the errors of identifying a random variation for a signal and vice versa. Thus, the fact that you don’t have to assume a normal distribution (or any other kind of model), explains why the technique could work for everyday situations. Darn! Sections of this article could have been included in the Introduction chapter of Understanding Variation to put the rest of the book in its proper context.
So is that the end of the story? Well…almost. I subsequently discovered that the use of the mR chart with the individual X chart is actually highly controversial. This controversy is summarized in an article titled “Individual Charts and Additional Tests for Changes in Spread” by Albert Trip and Jaap Wieringa. Although Dr. Wheeler is firmly in the camp supporting mR charts, I haven’t seen an article by him debunking the other side of the controversy.
Ok, after the above lengthy tirade, I think that control charts work well in practice. It’s ironic that the scorecards you see in so many BI solutions show the same out-of-context comparisons that Dr. Wheeler talks about in Understanding Variation. Yet, these BI products don’t provide the contextual analysis (via control charts) needed to understand the scorecard comparisons.
Sorry for the long post.
Jon Peltier says
Colin –
Nice thoughtful stroll through the intricacies of SPC.
Regarding the normal distribution, I think it’s the central limit theorem that tells us that the sum of a bunch of independent errors, the kind which are caused by uncontrolled factors in a process and which result in the variability of the output of that process, is approximately described by a normal distribution. It’s kind of a catch-all fudge factor, but it helps to explain why the assumption of a normal distribution isn’t too crazy.
Regarding BI software, it seems the vendors take the quick overview, pulling out some items to display out of context, using the fanciest graphics available, not those sanctioned by the experts in effective graphical communication. The BI programmers are knowledgeable neither in human cognition nor in statistics and SPC. So they miss out on two important points.
Darlene says
Hi Jon, I am hoping you can help me out. I am totally frustrated, have looked through books, been on Microsoft’s Discussion Group site but I have not been able to solve this problem. I have made a Pivot Table (first one) and Pivot Chart. The table will have data added to it on a monthly basis. The chart is a stacked chart with dual axis showing – one shows percentage the other total amount. Looks great BUT when I add data to the table and refresh, all my formats change back to a chart. Also, when I click on the month field to show data from just a certain month, it all reverts back to a column series and the secondary axis is gone. I am using Excel 2003. Is this the problem?
Thanking you in advance for your expertise.
Darlene
Jon Peltier says
Darlene –
This is an issue with pivot charts, in every version of Excel that has pivot charts (2000 and later). In Changing a PivotChart removes series formatting in Excel Microsoft admits it’s an issue, and their suggestion is to record a macro next time you have to fix the chart. Thereafter, all you need to do is run the macro when necessary.
Darlene says
Thanks Jon. I can’t believe the time I spent on trying to figure this one. Can you direct me on a macro? Does that mean every time I add data, or click on the month series, I have to run the macro? Not familiar with macros. You know I’m a newbie!
Darlene
Jon Peltier says
Darlene –
You pretty much have to do your own macro, since your configuration will be unique. Here are a couple of articles which may help:
How To: Record Your Own Macro
How To: Fix a Recorded Macro
How To: Assign a Macro to a Button or Shape
DaleW says
Colin & Jon –
Great book by Donald Wheeler. I’m over a year late to this SPC discussion, but SPC charts would be much less powerful if they had been based on the overall calculated stdev() instead of what might be called an approximation for standard deviation. SPC works because it uses somewhat less efficient but much more robust estimators for a normal distribution’s sigma at the local data scale instead of the most efficient but brittle global estimator stdev().
The recent post on SPC Approach to Browser Stats shows a nice example of how much better SPC does because of its robust estimators. StDev() includes the variation due to a trend line and makes for much less convincing charts, but SPC sigma ignores almost all of that special cause, giving SPC charts the power to clearly detect the special cause.
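A small sketch can illustrate the difference between the two estimators. It compares the overall sample standard deviation with a sigma estimated from the average moving range (divided by 1.128, the standard constant for ranges of two points) on a made-up series with a steady upward trend. The data are deterministic and purely hypothetical, and `sigma_from_moving_range` is an illustrative name.

```python
from statistics import mean, stdev

def sigma_from_moving_range(values):
    """Estimate sigma from the average moving range (local, point-to-point scale)."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(moving_ranges) / 1.128  # d2 constant for ranges of two points

# Hypothetical series: a steady upward trend plus small alternating noise.
trend = [i + (0.5 if i % 2 else -0.5) for i in range(20)]

global_sigma = stdev(trend)                   # absorbs the trend into the spread
local_sigma = sigma_from_moving_range(trend)  # reflects only point-to-point noise
```

On this series the global estimate is several times larger than the moving-range estimate, so limits built from `stdev` would be wide enough to hide the trend, while limits built from the moving range would expose it.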
As the underlying distribution becomes decidedly non-normal, I or I-MR charts stop working so well, and Xbar-R charts (perhaps with increasingly large subgroups) are needed to take advantage of the central limit theorem and normalize the variable being plotted. (None of the controversy about whether to show the MR charts or not has anything to do with how the I chart limits are calculated: the calculation is still based on an average moving range, or a median moving range if even more robustness against outliers is desired.)