Statistical Process Control

PBCharts Inflation Analysis

Wednesday, June 22, 2022 by Jon Peltier

Peltier Technical Services, Inc., Copyright © 2023, All rights reserved.

Last week we released the PBCharts (Process Behavior Charts) tool for performing SPC control chart analysis. I’ve been working on PBCharts for well over a year, and we had an extended beta testing period, during which we cleaned up numerous issues. I thought I’d walk through an analysis to show how easily PBCharts does its work.

My colleague posted a quick analysis of inflation since the start of 2019, and that looked like a good data set to analyze. The data file CPI YoY Pct.xlsx (click to download) looks like this, showing date and year-over-year percentage increase of the consumer price index:

I’ve described the manual process of generating control charts in Introducing Control Charts (Run Charts), based on Donald Wheeler’s Understanding Variation, an excellent introductory text on this topic. It can be tedious to set up your own control charts, but PBCharts makes it easy to apply the techniques to lots of data.

PBCharts Data

PBCharts uses a specially formatted Excel template for its data, calculations, and charts. A PBCharts workbook has four visible worksheets: Data, Run Chart, MR Chart, and I Chart.

There are a few ways to get the data into the PBCharts file. The PBCharts ribbon presents us with a few options.

Import Data into PBCharts: Click this button, then browse to a data file (CSV, TXT, or Excel) and PBCharts will import data from the selected file into a new PBCharts file.

Analyze Selected Data in PBCharts: Select your data range, or just one cell within it, then click this button. PBCharts will populate a new PBCharts file with the selected data or, if a single cell is selected, with the larger data range that contains that cell.

Blank PBCharts File: Click this button to create a new, blank PBCharts file, then paste your data into the Data sheet of the PBCharts file.

The inflation data looks like this in the PBCharts data worksheet:

In general, you will have more than one column of data to analyze. For example, the view below shows the inflation data above, with another column showing a three-month moving average. The cell colors tell you about the data: the gray cells indicate that the cell contains a formula, the red cells indicate that they are blank. The blue header cell tells you which column is currently being analyzed (PBCharts analyzes one column at a time). If you click on a different header cell, PBCharts will focus the analysis on the newly selected column.

Inflation data and more in the PBCharts file.

The PBCharts Ribbon

This isn’t really a manual for PBCharts, just a demonstration, but I’ll show the rest of the ribbon, so my discussion below makes sense. I’ve already shown the section that is used to manage PBCharts files.

The Analysis section of the ribbon includes controls for selecting the column to be analyzed, as well as controls for defining stages, trends, and excluded points, for labeling points, and more.

The Charts section provides a means to override axis scales, to set and show a target value, and to select a run test to apply. You can choose a chart style (bold, medium, or light), you can save charts as image files or export them to PowerPoint or Word, and you can define a dashboard-like layout of multiple charts.

The PBCharts Charts

Run Charts

Click on the Run Chart worksheet, and you see a run chart of the data in the highlighted column of the data sheet. It starts out pretty boring in 2019, then in early 2020 there is a dip, followed by another boring stretch, and starting in early 2021, inflation started ramping upwards.

You can select whether to test the run chart for certain conditions using the dropdown in the Charts section of the ribbon. The run chart above has no test.

The chart below has one test, Runs About Median, selected. The median is displayed as a green horizontal line, and every time the data crosses the median, a label is displayed along the top of the chart.

Our data displays 6 runs about the median. For a data set of this size, we would expect 21 runs about median, or between 17 and 26 with a p-value of 0.05. Since our runs are so low, the data is classified as clustering.
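To make the counting concrete, here is a minimal Python sketch of a runs-about-median test. It is not PBCharts code. The function name and demo series are made up for illustration, and the expected-runs formula is the standard Wald-Wolfowitz expectation, which I am assuming is close to what the tool uses. With 20 points above and 20 below the median, that formula gives 21 expected runs.

```python
from statistics import median

def runs_about_median(values):
    """Count runs of consecutive points on the same side of the median.

    Points exactly equal to the median are skipped (one common convention).
    Returns (observed_runs, expected_runs).
    """
    med = median(values)
    sides = [v > med for v in values if v != med]
    if not sides:
        return 0, 0.0
    # A new run starts every time the series crosses the median
    observed = 1 + sum(1 for a, b in zip(sides, sides[1:]) if a != b)
    n_above = sum(sides)
    n_below = len(sides) - n_above
    # Expected number of runs for a random ordering (Wald-Wolfowitz)
    expected = 2 * n_above * n_below / (n_above + n_below) + 1
    return observed, expected

# Made-up series: flat, then a dip, then a ramp
demo = [2, 2, 2, 2, 1, 1, 1, 1, 3, 4, 5, 6]
print(runs_about_median(demo))  # (2, 5.0)
```

Far fewer runs than expected, as in the demo, is the clustering signal described above.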

The chart below has another test, Runs Up or Down, selected. This counts each time the points reverse direction.

We would expect 27 runs up or down, or 23 to 31 with p=0.05, for a data set of this size. Instead, we see a much lower number, 14, which indicates trending behavior. Not surprising, given the obvious upward trend over the last third of the chart.
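A similar sketch for the runs up or down test, again in Python and not taken from PBCharts. The function names are hypothetical, and the expected value uses the textbook formula (2n - 1) / 3. If the series has 41 monthly points (January 2019 through May 2022, which is my assumption about the data file), that formula gives the 27 quoted above.

```python
def runs_up_down(values):
    """Count runs of consecutive increases or decreases (ties are ignored)."""
    steps = [b - a for a, b in zip(values, values[1:]) if b != a]
    if not steps:
        return 0
    rising = [s > 0 for s in steps]
    # A new run starts each time the direction reverses
    return 1 + sum(1 for a, b in zip(rising, rising[1:]) if a != b)

def expected_runs_up_down(n):
    """Expected number of up/down runs for n randomly ordered points."""
    return (2 * n - 1) / 3

print(runs_up_down([1, 2, 3, 2, 1, 2]))  # 3
print(expected_runs_up_down(41))         # 27.0
```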

The run chart tests tell us that the data is not randomly distributed, but we can still use control charts to analyze the behavior.

I (Individuals) and MR (Moving Range) Charts

We turn to the MR (Moving Range) Chart. This shows us how much the data changes from one value to the next and is a measure of the variability in the data. We see immediately that there are two peaks in the MR data, indicating large changes in the data, and possibly process changes.

The green horizontal line shows the mean moving range value, and the red horizontal line shows the upper natural process limit (or upper control limit) on the moving range. The red square points indicate points which are out of statistical control, that is, which violate one of the rules, and the red labels at the top of the chart show which rules have been violated. There are four possible rules for MR charts, shown below, and our analysis is checking two of them, in bold. The two obvious peaks exceed the upper control limit.

Assignable Cause Patterns (Rules) for MR Charts

I’ll show the I (Individuals or individual values) Chart, but it’s not usable yet. The green line shows the mean of the data, and the red lines show the upper and lower natural process limits (control limits). There’s a lot of red, which means that the average and control limits calculated so far do not describe the data well.

The red square markers indicate points which are out of control, and the text above the chart indicates which rules were violated. There are eight I Chart rules built into PBCharts, and we are checking four of them.

Assignable Cause Patterns (Rules) for I Charts

Dividing the Data into Stages

Let’s return to the MR chart and label a couple of points. I’ll label March 2020 “Covid” since it’s when we in the United States started severely limiting our activities due to the Covid pandemic. I’ll label March 2021 “Suez Canal” because that’s the month when the huge cargo ship ran aground in the Suez Canal, blocking all traffic. That event didn’t by itself cause the runaway inflation, but it marked the point when we became aware of how problems with shipping, supply chains, and logistics began to hurt the global economy.

Now back to the I chart. I’ll click the Stages button on the ribbon, and define two new stages starting in April 2020 and April 2021 (the first stage automatically starts at the beginning of the data set).

Next I’ll click the Trends button, and assign a linear trend to the third stage.
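For readers who want to see what a trended stage might look like numerically, here is a rough Python sketch. It is only one plausible approach and I am not claiming it is how PBCharts computes its trended limits: fit a least-squares line, then place limits at the line plus or minus 2.66 times the average moving range of the residuals. The function name is hypothetical.

```python
from statistics import mean

def trended_limits(values):
    """Return (lower, center, upper) per point around a least-squares line."""
    xs = range(len(values))
    x_bar, y_bar = mean(xs), mean(values)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * x for x in xs]
    # Variability is measured on the residuals, not on the raw values
    resid = [y - f for y, f in zip(values, fitted)]
    mr_bar = mean(abs(b - a) for a, b in zip(resid, resid[1:]))
    half = 2.66 * mr_bar
    return [(f - half, f, f + half) for f in fitted]
```

Because the moving ranges are taken on the residuals, a steady climb does not by itself widen the limits.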

The I chart now looks well behaved. There is an initial horizontal section, then an intermediate horizontal section at a lower inflation value, indicative of reduced economic activity. Finally, there is the final increasing section. We still see a few red out-of-control points, three at the beginning of the middle Covid section, and one at the transition between the Covid static range and the final increasing stage.

The MR chart shows no point exceeding the control limit where Covid started, though we still have the over-the-limit point where global shipping became an issue. I’m not going to be concerned with these violations.

We could modify the I chart further, using the Exclusions dialog to exclude the three out-of-control points at the beginning of the Covid stage. Doing so pushes three other points out of control, so I would probably just leave those points in the analysis.

The important question now is whether the inflation rate will keep increasing. The Federal Reserve raised interest rates in March and again in May to try to slow inflation. There is no sign yet that this is having any effect, since the latest data points fall within the process limits and no other assignable causes are seen. But we can keep watching, add data every month, and follow the trends.

Locking Limits

PBCharts allows us to select a period of time, and lock in the calculated limits. It is common practice, once a process is deemed stable, to lock the limits before plotting subsequent points. This way adding more data will not affect the displayed limits, and the added data is directly compared to the stable process behavior.

We can use the inflation rate data to illustrate locking of calculated process limits. Since the Fed first increased interest rates in March, let’s back up the data to that point, ignoring the last two points. Here is the I chart.

Now I’ll lock the limits in the last stage (April 2021 to March 2022). The dark circles have changed to squares to indicate the points used to calculate the locked limits.

The limits are the same in both charts. Let’s view the final stage side by side.

Let’s add back the data for April and May. The difference in the two charts is subtle: notice that the last two circles are closer to the green line without locked limits (left) than to the green line with locked limits (right). This is because the added points have changed the slopes of the green lines and red limits, rotating the right sides of these lines slightly downward.

Now let’s suppose that the Fed’s interest rate action has had an effect. To simulate this, and I’ll admit it’s a huge assumption on my part, I’ll repeat the last two actual points for April and May as presumed additional points for June and July. The unlocked green and red lines of the left-hand chart below have rotated further, so that the green line is now below the points for April and May, and the imaginary points for June and July are above the lower red line. The locked limits in the right-hand chart haven’t moved. The points for April and May are still below the green line, and the new points for June and July are below the red line and highlighted as out-of-control points.

These out-of-control points are a signal that something in the process has changed. We might presume that increasing interest rates has stopped the climb in the inflation rate.

Get PBCharts

PBCharts is a VBA add-in for Excel. It is limited to Windows, but it runs on any supported version of Excel, from Excel 2013 through Microsoft 365.

To get your copy of PBCharts, visit the PBCharts website. You can download PBCharts on a 14-day free trial basis, or you can purchase a one-year license for $99.00.


Posted: Wednesday, June 22nd, 2022 under Utilities.
Tags: PBCharts, Statistical Process Control, Utilities.

Watching my Weight with SPC (Statistical Process Control)

Tuesday, April 28, 2020 by Jon Peltier


I’ve been working on a Statistical Process Control project for a client, building a workbook to automate construction of control charts. Years ago I wrote a tutorial called Introducing Control Charts (Run Charts). Many processes, in manufacturing, in business, or in nature, show fluctuations in their outputs. We can use Statistical Process Control (SPC) techniques to monitor these processes and ensure the fluctuations stay within expected limits.

I was looking for data to proof out the tool I was building, and I thought I could use my weight as a decent data set. My wife bought a new digital scale in 2006, and I’ve been weighing myself almost every day since then. And being an Excel jock, I put my measurements into a spreadsheet.

In the chart below, you can see how I fluctuated around 200 lb for over a decade. Then 20 months ago my wife and I joined Weight Watchers, and over the course of 6 or 8 months I lost 40 lb.

I thought looking at the past few months would be a good way to illustrate the use of SPC to track a process. This exercise will construct a series of control charts of this data.

Learning about Statistical Process Control

I first learned about Statistical Process Control as a practitioner and as a trainer, while employed as a scientist/engineer for a large manufacturing corporation. One of the resources we had was a deceptively small book called Understanding Variation: The Key to Managing Chaos by Donald J. Wheeler.


There are many other information sources about SPC and control charts. The National Institute of Standards and Technology (NIST) has an online Engineering Statistics Handbook, which has a chapter on Univariate and Multivariate Control Charts. Wikipedia has brief articles with many references covering SPC and Control Charts. And Google shows about 1.2 billion results for SPC and 0.5 billion results for Control Charts.

Getting Started

Prepare the Data

The first step is to identify the data and get it into a form where it can be analyzed. I decided to track from 1-Sept-2019 to 1-Feb-2020. Below is the top of my data worksheet, with a few calculations. The data is in three columns of an Excel Table named Table_1. The first two columns are date and weight, manually entered. The third column is Moving Range (MR), which we will use as a measure of variability in the data. The formula in cell C2 and filled down the Table column is

=IFERROR(ABS([@Weight]-OFFSET([@Weight],-1,0)),NA())

Essentially it determines the absolute value of my change in weight from one day to the next. Any error in the calculation (such as trying to subtract the column header) returns NA(), or the #N/A error.
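Outside Excel, the same moving range calculation can be sketched in a few lines of Python. This is an illustration rather than part of the workbook. The function name is made up, and NaN stands in for the #N/A error.

```python
import math

def moving_ranges(values):
    """Absolute change from the previous value; the first entry has no predecessor."""
    mr = [math.nan]  # plays the role of #N/A for the first row
    for prev, cur in zip(values, values[1:]):
        # If either value is NaN, the result is NaN as well
        mr.append(abs(cur - prev))
    return mr

print(moving_ranges([168, 186, 168])[1:])  # [18, 18]
```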

Weight data and preliminary calculations

I’ve calculated some values in a range beside the table, and I’ll explain them as I go along. The little table below the calculations shows the formulas I’ve used. I’ve also named these cells as indicated, to make it easier to use the cells in formulas.

Chart the Data

The next step is to plot the data. I’ve made two charts, one of my weight, the other of the calculated moving range. We look first for any obvious issues in the data, such as the spike late in September. If you look at the data above, apparently I gained 18 lb one day, and lost it the next. A more likely explanation is that I transposed digits in 168 and instead entered 186 in the worksheet. I’ll deal with this data issue soon, but for now I’ll continue with the SPC construction.

I added the calculated items as columns in my Table to make it easier to chart them. Having named the cells, I could use simple formulas in the Table: =Mean in cell D2, =LCL in cell E2, etc.

Data table with calculated items

Among my calculations are averages of the weight data (Mean) and of the moving range data (MR Bar). Let’s add these as green horizontal lines to the weight and MR charts for reference.

Compute Limits

So far, so good. Now let’s add a measure of “allowable” or “acceptable” variation. If the process is following statistical rules and its variability follows a normal distribution, we would use multiples of sigma, the standard deviation, to identify limits. According to the definition of a normal distribution, 68.3% of values fall within ±1 standard deviation of the mean, 95.5% fall within ±2 sigma, and 99.7% fall within ±3 sigma of the mean. By convention, 3 sigma is commonly used to identify acceptable variations.
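The coverage percentages for a normal distribution can be checked with the error function. This short Python sketch is an aside, not part of the workbook, and the function name is made up.

```python
import math

def coverage(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sigma: {coverage(k):.2%}")
# within 1 sigma: 68.27%
# within 2 sigma: 95.45%
# within 3 sigma: 99.73%
```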

We could measure the sample’s standard deviation (SD) directly, multiply it by 3, and use this to determine our limits. But using moving range is more robust, since outliers and non-normal distributions have a greater effect on sigma than on moving range.

The average moving range, or MR Bar, is used to calculate control limits. Less commonly, the median of the moving range is used to compute these limits.

First we determine MR UCL, which is the Upper Control Limit on the moving range, by multiplying the average moving range by 3.268. This is plotted on the moving range chart as a horizontal orange line (bottom chart below). We would expect 99.7% of our MR values to fall below this limit.

In the same way, we calculate the UCL and LCL (Upper and Lower Control Limits) of our individual data. We multiply MR Bar by 2.66, and add it to or subtract it from the mean to get our limits. These are plotted on our chart of individual values as horizontal orange lines (top chart below). Again, we expect 99.7% of our individuals to fall between these two lines.
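The limit calculations can be sketched in Python as follows. This is an illustration with made-up numbers, not the workbook's formulas. The dictionary keys loosely mirror the cell names used here, and the factor on the individuals limits is 3 / 1.128, roughly 2.66.

```python
from statistics import mean

def imr_limits(values):
    """Compute the center lines and control limits for an I-MR chart."""
    mrs = [abs(b - a) for a, b in zip(values, values[1:])]
    x_bar = mean(values)
    mr_bar = mean(mrs)
    return {
        "mean": x_bar,
        "mr_bar": mr_bar,
        "ucl": x_bar + 2.66 * mr_bar,   # upper limit for individual values
        "lcl": x_bar - 2.66 * mr_bar,   # lower limit for individual values
        "mr_ucl": 3.268 * mr_bar,       # upper limit for moving ranges
    }

# Made-up measurements, for illustration only
limits = imr_limits([10, 12, 11, 13, 12])
```

For this small series the mean is 11.6 and MR Bar is 1.5, so the individuals limits sit 3.99 above and below the mean.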

IMR Chart = Combined Individuals and Moving Range Charts

These charts of measurements along with means and limits are called Control Charts. The chart of individual values is called an I Chart (no, not “eye chart”), and the moving range chart is the MR Chart. Together they are referred to as an IMR (sometimes ImR) Chart.

Our ±3 SD limits are shown in the dashed red lines below (they are calculated as LCL 2 and UCL 2). They fall pretty far outside the MR-based control limits. All points fall well within the SD-based limits, except for the one obvious outlier.

Standard Deviation and Moving Range based control limits

In fact, because the outlier causes two excessive moving range values, the MR-based limits are also too wide, and would lead us to accept points that would otherwise be out of control.
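A small Python experiment shows how a single bad reading inflates both kinds of limits. Apart from the 186 typo, the weights below are made up for illustration, and the function name is hypothetical. None marks an excluded point, and any moving range touching it is skipped.

```python
from statistics import mean, stdev

def half_widths(values):
    """Return (3 * SD, 2.66 * MR-bar); None marks an excluded point."""
    kept = [v for v in values if v is not None]
    mrs = [abs(b - a) for a, b in zip(values, values[1:])
           if a is not None and b is not None]
    return 3 * stdev(kept), 2.66 * mean(mrs)

raw = [168, 169, 168, 170, 169, 186, 168, 169, 170, 168]
cleaned = [None if v == 186 else v for v in raw]

sd_raw, mr_raw = half_widths(raw)
sd_clean, mr_clean = half_widths(cleaned)
print(round(sd_raw, 1), round(mr_raw, 1))      # with the outlier
print(round(sd_clean, 1), round(mr_clean, 1))  # without it
```

In this toy series the outlier widens the SD-based limits by a much larger factor than the MR-based limits, although both are affected.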

Clean Up Special Cause Variations

Special and Common Cause Variation

The spike in my weight in September is a “special cause” variation, because it is a one-off problem. Since it is obviously not a valid measurement, we can attribute it to a recording error, and ignore it. We want to remove this value from our moving range calculations, since it resulted in limits which were too wide.

The other variation we see in the timeline is “common cause” variation. It comes from variations in inputs, like exercise, meals, and other factors, which are themselves subject to normal variation.

Clean Up the Data

In my adjusted table below (Table_2), I’ve added two columns. Wt 2 simply repeats the data in Weight, using the Table formula =[@Weight]. I can replace any special cause deviation with =NA() or #N/A in this column. MR 2 uses the same formula as MR, based on the Wt 2 column:

=IFERROR(ABS([@[Wt 2]]-OFFSET([@[Wt 2]],-1,0)),NA())

Where there was one bad weight and two bad moving ranges, we now have #N/A values in the table, which we can ignore in the chart and in our other calculations.

Plot the New Data

When we plot our individual and moving range values, the chart scales now show much narrower ranges, and there are no longer any obvious outliers: there is one high individual value and corresponding moving range in January, a few low weights in November, and a few high weights in December.

Let’s add our means and control limits, and see what we have. The MR chart shows the outlying value in late January, and four more moving range values that are just at the limit. In the individuals chart, the low values are within the limits (“in control”) while the high values we eyeballed before are above the UCL (“out of control”).

When values are out of control, we have to examine the process, to ensure that nothing is wrong with our process, and that nothing has changed. I can actually explain some of the variations. On Thanksgiving, I ran a “Turkey Trot” with my daughter, so for a couple weeks I was running more than my usual 3 miles a day: thus the few low values in November. And of course, the few values of 172 coincide with the Christmas and New Year’s holidays.

Standard Deviation vs Moving Range

Below I’ve plotted the SD-based limits along with the MR-based limits. The limits are much closer to each other and closer to the mean than when the outlier was included in the calculations.

Here I’ve plotted these control limits as calculated with and without the outlier. The outlier had a substantial effect on the limits, especially on the SD limits.

Comparison of moving range based control limits and standard deviation based limits

When the variation fits a normal distribution, the two sets of limits are close together, with the MR-based limits wider sometimes and the SD-based limits wider other times. The larger the data set, the closer they will be.

For the rest of this analysis, I’ll ignore sigma and stick to MR-based calculations.

Highlighting Outliers

Enhanced Data

We can enhance our IMR Chart by highlighting points which are out of control. I’ve added two columns to my table to support this. Wt X has this formula

=IF(OR([@[Wt 2]]<=LCL,[@[Wt 2]]>=UCL),[@[Wt 2]],NA())

which shows the value from Wt 2 if it falls outside the control limits, and #N/A otherwise. MR X has this formula

=IF([@[MR 2]]>=MR_UCL,[@[MR 2]],NA())

which again shows the value from MR 2 if it falls above the control limit, otherwise #N/A.
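The same highlighting logic, sketched in Python for comparison with the Table formulas. The function name and sample numbers are made up, and None plays the role of #N/A.

```python
def flag_outside(values, lcl, ucl):
    """Keep a value only if it is at or beyond a control limit, else None."""
    return [v if v is not None and (v <= lcl or v >= ucl) else None
            for v in values]

print(flag_outside([165, 168, 172, None], lcl=166, ucl=171))
# [165, None, 172, None]
```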

Highlighting the Chart

I’ve added these columns to my IMR Chart as red/orange markers.

Additional Control Chart Rules

There are other features of control charts that indicate a process which is out of control. These are conditions which a stable process is expected to avoid in about 99.7% of cases. Here are a handful of common out-of-control rules; the first one is the one I highlighted above.

One point beyond 3-sigma control limits
2 of 3 points outside 2-sigma on same side of mean
4 of 5 points outside 1-sigma on same side of mean
8 consecutive points outside 1-sigma on both sides of mean
15 consecutive points inside 1-sigma on both sides of mean
9 consecutive points on same side of mean
6 consecutive points moving in same direction
14 consecutive points alternating up and down

Advanced SPC software highlights any of these situations, in addition to the 3-sigma violations.
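Two of the rules listed above are easy to sketch in Python: nine consecutive points on the same side of the mean, and six consecutive points moving in the same direction. This is a simplified illustration with hypothetical function names, not the logic of any particular SPC package. Each function returns the index of every point at which the rule is satisfied.

```python
def run_same_side(values, center, length=9):
    """Indices where `length` consecutive points sit on one side of `center`."""
    hits, run, last = [], 0, 0
    for i, v in enumerate(values):
        side = (v > center) - (v < center)  # +1 above, -1 below, 0 on the line
        if side != 0 and side == last:
            run += 1
        else:
            run = 1 if side != 0 else 0
        last = side
        if run >= length:
            hits.append(i)
    return hits

def run_trending(values, length=6):
    """Indices where `length` consecutive points keep rising or keep falling."""
    hits, run, last = [], 1, 0
    for i in range(1, len(values)):
        diff = values[i] - values[i - 1]
        step = (diff > 0) - (diff < 0)
        if step != 0 and step == last:
            run += 1      # trend continues
        elif step != 0:
            run = 2       # a new trend starts with two points
        else:
            run = 1       # a tie breaks the trend
        last = step
        if run >= length:
            hits.append(i)
    return hits
```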

Extending the Data

To show how to manage a growing data set, I added ten more weeks of my weight tracking.

Frozen Control Limits

Typically, when a process is determined to be steady, the limits are calculated and frozen, then these are extended forward. This is illustrated below: the frozen limits were calculated from September through February, indicated with solid lines, and extended into April, shown with dashed lines.
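Freezing limits amounts to computing them once from a baseline period and then judging later points against those fixed values. Here is a minimal Python sketch under that assumption. The numbers are made up and the function name is hypothetical.

```python
from statistics import mean

def frozen_limits(baseline):
    """Compute (LCL, UCL) for individual values from the baseline period only."""
    mrs = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    center = mean(baseline)
    half = 2.66 * mean(mrs)
    return center - half, center + half

baseline = [170, 169, 171, 170, 169, 171, 170]  # made-up stable period
new_points = [168, 167, 166]                    # made-up later readings

lcl, ucl = frozen_limits(baseline)
# Later points are compared to the frozen limits, which never change
signals = [v for v in new_points if v < lcl or v > ucl]
print(signals)  # [166]
```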

Where I had a few values above the UCL in December and January, I now had several below the LCL and only a few above the mean in February and beyond.

This is evidence of a process shift. Several of the additional rules mentioned at the end of the last section would have been triggered. Checking my exercise records gives us an explanation. For much of the period from September through January, I was running 3 miles a day, four or five days a week. The weather in February was rather mild, so I increased my mileage to about 3.5 miles a day, six days a week.

Moving (Variable) Limits

The control charts below show control limits calculated over the entire range. The process change is still noticeable, but it’s not as clear as with the frozen and extended limits above.

Another problem with continually recalculating limits is that the limits move over time. Points which were in control at one time may be pushed out of control by later measurements. A December point at 170 which was in control when the limits were frozen is now out of control under the newly computed limits.

Staged Analysis

We can overcome this concern by staging our analysis, that is, computing different limits for different subsets of our data. In my latest Table below, I’ve added a column named Stage, which contains 1 for the first stage and 2 for the second; these can be entered manually or with a formula, which for example increments the stage number on a given date. The control limits are computed separately for different stages.
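A staged calculation can be sketched in Python by grouping the values on the stage number and computing limits within each group. This is an illustration only. The function name and sample values are made up, and each stage needs at least two points.

```python
from statistics import mean

def staged_limits(values, stages):
    """Return {stage: (LCL, mean, UCL)} with limits computed per stage."""
    result = {}
    for stage in sorted(set(stages)):
        pts = [v for v, s in zip(values, stages) if s == stage]
        # Moving ranges are taken only between points in the same stage
        mrs = [abs(b - a) for a, b in zip(pts, pts[1:])]
        center = mean(pts)
        half = 2.66 * mean(mrs)
        result[stage] = (center - half, center, center + half)
    return result

values = [10, 12, 10, 12, 20, 21, 20, 21]  # made-up data
stages = [1, 1, 1, 1, 2, 2, 2, 2]
limits_by_stage = staged_limits(values, stages)
```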

The IMR Chart below shows a staged analysis. Stage 1 looks familiar; the UCL for both MR and Individuals is slightly lower because the large MR late in January coincided with the process change. The violations in stage 1 are the same as before; the few outliers in stage 2 would have been well within the stage 1 limits, but are actually above the stage 2 UCL.

It’s common practice not to compute a separate average moving range for each stage, especially if the stages have small numbers of points, but instead use an overall MR Bar. The chart below uses this combined measure of variation. Stage 1’s control limits are now a bit tighter, so the low weights measured during the Turkey Trot training in November are now outliers. Conversely, Stage 2’s control limits are slightly wider, so there are no outliers in Stage 2.
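Here is one way the overall MR Bar could be sketched in Python. Each stage keeps its own mean but shares a single average moving range. I am assuming that moving ranges spanning a stage boundary are left out of the pool, which may differ from other implementations. The function name and data are made up.

```python
from statistics import mean

def pooled_stage_limits(values, stages):
    """Return {stage: (LCL, mean, UCL)} using one MR-bar shared by all stages."""
    pairs = list(zip(values, stages))
    # Pool the moving ranges, skipping pairs that straddle two stages
    mrs = [abs(b - a) for (a, sa), (b, sb) in zip(pairs, pairs[1:]) if sa == sb]
    half = 2.66 * mean(mrs)
    result = {}
    for stage in sorted(set(stages)):
        center = mean(v for v, s in pairs if s == stage)
        result[stage] = (center - half, center, center + half)
    return result

values = [10, 12, 10, 12, 20, 21, 20, 21]  # made-up data
stages = [1, 1, 1, 1, 2, 2, 2, 2]
pooled = pooled_stage_limits(values, stages)
```

In this toy data the noisier stage 1 gets tighter limits than it would on its own, and the quieter stage 2 gets wider ones, the same direction of change described above.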


Posted: Tuesday, April 28th, 2020 under SPC.
Tags: Control Charts, Run Charts, Statistical Process Control, Statistics.

SPC Approach to Browser Stats

Wednesday, April 7, 2010 by Jon Peltier


In Web Browser Stats: Problems With Data Gaps I looked at my website statistics to evaluate how relative usage of Internet Explorer, Firefox, and Google Chrome has evolved over the past year and a half. For part of my analysis, I plotted SPC-type control charts of browser stats using a simple mean ± 3 SD approach to control limits. My colleague DaleW reminded me that my quick and dirty approach was not as good as a rigorous Shewhart Individuals control chart analysis. I should have known better; I even covered the individuals chart approach in Introducing Control Charts (Run Charts).

To review the approach, the raw data is plotted in two ways. The actual points are plotted in one chart, and the moving ranges (differences between points i and i-1) are plotted in another chart. A horizontal line is drawn on each chart at the mean of the data. Control limits are calculated using the moving range as a measure of variability instead of standard deviation. The upper control limit (UCL) of the moving range chart is calculated as 3.27 times the mean of the moving range, and this is plotted on the moving range chart. The upper and lower control limits (UCL and LCL) of the individual values are given by the mean of the individual values ± 2.66 times the mean of the moving range, and these are plotted on the individuals chart.

If there is a trend in the data, the moving range will be smaller than the standard deviation, because the basis for determining variability is difference from the previous point, rather than from the mean of all points.
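A quick Python check on a made-up, perfectly steady ramp illustrates the point. The variable names are mine. Dividing the average moving range by 1.128 converts it to an estimate of sigma, which is where the 2.66 factor (3 / 1.128) comes from.

```python
from statistics import mean, stdev

ramp = list(range(1, 21))  # made-up data: a steady upward trend

# Standard deviation measures spread about the overall mean
sd = stdev(ramp)

# Moving range measures spread about the previous point
mr_bar = mean(abs(b - a) for a, b in zip(ramp, ramp[1:]))
sigma_from_mr = mr_bar / 1.128

print(round(sd, 2), round(sigma_from_mr, 2))  # 5.92 0.89
```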

Individuals-Moving Range Analysis

This table shows the individual values and moving ranges for the three main browsers. The means and control limits are computed below the table of values, and values in the table are colored red if they lie outside the control limits. The values show browser usage each month by percent of visits to my site.

There is a lot of red (i.e., out of statistical control) in the IE and Chrome individual values, notably at the beginning and end, indicating a trend from start to finish. Firefox shows only one red point, and there’s no obvious trend. The only red value in the moving range data is a single point for IE.

The data is plotted in the following I-MR charts. The Y axis ranges are the same for all browsers for easy comparison. The trends for Internet Explorer and Chrome are rather obvious when the new control limits are plotted.

For IE (above) the upper and lower control limits calculated using mean and standard deviation were 67.9% and 53.5%, much further apart than those in the I-MR chart; in fact, those limits fall outside the Y axis scale of the I-MR chart. For Chrome (below), the Mean-SD upper control limit is 10.4%, which also falls outside the corresponding I-MR chart. Both calculations for Chrome’s LCL are below zero; since this makes no physical sense, zero is used.

The Firefox control limits based on mean and SD are further apart than the I-MR limits, by one percentage point (32.4% and 27.0%), but would still be visible in this chart.

I-MR Analysis for Sparse Data

The conclusion from my earlier post was that four points over 18 months is insufficient data to judge whether there was a trend in the browser usage percentages. This conclusion holds when the more rigorous I-MR evaluation is carried out. If we perform the above analysis on four points, one point every six months, the I-MR calculations and charts show the processes are in control, so the differences between points cannot be attributed to changing patterns of usage.

The moving range values are much larger than for monthly data points, since six months of changes are lumped into one point. As a result, the control limits are pushed far enough away from the means that there are no out-of-control points.

The data is plotted in the following I-MR charts. The Y axis ranges are the same for all browsers for easy comparison. Although we “see” trends for Internet Explorer and Chrome, since there are no points outside the control limits and not enough points to invoke the special Western Electric rules, we cannot conclude there is any variation not attributable to random fluctuations.


Posted: Wednesday, April 7th, 2010 under SPC.
Tags: Control Charts, Run Charts, site statistics, Statistical Process Control.


Types of Control Charts

Thursday, February 5, 2009 by Jon Peltier

Last year I wrote an Introduction to Control Charts (Run Charts). I referenced a favorite book of mine, Understanding Variation: The Key to Managing Chaos, by Donald Wheeler of SPC Press, an expert in Statistical Process Control. This book is a quick read, and it’s a great introduction to control charts, written clearly using layman’s terms, with a number of very good examples to illustrate their use.

In fact, in my last engineering job before becoming a full-time Excel jockey, I was frustrated by many operational features in the manufacturing facility where I worked. As I reread Wheeler’s book, I could relate many observed behaviors in the plant with examples in the book. Most examples, of course, showed misuse of reporting as well as optimization of separate departments to the detriment of the business as a whole. My employer talked the talk of SPC and Six Sigma, but nobody with any authority understood statistics or processes, so the company walked a random walk through the jargon of process control.

Types of Control Chart

A control chart, or run chart, is essentially a time series that shows variation in a process output over a period of time. The time may be defined by an actual date or time, or by a number indicating how many times the process has been carried out. Control charts were developed in the 1920s by Walter Shewhart while he was working at Bell Labs. Shewhart was investigating ways to improve product reliability by reducing and controlling manufacturing variability, and the control chart was Shewhart’s means of distinguishing random variability inherent in a process from “special causes” extrinsic to the process.

Wheeler’s book demonstrates statistical process control using the simplest type of control chart. This is called an Individuals chart, so named because it is based on individual measured values. The moving range (difference between sequential measurements) is used to calculate control limits for the chart. Individuals charts are also called XMR (or XmR) charts, to denote the individual X values and the moving ranges.

Time series data of individual measurements with control limits

Time series data of moving range with UCL
Individuals (XMR) chart, composed of X chart (top) and MR chart (bottom)
from Introduction to Control Charts (Run Charts)

The Individuals chart is used when individual values or rates are collected periodically. Examples of this type of data may be daily scrap rates in a production line, or a company’s monthly sales. Depending on the type of data being tracked, and on how it’s collected, there are actually several different types of control charts.

Other types of control charts are better suited to data that is collected in groups. While the Individuals chart plots the individual measured values and uses the moving range to provide statistical context, XbarR and XbarS charts are used when each sample comprises a batch of measurements. The average measurement from the group (Xbar) is plotted, while either the range (XbarR, for small batches) or the sample standard deviation (XbarS, for larger sample sizes) of each sample is used to calculate the control limits. I have seen one reference to the use of medians rather than means in these charts, but these variants must be rare.
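To make the grouped case concrete, here is a minimal Python sketch of XbarR limits. It is an illustration under stated assumptions, not a calculation from Wheeler’s book or from any SPC package: the subgroup data is hypothetical, every subgroup is assumed to hold exactly five measurements, and A2, D3, and D4 are the standard tabulated control chart constants for subgroups of five.

```python
from statistics import mean

# Standard tabulated constants for subgroups of five measurements.
# Other subgroup sizes need different constants.
A2, D3, D4 = 0.577, 0.0, 2.114

def xbar_r_limits(subgroups):
    """Return center lines and control limits for an XbarR chart."""
    if any(len(group) != 5 for group in subgroups):
        raise ValueError("this sketch only handles subgroups of five")
    xbars = [mean(group) for group in subgroups]               # plotted on the Xbar chart
    ranges = [max(group) - min(group) for group in subgroups]  # plotted on the R chart
    grand_mean = mean(xbars)
    r_bar = mean(ranges)
    return {
        "xbar_center": grand_mean,
        "xbar_ucl": grand_mean + A2 * r_bar,
        "xbar_lcl": grand_mean - A2 * r_bar,
        "r_center": r_bar,
        "r_ucl": D4 * r_bar,
        "r_lcl": D3 * r_bar,
    }

# Hypothetical measurements: three batches of five parts each.
batches = [
    [10.1, 9.9, 10.0, 10.2, 9.8],
    [10.3, 10.0, 9.9, 10.1, 10.2],
    [9.7, 10.0, 10.1, 9.9, 9.8],
]
xbar_r = xbar_r_limits(batches)
```

In practice far more than three subgroups would be used to establish limits; three keeps the arithmetic easy to check by hand.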

If counts are measured (for example, the number of defective parts in a batch), several other control chart types have been developed to display the run chart behavior. These are known as p, np, c, and u charts.
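As one example of a count-based chart, the sketch below computes limits for a p chart, which tracks the fraction of defective parts per batch. It uses the usual binomial approximation (center line plus or minus three standard deviations of a proportion). The defect counts are hypothetical, and the batch size is assumed to be constant.

```python
import math

def p_chart_limits(defectives, batch_size):
    """Return (p_bar, lcl, ucl) for batches of equal size."""
    p_bar = sum(defectives) / (len(defectives) * batch_size)
    sigma = math.sqrt(p_bar * (1 - p_bar) / batch_size)
    # A proportion cannot fall below 0 or rise above 1, so clamp the limits.
    lcl = max(0.0, p_bar - 3 * sigma)
    ucl = min(1.0, p_bar + 3 * sigma)
    return p_bar, lcl, ucl

# Hypothetical counts of defective parts in five batches of 100 parts each.
defect_counts = [4, 6, 5, 3, 7]
p_bar, p_lcl, p_ucl = p_chart_limits(defect_counts, 100)
```

With these numbers the lower limit is clamped to zero, which is common when the defect rate is small relative to the batch size.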

Selecting a Control Chart

To help sort out the different types of control charts, Wheeler has laid out the requirements of each in the form of a flow diagram. The only version I have of Wheeler’s diagram is a second-hand distorted GIF file, so I’ve redrawn the flow chart in both vertical and horizontal orientations. This is a handy reference if you are building your own control charts, but an SPC software package will automatically select the appropriate control chart based on the data provided.

Control Chart Selection Vertical Flow Chart

Vertical Flow Diagram for Selection of Proper Control Chart
Link to Image File – Link to PDF File

Control Chart Selection Horizontal Flow Chart

Horizontal Flow Diagram for Selection of Proper Control Chart
Link to Image File – Link to PDF File

In a series of posts I will show how to create each type of control chart in my favorite statistical process control software platform, and I will present real-world examples of each.

Interpreting Control Charts

I will also cover interpretation of control charts. A process is in control when its variability is described by the natural statistical variation defined by measurements taken over a period of time. When the process goes “out of control” (or “out of statistical control”), it indicates a possible change to the process, because the statistical rules determining the variability in the process have changed.

The natural process limits in a control chart are constructed to contain variations within 3 standard deviations above and below the mean, which covers about 99.7% of naturally occurring variations if they are normally distributed. Any measurement outside this range is out of control, and special causes for this variation should be identified.

Obviously, a process is out of control when a measurement falls outside of the upper or lower control limits drawn on the control chart. The probability that this occurs naturally is about 0.3%, sufficiently rare to warrant examination of the process when it does.
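The 99.7% and 0.3% figures can be checked with a few lines of Python. This sketch assumes the natural variation follows a normal distribution, which is where those figures come from; real process data will only approximate them.

```python
import math

def normal_coverage(k):
    """Fraction of a normal distribution within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

inside = normal_coverage(3)   # roughly 0.9973
outside = 1 - inside          # roughly 0.0027, the chance of a false alarm per point
```

A false alarm rate of roughly 0.27% per point means that about one point in 370 would land outside the limits even when nothing has changed.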

Other patterns in the data also merit special investigation. For example, too many consecutive measurements above or below the mean may indicate loss of statistical control of a process. Too many consecutive measurements that trend in one direction, or too many consecutive alternating measurements, may signal a process shift. A change in the process may be indicated by having too much, or not enough, variation within the ± 3 sigma limits.

Not enough variation sounds like a good thing, doesn’t it? Well, it probably is. Not all process change is necessarily bad. But if this change is not understood, then it cannot be reproduced, measured, and controlled. Thus, even if the change results in less variability, it is not a good change.

A number of rules have been established that define patterns that may indicate change in the underlying process. I will cover these special rules in a post in this series on the control chart. Software packages that produce run charts have these rules built in, but it is good to understand their basis.
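As an illustration of how one such rule can be checked, the Python sketch below finds the longest run of consecutive measurements on one side of the center line. The threshold of eight points is an assumption borrowed from commonly published rule sets (some use nine), and the data is hypothetical; neither comes from this article.

```python
def longest_run_one_side(values, center):
    """Length of the longest streak of consecutive values strictly above
    or strictly below the center line."""
    longest = current = 0
    previous_side = 0
    for value in values:
        side = (value > center) - (value < center)  # +1 above, -1 below, 0 on the line
        if side != 0 and side == previous_side:
            current += 1      # streak continues on the same side
        elif side != 0:
            current = 1       # streak starts, or switches sides
        else:
            current = 0       # a point on the center line breaks the streak
        previous_side = side
        longest = max(longest, current)
    return longest

RUN_LENGTH = 8  # assumed threshold; published rule sets differ

# Hypothetical measurements with a center line of 11.
measurements = [10, 12, 9, 11, 13, 12, 14, 12, 13, 15, 12, 13, 9]
run = longest_run_one_side(measurements, 11)
run_signal = run >= RUN_LENGTH
```

Here eight consecutive values sit above the center line, so the rule flags a possible shift even though no single point needs to be outside the control limits.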

Statistical Process Control Articles in this Blog

Posted: Thursday, February 5th, 2009 under SPC.
Tags: Control Charts, Run Charts, Statistical Process Control.
Comments: 12


Polynomial Fit vs. Statistical Process Control

Tuesday, October 7, 2008 by Jon Peltier 10 Comments

I’ve written a bit about regression and curve fitting; see Regression Approach to a Simple Physics Problem, Choosing a Trendline Type, and Trendline Fitting Errors. A blog reader asked for help with some sample data that he couldn’t fit. Here is the data.

I plotted the data and gave it the hairy eyeball. Not a linear trend, maybe something quadratic.

Attempted Regression

The blog reader had fitted a 6th order polynomial trendline, and was having trouble using it to predict values. My fit is shown below, and I had no such problems with predictions matching the trendline. I suspect the user had insufficient precision in his coefficients, which is covered in Trendline Fitting Errors.
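To show why coefficient precision matters so much for a high-order polynomial, here is a small Python sketch. The coefficients are hypothetical (they are not the reader’s fit or mine), and `round_sig` is an illustrative helper that mimics copying coefficients from a label that displays only two significant digits.

```python
def horner(coefficients, x):
    """Evaluate a polynomial whose coefficients are listed highest order first."""
    result = 0.0
    for c in coefficients:
        result = result * x + c
    return result

def round_sig(value, digits):
    """Round a number to the given count of significant digits."""
    return float(f"{value:.{digits - 1}e}")

# Hypothetical 6th order coefficients, highest order first.
full = [1.23456e-6, -1.54321e-4, 7.65432e-3, -0.187654, 2.34567, -13.5792, 45.6789]
rounded = [round_sig(c, 2) for c in full]

x = 40
error = abs(horner(full, x) - horner(rounded, x))  # well over 100 for these numbers
```

Each rounding error looks tiny on its own, but it is multiplied by x raised to the 4th, 5th, or 6th power, so at x = 40 the two versions of the polynomial disagree by well over a hundred units.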

The 6th order fit isn’t really all that great. I decided it really isn’t much better than the quadratic fit I had initially suspected.

Then I thought the data almost fit two line segments over different ranges of data. I’ve plotted these below.

I replied to the user with this suggestion, and he said that wouldn’t work: the data he gave me was only part of a much larger sequence of values, so it would have to be fitted with many line segments.

Run Charts

I thought a moment and realized that with many weeks of repeated data, what the user needed was an approach based on Statistical Process Control. I wrote about control charts in Introducing Control Charts (Run Charts). This is an opportunity to illustrate another set of run charts. In this example, I relied on techniques from a small, 136-page book called Understanding Variation.

Understanding Variation - Wheeler
Understanding Variation: The Key to Managing Chaos
Donald J. Wheeler

I added a column to my table to calculate the Moving Range, which is simply the absolute value of the difference between the current value and the previous value. This is an easier measure of variation to compute than the standard deviation, though with modern computer hardware and software that’s not an important consideration.

In any case, I plotted the weekly values data and the moving range data.

I computed the averages of the values data and of the moving ranges. I added horizontal lines to indicate the averages (see Run Chart with Mean and Standard Deviation Lines for detailed instructions).

Then I used simple factors to determine upper and lower control limits for these quantities, and I added the limits to the charts. For the values, the control limits are given by:

Limit = Average Value ± 2.66 * Average Moving Range

For the moving range, the lower control limit is zero and the upper control limit is given by:

Limit = 3.27 * Average Moving Range
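The two formulas above translate directly into a few lines of code. The Python sketch below is only an illustration of the arithmetic (the charts in this post were built in Excel), and the weekly values are hypothetical stand-ins, since the reader’s data is not reproduced here. For reference, 2.66 is approximately 3 / 1.128 and 3.27 is the standard range-chart factor for a moving range of two points.

```python
from statistics import mean

def xmr_limits(values):
    """Return averages and control limits for an Individuals (XmR) chart."""
    if len(values) < 2:
        raise ValueError("need at least two values to form a moving range")
    # Moving range: absolute difference between each value and the previous one.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    x_bar = mean(values)
    mr_bar = mean(moving_ranges)
    return {
        "x_bar": x_bar,
        "mr_bar": mr_bar,
        "x_ucl": x_bar + 2.66 * mr_bar,
        "x_lcl": x_bar - 2.66 * mr_bar,
        "mr_ucl": 3.27 * mr_bar,
        "mr_lcl": 0.0,
    }

# Hypothetical weekly values, not the blog reader's data.
weekly = [52, 49, 55, 51, 47, 53, 50, 48, 46, 45]
limits = xmr_limits(weekly)

# Indices of any values that fall outside the limits (none for this sample).
violations = [i for i, v in enumerate(weekly)
              if v < limits["x_lcl"] or v > limits["x_ucl"]]
```

Note that ten values produce only nine moving ranges, because the first value has no predecessor.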

What this tells me is that the values and the moving ranges fall within their limits, so the variability comes not from anything we can fit a curve to, but simply from normal variation within the process. Closer examination of some of the data might still point to an out-of-control process (for example, the last five values show a continuing decline), but for now let’s just worry about violations of the control limits.

I calculated 70 more values with the same mean and standard deviation as the original 10 values, to simulate an ongoing process (because the blog reader did not provide more data). I plotted these values on the same chart with the original ten values, using the limits calculated based on the original ten values.
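A simulation along these lines can be sketched in Python as follows. This is not how the values in this post were generated: the baseline numbers are hypothetical, the normal distribution and the fixed random seed are assumptions, and random draws match the baseline mean and standard deviation only on average, not exactly.

```python
import random
from statistics import mean, stdev

def simulate_more(baseline, count, seed=0):
    """Draw `count` values from a normal distribution that shares the
    baseline's mean and sample standard deviation (an assumed model)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    mu, sigma = mean(baseline), stdev(baseline)
    return [rng.gauss(mu, sigma) for _ in range(count)]

def out_of_limits(values, lcl, ucl):
    """Indices of values that fall outside the control limits."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]

# Hypothetical first ten values; limits come from these ten only.
baseline = [52, 49, 55, 51, 47, 53, 50, 48, 46, 45]
moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
center, mr_bar = mean(baseline), mean(moving_ranges)
lcl, ucl = center - 2.66 * mr_bar, center + 2.66 * mr_bar

combined = baseline + simulate_more(baseline, 70)
flagged = out_of_limits(combined, lcl, ucl)
```

Any index in `flagged` marks a point worth a closer look for special causes; with limits frozen from the first ten values, only the simulated points can be flagged here.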

The values look pretty good, all within the limits except for a single point, which should be examined for any special causes of variation. All of the moving range points fall below the upper control limit. I recalculated the averages and limits using the entire data set and replotted the data.

There was little difference; the limits were slightly more generous. The value that exceeded the control limit in the first chart of all the data still is out of control, and still deserves a closer look.

One final note: Polynomial regression breaks down completely in a process like this, which is successfully modeled using SPC. A linear fit may still be useful to detect a possible trend of the average over time.

Statistical Process Control Articles in this Blog

Trendline and Regression Articles in this Blog

Posted: Tuesday, October 7th, 2008 under Statistics.
Tags: Control Charts, Formatting, Run Charts, Statistical Process Control, Statistics, Trendlines.
Comments: 10
