From xkcd.com, “A webcomic of romance, sarcasm, math, and language.”
I feel a Homer moment coming on… Mmmmm, bacon.
From xkcd.com, “A webcomic of romance, sarcasm, math, and language.”
I feel a Homer moment coming on… Mmmmm, bacon.
Introduction to Control Charts
While employed as a scientist/engineer for a large manufacturing corporation, I pulled a stint as trainer for my company’s Total Quality program. This role occupied half of my time for three or four years. A large component of our Total Quality initiative was related to Statistical Process Control (SPC). And of all the volumes of course materials used in this training, the most useful was a small, 136-page book called Understanding Variation.
What made the book so useful was its down-to-earth approach to SPC and its reliance on real business and manufacturing examples.
After discussing the randomness but orderliness of data, the book introduces time series charts. Basically, a record of performance is put on paper, in graphical form, so a quick glance can show whether things are on track. Once you have such a chart, you need a system that helps indicate how much variation is acceptable, and how much means the process is out of control.
Let’s take a look at some sample data. I’ve generated a couple of months of normally distributed process data, which looks like this:
The gray vertical lines separate adjacent weeks. Without any special training, one could look at this data, see that it varies, and say that “some days are better than others”. But we have no way to characterize the variation. Walter Shewhart first applied some simple but effective numerical techniques to this type of data, and he called the result “control charts”. While the techniques were relatively simple to apply, they were based on solid statistical methods.
The first step is to add a line to the chart above representing the mean of the data:
We still can’t say anything more insightful than the data is distributed about the mean. In order to judge the variability in the data, we need first to quantify the variability. This can be done by analyzing a moving range chart, which is an effective measure of variation in the data, and simpler to compute than the standard deviation (at least in the days prior to desktop computers). Essentially, for a given day, we plot the absolute difference between that day’s value and the previous day’s value.
We can add a line for the mean as we did with the chart of raw values:
We still don’t know how much variability is “acceptable”, that is, attributable to common variation in the process. This is where some calculations are required. When using the moving range to determine “typical” variation, the maximum acceptable value of the moving range is 3.27 times the mean of the moving range. For this data, the mean of the moving range is 3.2, so the maximum acceptable moving range, or the Upper Control Limit for Ranges, is 10.5.
The variability in moving range is well under the Upper Control Limit (UCL). We can use the moving range to compute the control limits on the individual values data: the mean of the moving range is multiplied by 2.66, then added to and subtracted from the process data mean (this is roughly equivalent to using 3 times the standard deviation of the individual values data). Our Upper and Lower Control Limits (UCL and LCL) are thus 28.1 and 11.1.
In conjunction, these two graphs comprise an XMR chart, because it analyzes the X (individual values) and MR (moving range) data.
A quick glance at these charts shows that the process data falls well within the upper and lower control limits, that is, it is operating within the bounds of random variation, and we can say that the process is “In Control”. In truth, there are additional tests for randomness, because any patterns in the data even within the control limits means the process may be “Out of Control”. A detailed treatment of these additional tests is beyond the scope of this introduction.
In addition to the XMR methodology, there are several other types of control charts, which are used for different types of data and data collection. I plan to describe them in a future post.
Excel Control Charts on the Peltier Tech Web Site
This example shows how to build a run chart with a mean and lines for one standard deviation above and below the mean. For an “official” run chart calculated based on mean and standard deviation, the upper and lower control limits are usually set at ±3 SD. It is a simple matter to adjust the example to show any control limits you have calculated.
This links to a zipped workbook that uses VBA code to adjust the chart when the data changes. The positions of the lines are adjusted for changes in the data, and the points are formatted so that points below the mean are filled with blue and those above white, while diamonds are used for points within one SD of the mean, circles for two SD, and triangles for three SD; points outside of the control limits are red squares.
Historical note: This was the first chart I ever formatted point-by-point with VBA. Back in Excel 97 on an underpowered Windows 95 laptop, it was very slow, taking almost a minute to process the chart as seen above. It happens in a flash in Excel 2003 on a computer with enough juice to support Windows XP.
Commercial Statistical Process Control Software
If you have a need for extensive Statistical Process Control capabilities, then you are not likely to want to build an entire solution yourself. There are a number of commercial packages that provide SPC functionality for Excel. There is one in particular which I will mention here: SPC for Excel. Written and supported by a client of mine, Business Process Improvement, or BPI Consulting, SPC for Excel is a turnkey SPC system in the form of an Excel add-in. SPC for Excel is compatible with Excel versions 2000 through 2007; this reduced view of its Excel 2007 ribbon tab shows the capabilities of SPC for Excel.
Statistical Process Control (SPC) is a set of statistical and related methods for monitoring processes with an aim to improve productivity and reduce costs, time, and waste incurred by these processes. In fact, SPC is a philosophy surrounding the monitoring, analysis, and adjustment of process variables to produce continuous improvements in the process.
There are a lot of terms that mean SPC, or that are a part of SPC. I plan to address some of these topics in more detail in future posts.
Statistical tools commonly used in SPC include the following (with links to pages on the Peltier Tech web site):
I will discuss several of these charting techniques in future posts.
Here are a few online resources you can visit to learn more about SPC.
Stay tuned for my follow-up posts on SPC. I will build a running index on this page as I add topics.
Excel is a mighty powerful application, and yet, there are any number of utilities available to extend its capabilities. I’m sure everyone has their favorites, and here I’m going to talk about mine. My reliance on these utilities is as a developer of Excel applications, that is, solutions that combine add-ins, templates, and regular workbook to accomplish specific objectives within specific operating parameters. My favorite utilities lean toward developer tools, rather than worksheet productivity tools. This is an incomplete list: there must be hundreds of worthy utilities written by dozens of developers, but these are the ones I have tried and kept. If you think I’ve left out an important utility, tell me about it in the comments.
Two of my favorite tools were developed by fellow Microsoft Excel MVP Jan Karel Pieterse. He has a variety of utilities on his web site, JKP Application Development Services, as well as a diverse collection of articles and tutorials. I use Jan Karel’s Name Manager and AutoSafe. Jan Karel has offered me a percentage of sales on these two utilities, but that won’t amount to much, because they are free.
The Name Manager is a high-powered tool for working with all of the Names in your workbooks. You can view all names, or filter by a number of conditions. You can make Names hidden or visible, local to the worksheet or global to the workbook. You can add and delete Names, and see what ranges they refer to. Especially powerful is the ability to work on many Names at once. This utility can save a developer many hours.
AutoSafe is what Microsoft was thinking when they implemented Autosave. AutoSafe saves a copy of your open workbooks in a directory you specify, without overwriting the workbooks. If Excel shuts down improperly, AutoSafe offers to restore the files in use prior to the crash. When a new copy is saved, the existing copy is removed to the recycle bin. In this way, you can maintain an informal version history in the recycle bin, as well as track your activities. This utility has saved me many times.
Microsoft Excel MVP and author Rob Bovey, of Application Professionals, offers a handful of free utilities. I use Rob’s Chart Labeler and VBA Code Cleaner utilities.
Rob calls this the XY Chart Labeler, but it really works for any chart that accommodates data labels on its plotted series. Excel allows you to label points with the series name, category labels (or X values), or Y values, and once you’ve added labels, it allows you to customize those labels. But for some reason Excel lacks the capability to apply a set of labels from the worksheet to a series on the chart. The Chart Labeler addresses this shortcoming, allowing you to apply custom labels one point at a time, or one series at a time. The labels on the points remain linked to the cells containing the labels, so the chart labels change when the labels in the cells are changed. The Chart Labeler also allows fine positioning of one or a series of labels. I’ve heard that this is the most downloaded Excel utility, and I’m not surprised. It’s been around since VBA was introduced to Excel, and it’s useful for anyone from casual users to hard core developers.
VBA Code Cleaner
Rob says it best:
“During the process of creating VBA programs a lot of junk code builds up in your files. If you don’t clean your files periodically you will begin to experience strange problems caused by this extra baggage. Cleaning a project involves exporting the contents of all its VBComponents to text files, deleting the components and then importing the components back from the text files.”
Microsoft Excel MVP and author Stephen Bullen offers many example workbooks and free utilities on his Office Automation web site (Stephen and Rob have collaborated on several books). I use Stephen’s Smart Indenter and VBE Tools utilities when I’m developing in VBA.
The Smart Indenter restores an orderly appearance of your code modules by indenting each line. The indentation style is highly customizable. I use this very frequently while writing code, and often the first thing I do when I receive a workbook from someone else is run the Smart Indenter on its code. The Smart Indenter has versions for VBA and for VB6.
VBE Tools provides a set of enhancements to the Visual Basic Environment. It adds a toolbar that shows the size of the active code module and that adds “Nudge” buttons that provide fine positioning control of objects on a user form. It also wraps the location of files in the References dialog, so you can actually see the entire path and file name of the reference. This feature materialized after Stephen heard me whining about the truncated reference file name problem.
John Walkenbach, Excel MVP and author of dozens of spreadsheet books, has an extensive website with tips for developers and for regular users. “Mr Spreadsheet” also hosts the very popular non-Excel J-Walk Blog, and he has a number of utilities on his site.
includes a labeling feature similar to Rob’s (above). It also includes features that export charts as image files, convert charts to embedded pictures, and create a report detailing one or all charts on a sheet.
Power Utility Pak is an extensive set of general-purpose tools that make working in Excel much easier. PUP includes tools for Formatting, Formulas, Charts, Ranges, Worksheets, Workbooks, and much more. This is the only Excel utility listed here with a registration fee, but the fee is nominal, and includes free upgrades for life. For an even smaller nominal fee, the user gains access to the VBA code behind the utilities, for educational purposes only (of course).
ASAP Utilities (“As Soon As Possible”) by Bastien Mensink offers a broad set of tools that enhance your productivity in Excel. It is similar in scope to the Power Utility Pak, but remarkably there is not a lot of overlap between them. When my work was primarily as a regular Excel user, I had both utilities installed, and used them about evenly. I find them less useful now as a developer, but both are worth mentioning here.
A utility I use a great deal as a developer is MZ Tools, written by Microsoft Visual Developer MVP Carlos Quintero. There are free VBA and VB6 versions as well as commercial .Net versions. MZ Tools provides many features that facilitate writing, documenting, and debugging code, and managing VB projects. I don’t even know all the features of MZ Tools; I discover new ones all the time.
For packaging my projects into standard installation executable files, I use Inno Setup by Jordan Russell. This standalone utility uses a versatile script to compile your files into a single EXE file, for simple and flawless installation. Inno places your files into a predetermined or user-defined directory, sets any registry keys, and adds shortcuts during installation, and cleans up after itself when uninstalling programs.
There are some utilities that I don’t use frequently, but I need them on occasion and recommend them to users with specific problems.
FindLinks by Excel MVP Bill Manville helps to find and fix those stubborn links in your workbook.
FlexFind is another handy utility from Jan Karel Pieterse (see above). This utility helps find and replace text throughout the entire workbook, in headers, footers, chart titles, and other places normally out of reach to the Edit menu.
Excel MVP Andy Pope hosts a number of utilities, as well as lots of neat Excel tricks. I use his Button Editor frequently when designing user interfaces.
I may as well mention a few non-Excel utilities I also use frequently.
IrfanView is an outstanding and free image editing utility.
SnagIt by TechSmith is a nicely made program, which can capture images or text from windows or screeen regions. TechSmith also makes Camtasia, which I’ve never used, but which is popular for creating and editing video files. SnagIt and Camtasia are commercial products.
For converting workbooks and other documents to PDF files, there must be dozens of free utilities. I use PDFCreator, which is unique in that it can be controlled using VBA (see Ken Puls’ ExcelGuru site for programming hints), and PrimoPDF, which has a very nice save-as interface.
For FTP file transfers, open source program FileZilla is the best free alternative I’ve come across.
In Using Pivot Table Data for a Chart with a Dual Category Axis, I showed how to make a Chart with a Dual Category Axis using a pivot table to properly arrange the source data. I generally prefer using a regular chart, because pivot charts are pretty inflexible when it comes to formatting. Unfortunately, a regular chart doesn’t know when the pivot table has been resized by a refresh. In Dynamic Chart using Pivot Table and Range Names I showed how to use dynamic ranges to allow a regular chart to update when the pivot table changes. However, that technique only accommodates a change in the number of points in a series, not the number of series in the chart. To allow for changing height and width of a pivot table, you either need to revert to a pivot chart, or use a VBA procedure, which I describe below.
This example begins with the same data range, pivot table, and chart as the previous ones.
I’ve highlighted some sections of the pivot table above to correlate them with pivot table ranges in the VBA object model.
The VBA procedure is not too complicated. It should be run whenever the pivot table has been refreshed. It can be called from a button, or from a Worksheet_Calculate event procedure. Whatever works best in a particular application. Paste this code into a regular code module:
Sub UpdateChartFromPivot() Dim rCategories As Range Dim rValues As Range Dim rSeriesNames As Range Dim pt As PivotTable Dim cht As Chart Dim iSeries As Long Dim nSeries As Long ' Define the Pivot Table 'Set pt = ActiveSheet.PivotTables(1) Set pt = ActiveSheet.PivotTables("PT_ChartSource") ' Define the Ranges Set rValues = pt.DataBodyRange With pt.RowRange Set rCategories = .Offset(1).Resize(.Rows.Count - 1) End With Set rSeriesNames = pt.ColumnRange.Rows(2) ' Define the Chart 'Set cht = ActiveSheet.ChartObjects(1).Chart Set cht = ActiveSheet.ChartObjects("chtPivotData").Chart ' How Many Series? nSeries = rSeriesNames.Columns.Count ' Remove or Add Series until Chart Has Correct Number Select Case cht.SeriesCollection.Count - nSeries Case Is > 0 ' too many: remove excess series For iSeries = cht.SeriesCollection.Count To nSeries + 1 Step -1 cht.SeriesCollection(iSeries).Delete Next Case Is < 0 ' too few: add sufficient series For iSeries = cht.SeriesCollection.Count + 1 To nSeries cht.SeriesCollection.NewSeries Next Case Else ' just right End Select ' Populate Each Series For iSeries = 1 To nSeries With cht.SeriesCollection(iSeries) .Name = rSeriesNames.Columns(iSeries) .Values = rValues.Columns(iSeries) .XValues = rCategories .Border.LineStyle = xlNone End With Next End Sub
Let’s add some rows to the pivot table source data:
Now update the pivot table:
And finally, run the VBA procedure to update the chart:
The VBA procedure can be run from the Macros dialog (press Alt+F8 or navigate the menu/ribbon), from a button you’ve placed on the sheet (see Assign a Macro to a Button or Shape and Assign a Macro to an ActiveX Control), from a menu item (see Assign a Macro to a Toolbar or Menu), or from a worksheet event procedure.
In Excel 2003 (and maybe 2002) you can use the Worksheet_PivotTableUpdate event procedure. Select the sheet with the pivot table and chart, right click on the sheet tab, and choose View Code. This opens the code module for the active worksheet. From the left dropdown in the module window, choose Worksheet, and from the right dropdown, choose PivotTableUpdate. The shell for the event procedure is entered into the module for you. Insert the procedure name as shown below, and you’re ready to go:
Private Sub Worksheet_PivotTableUpdate(ByVal Target As PivotTable) UpdateChartFromPivot End Sub
If you’re using an older version of Excel, there is no PivotTableUpdate event. What I do in this case is place a formula in a cell outside the pivot table (say, M1) with a formula like:
where the range A1:J20 includes the pivot table. This formula ensures that a calculation occurs when the pivot table refreshes. Then I use the Worksheet_Calculate event procedure to kick off the chart update procedure. As above, select the sheet with the pivot table and chart, right click on the sheet tab, and choose View Code. This opens the code module for the active worksheet. From the left dropdown in the module window, choose Worksheet, and from the right dropdown, choose Calculate. The shell for the event procedure is entered into the module for you. Insert the procedure name as shown below, and you’re ready to go:
Private Sub Worksheet_Calculate() UpdateChartFromPivot End Sub
Fourth in a series
A follow-up post shows how to Create and Update a Chart Using Only Part of a Pivot Table’s Data.