Peltier Tech Blog

Excel Chart Add-Ins | Training | Custom Solutions | Charts and Tutorials | PTS Blog

 

Main menu:

 
Peltier Tech
Chart Add-Ins

Peltier Tech Waterfall Chart Utility
Peltier Tech Cluster-Stack Chart Utility
Peltier Tech Box and Whisker Chart Utility
Peltier Tech Marimekko Chart Utility
Peltier Tech Dot Plot Utility
Peltier Tech Cascade Chart Utility

 
Excel Dashboards

Subscribe

Site search

Subscribe

Site search


Recent Posts

Recently Commented

Popular Posts

Archive


 

Categories


 

Privacy Policy

Creative Commons License
Licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

Deming Regression Utility

by Jon Peltier
Tuesday, October 6th, 2009
Peltier Technical Services, Inc., Copyright © 2010.
Licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

I described an alternative linear regression approach in Deming Regression. Whereas the standard linear regression techniques assume that all measurement error in the calculated model is in the dependent variable Y, the Deming Regression technique considers that both X and Y variable measurements may contain errors. Pairs of X-Y coordinates are used to estimate the variances of the X and Y data.

Excel includes linear regression among its built-in functions, but there is no built-in Deming Regression capability. Users who wanted this feature had to purchase third party packages, or perform their calculations in specialized statistics programs outside of Excel.

Deming Regression Utility

I have created a new utility to address this functionality gap. It is a simple Excel add-in, with a simple user interface. The calculations are based on algorithms found in the references at the end of this article.

When installed, the utility has a simple menu item on the PTS Charts menu

PTS Deming Regression Menu

or button on the Add-Ins tab

PTS Deming Regression Menu

The PTS Deming Regression Utility provides two methods for calculating Deming Regression parameters. You can use custom worksheet formulas, or you can use a dialog-driven interface.

Worksheet Formulas for Deming Regression

This option does not require use of the menu or ribbon buttons. The user calculates Deming Regression parameters using simple worksheet formulas.

Referring to the regression results in the previous post

Deming Regression Results

The slope, intercept, and correlation calculations are each the result of a single worksheet formula:

Excel Slope Calculation

Excel Intercept Calculation

Excel Correlation Calculation

The Deming Regression results are returned in a three-element array formula with this syntax:

=DEMING(Y-values, X-values, Horizontal)

Y-values and X-values are required arguments which reference ranges. Each range must be one row or column only, with an even number of cells, and the two ranges must be the same size. The Horizontal argument is an optional True or False indicating the orientation of the output. The default is TRUE, and the results will be output in a row.

Don’t forget, the entire range containing the array formula is selected, the formula is typed, and it is entered by holding down CTRL+SHIFT and pressing ENTER.

Here is the row output form of the analysis:

Deming Array Formula Calculations

Here is the column output form of the analysis; note the FALSE argument in the third position:

Deming Array Formula Calculations

The worksheet formulas are easy to use, but if you send the workbook to a colleague, they also need the utility installed to see the results. Without the utility, all calculations are replaced by #NAME? errors.

Dialog-Driven Deming Regression

The utility features a simple dialog that helps to calculate the Deming Regression results. To use the dialog, click on the Deming Regression item in the menu or ribbon tab, shown above. A simple dialog appears.

Deming Array Formula Calculations

Select the appropriate ranges for the inputs and outputs, press OK, and the program fills in the results.

Calculations made using the dialog are not lost when viewed on a machine that does not have the utility installed.

Download and Install the Utility

Click this PTS Deming Regression Utility link to download the utility. The download is a simple Excel add-in. Save it to any convenient directory, then install it following the instructions in Installing an Excel Add-In or Installing an Add-In in Excel 2007.

References

Chapter 17:  Creating  Custom  Functions, Excel for Chemists: A Comprehensive Guide, 2nd Edition. by E. Joseph Billo, Copyright 2001 by John Wiley & Sons, Inc.

Incorrect Least-Squares Regression Coefficients in Method-Comparison Analysis by P. Joanne Cornbleet and Nathan Gochman, Clin. Chem. 25/3, 432-438 (1979).

Related Posts:

Bookmark and share this entry:
  • Twitter
  • Digg
  • Facebook
  • LinkedIn
  • del.icio.us
  • Technorati
  • StumbleUpon
  • Google Bookmarks
  • Reddit
  • MySpace
  • Slashdot
  • Yahoo! Buzz
  • Tumblr

Learn how to create Excel dashboards.

Comments


Comment from DaleW
Time: Tuesday, October 6, 2009, 12:49 pm

Hi Jon,

If I follow correctly, your Deming regression utility is designed for analyzing pairs of data where each point is immediately replicated, hence the requirement that both X and Y ranges have an even number of points?

Those replicates allow the algorithm to have a good estimate of the variance ratio for the errors (simple measurement repeatability, ignoring any bias to the nominal value). This Y to X variance ratio — that is assumed infinite for ordinary least squares where all the assumed error is Y — seems to be about 0.306 in your example dataset. (This might be another useful output for your utility?)

If I round to four decimal places, your Deming regression slope was given as 0.9940, your Demining utility gave 0.9942 for the rounded data on your screenshot, and I got an estimate of 0.9939 inputting my quick estimate of the variance ratio and your screenshot data into a commercial statistics package that supports Orthogonal Regression. The statistical uncertainty at 95% confidence level for the orthogonal slope is a couple orders of magnitude greater than these order 0.0001 differences, so I’m going to say our results agree.

Probably a better test case would have lower correlation between the two variables, so some meaningful differences might emerge by comparison with Ordinary Least Squares (both Y v. X and X v. Y).


Comment from Jon Peltier
Time: Tuesday, October 6, 2009, 2:01 pm

Dale -

When I “made up” the data, I used Excel’s NORMINV to compute the errors, and I think I used sigma values that differed by about 3:1, with higher variance in X. I was playing around to find points that looked good on screen.

Do you have any real world data that would satisfy your last sentence? It would be interesting. I presume you want more deviation than in my last chart in my first post?


Comment from DaleW
Time: Tuesday, October 6, 2009, 9:54 pm

Jon -

Unfortunately, the example from your prior post on Deming regression doesn’t show us that Deming regression will give a noticeably different result than ordinary least squares (OLS) regression.

The slope (Y to X) of a best fit line is about 0.993 by OLS and 0.994 by reverse OLS, with a standard error of 0.014 for either, so we can’t say whether the true slope is slightly more or less than one. Orthogonal regression can try to more precisely place the line between those limiting cases (no error in X or no error in Y), but it is such a small tweak here that it makes no practical difference.

To make a real difference, r-squared, which also equals the ratio between those two OLS slopes(!), generally has to be a lot less than 100%. It was 99.84% for your example. On the other hand, for a good measurement system, the vast majority of the total variance is explained by the sample differences, not the machine differences, so r-squared between paired machine measurements across the same samples tends to be very high.

Although I would prefer not to locate an example in my files of a bad enough measurement system with replicated measurements so that Deming regression could have been helpful, I’ll have to check . . .


Comment from Jon Peltier
Time: Tuesday, October 6, 2009, 11:41 pm

Dale -

Admittedly, my machine example was a hack. Obviously when the two machines agree anywhere near as close as this, there is no practical difference between them. Your suggestion for an unpaired analysis is a good one. I’ll check about altering the algorithm.


Comment from Matt Healy
Time: Thursday, October 8, 2009, 10:30 pm

To follow up on previous comments, most variants of linear regression have the line pass through the point (mean_x, mean_y), so the difference is in the slope. If we call m1 the slope from regression of Y against X, and m2 the slope from regression of X against Y (swapping variables), then m1 times m2 will equal r-squared (try it yourself!). Of course r-squared will be the same either way. If one assumes the error in each variable is proportional to its value, then the best estimate for the slope will be simply the ratio of the standard deviations of X and Y.

Of course robust regression techniques use shrinkage estimators, order statistics, and the like to reduce the impact of outliers on the slope, which is another whole ball game. OLS and its variants assume an underlying Gaussian distribution but if the data come from a heavy-tailed distribution then OLS is too sensitive to outliers.


Comment from DaleW
Time: Friday, October 9, 2009, 10:31 am

To follow up on Matt’s comments: as an extension of OLS, Deming regression fits an XY slope m through the center of one’s data somewhere between m1 by OLS and 1/m2 by reverse OLS, depending on one’s estimate of the ratio of residual variances in the two variables.

Jon’s utility currently covers a very special case where the experimenter replicated every measurement to allow a good estimate of this Y:X variance ratio.

At the other extreme, in the absence of information, a data miner might guess that the error in each variable is proportional to its scatter (standard deviation), and Deming regression splits the difference, bisecting our data ellipse along its primary axis, with a slope that simplifies to sqrt(m1/m2).

Jon, the VBA code that you cited by E. Joseph Billo for that special case could be very easily modified to use a user selected estimate of the variance ratio “lambda” for the general XY regression problem, wouldn’t you say? Add radio buttons to the dialog for three or four limiting special cases, and you’d have a very cool Excel utility for general Deming regression (linear orthogonal regression for two variables).


Comment from Jon Peltier
Time: Friday, October 9, 2009, 11:22 am

Dale – You’ve been working on me for several days now, and my resistance is crumbling. I’ll start planning it soon.


Comment from Mike
Time: Thursday, November 5, 2009, 11:04 pm

Hi

When I use PTS Deming Regression Utility in Excel 2007 I get the following message:

“Run-time error ‘13′: Type mismatch”

Can you help me sort this out to get the utility working for me?

Thanks

Write a comment

I welcome comments from my readers. If you have an opinion on this post, if you have a question or if there is anything to add, I want to hear from you. Whether you agree or disagree, please join the discussion.

If you want to include an image in your comment, post it on your own site or on one of the many free image sharing sites, and include a link in your comment. I'll download your image and insert the necessary html to display the image inline.

Read the PTS Blog Comment Policy.





Subscribe without commenting

Peltier Tech Waterfall Chart Utility Peltier Tech Cluster-Stack Chart Utility Peltier Tech Box and Whisker Chart Utility Peltier Tech Marimekko Chart Utility Peltier Tech Dot Plot Utility Peltier Tech Cascade Chart Utility

Create Excel dashboards quickly with Plug-N-Play reports.