Excel 2007 Regression Error – Fixed in SP1
by Jon Peltier
Wednesday, May 21st, 2008
Peltier Technical Services, Inc., Copyright © 2010.
Licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.
In Calculation Bug Fixed, an old but ongoing thread on the Daily Dose of Excel blog, I mentioned a problem with Excel 2007’s trendline regression formulas. Apparently small regression coefficients were treated like the small errors that occur in the insignificant digits when converting from binary to decimal. I had documented the error in May of 2007. While following up on Daily Dose, I tested in Excel 2007 SP1 and found that the error has been corrected. I’m not here to raise unnecessary alarms among users who have already updated to SP1, but to report that the problem has been fixed, and to describe the errors for those who have not yet updated.
The problem is illustrated with datasets having small Y values. The table below uses the same X values for every series, and Y values that differ only by a power of ten:
| X | Y1 | Y2 | Y3 | Y4 | Y5 | Y6 | Y7 | Y8 |
| 0.1111 | 2.872E-19 | 2.872E-18 | 2.872E-17 | 2.872E-16 | 2.872E-15 | 2.872E-14 | 2.872E-13 | 2.872E-12 |
| 0.0625 | 4.043E-19 | 4.043E-18 | 4.043E-17 | 4.043E-16 | 4.043E-15 | 4.043E-14 | 4.043E-13 | 4.043E-12 |
| 0.0400 | 4.576E-19 | 4.576E-18 | 4.576E-17 | 4.576E-16 | 4.576E-15 | 4.576E-14 | 4.576E-13 | 4.576E-12 |
| 0.0277 | 4.814E-19 | 4.814E-18 | 4.814E-17 | 4.814E-16 | 4.814E-15 | 4.814E-14 | 4.814E-13 | 4.814E-12 |
The data is charted below on a semilog plot, so that all of the curves fit in the same chart, with the trendline formula next to each curve.

The regression coefficients are reproduced below.
| Magnitude | Intercept | Slope | Correlation (R²) |
| E-12 | 5.49429E-12 | -2.35066E-11 | 0.999213 |
| E-13 | 5.49429E-13 | -2.35066E-12 | 0.999213 |
| E-14 | 5.49429E-14 | -2.35066E-13 | 0.999213 |
| E-15 | | -2.35066E-14 | 0.999213 |
| E-16 | 5.49429E-16 | | 0.999213 |
| E-17 | 5.49429E-17 | | 0.999213 |
| E-18 | 5.49429E-18 | | 0.999213 |
| E-19 | 5.49429E-19 | | 0.999213 |
All of the R² values are the same, and the slope and intercept values, where they appear in the formula at all, differ only by their power of ten. Apparently the algorithm that removes binary-to-decimal conversion errors overzealously removes some of the regression coefficients in data sets with small values.
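That the coefficients should simply scale with the data follows from the ordinary least-squares formulas. Here is a brief sketch, assuming the trendline is a standard least-squares line; the symbols a (intercept), b (slope), and c (the power-of-ten factor) are notation introduced for this illustration, not anything taken from Excel.

```latex
% Least-squares slope, intercept, and R^2 for points (x_i, y_i)
\[
b = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2},
\qquad
a = \bar{y} - b\,\bar{x},
\qquad
R^2 = \frac{\left[\sum_i (x_i - \bar{x})(y_i - \bar{y})\right]^2}
           {\sum_i (x_i - \bar{x})^2 \,\sum_i (y_i - \bar{y})^2}
\]

% Replace every y_i by c*y_i, with c = 10^k; the mean becomes c*\bar{y}
\[
b' = \frac{c \sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2} = c\,b,
\qquad
a' = c\,\bar{y} - c\,b\,\bar{x} = c\,a,
\qquad
R'^2 = \frac{c^2 \left[\sum_i (x_i - \bar{x})(y_i - \bar{y})\right]^2}
            {c^2 \sum_i (x_i - \bar{x})^2 \,\sum_i (y_i - \bar{y})^2} = R^2
\]
```

So a correct fit must report a nonzero slope and intercept at every magnitude, with R² unchanged. A blank coefficient cannot be a legitimate result for this data.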
The coefficients calculated using SLOPE(), INTERCEPT(), CORREL(), and LINEST() are not affected by this overcorrection.
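For readers who want an independent check outside of Excel, here is a minimal Python sketch that fits each series with the textbook least-squares formulas. It is not Excel's algorithm; the names `linear_fit` and `fit_all_magnitudes` are made up for this illustration, and the data are the values from the table above.

```python
# Independent least-squares check of the trendline coefficients.
# Assumption: an ordinary least-squares line, computed with textbook formulas.

X = [0.1111, 0.0625, 0.0400, 0.0277]
Y_BASE = [2.872, 4.043, 4.576, 4.814]  # mantissas of the Y values in the table


def linear_fit(xs, ys):
    """Return (intercept, slope, r_squared) for the least-squares line."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


def fit_all_magnitudes(exponents=range(-12, -20, -1)):
    """Fit the same data scaled to each power of ten, E-12 down to E-19."""
    results = {}
    for exp in exponents:
        ys = [y * 10.0 ** exp for y in Y_BASE]
        results[exp] = linear_fit(X, ys)
    return results


if __name__ == "__main__":
    for exp, (a, b, r2) in fit_all_magnitudes().items():
        print(f"E{exp}: intercept={a:.5E}  slope={b:.5E}  R2={r2:.6f}")
```

Every magnitude should come back with both coefficients present, which makes it easy to spot a trendline label that has dropped one.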
Comments
Comment from Patrick
Time: Friday, January 30, 2009, 3:21 am
I’m very glad you posted this…I was beginning to think I was the only person with this problem. I triple-checked my numbers, and they just weren’t making sense. Well-documented. Let’s see if SP1 does it.