Excel Interpolation Formulas

In Getting value on Y axis by putting X axis value on the Mr Excel forum, someone wanted to know how to find in-between values of a function, given some known data points. The approach, of course, is to interpolate values given the known points on either side of the value you need.

Interpolation requires some simple algebra. The diagram below shows two points (blue diamonds connected by a blue line) with coordinates (X1, Y1) and (X2, Y2). We need to find the value of Y corresponding to a given X, represented by the red square at (X, Y).

The smaller triangle with hypotenuse (X1, Y1)-(X, Y) is “similar” to the larger triangle with hypotenuse (X1, Y1)-(X2, Y2), so the sides of the triangles are proportionally sized, leading to the first equation below the sketch. We rearrange this to solve for Y, in the second equation.

Algebraic construction for interpolation

We’ll set up our interpolation in the example below. Our data is in A5:B18, and the known values are plotted as blue diamonds connected by blue lines in the chart.

The analysis has two parts: first we need to determine which pair of points to interpolate between, second we need to do the interpolation. We will judge the validity of our interpolation by plotting the calculated point on the same chart.

Interpolating set-up in worksheet

Solving for Y

I’ve put the calculations above the data table. The yellow shaded cell, A2, holds the known X value, and a formula in cell B2 holds the calculated Y value. Cell A3 indicates which pair of points to interpolate between. The formulas are:

A3: =MATCH(A2,A6:A18)

B2: =INDEX(B6:B18,A3) + (A2-INDEX(A6:A18,A3)) * (INDEX(B6:B18,A3+1)-INDEX(B6:B18,A3)) / (INDEX(A6:A18,A3+1)-INDEX(A6:A18,A3))

We want the Gauge value (Y) when the Flow value (X) equals 3, so this is entered into the yellow shaded cell A2. The formula in A3 tells us that our computed point is between the 7th and 8th data point, and the formula in B2 calculate Y=0.444, and the calculated point (A2, B2) is the red square that lies along the plotted data points. Looks good.

Interpolating to solve for Y - example 1

Te determine the Gauge value for a Flow value of 12, we enter this into A2. The red square moves along the blue line past the 11th point, where Gauge=0.548.

Interpolating to solve for Y - example 2

Solving for X

With a minor rearrangement, we can instead solve for Flow, given a value for Gauge. The known Gauge value is entered into B2, shaded yellow. B3 indicates the pair of points to interpolate between, and A2 provides the value for Flow. The formulas are:

B3: =MATCH(B2,B6:B18)

A2: =INDEX(A6:A18,B3) + (B2-INDEX(B6:B18,B3)) * (INDEX(A6:A18,B3+1)-INDEX(A6:A18,B3)) / (INDEX(B6:B18,B3+1)-INDEX(B6:B18,B3))

For a Gauge (Y) of 0.53, we compute a Flow (X) of 9.790. The red square shows where the calculate point lies along our plotted data.

Interpolating to solve for X - example 1

Changing the Gauge value to 0.35 moves the red square way to the left, to a Flow value of 0.400.

Interpolating to solve for X - example 2

Peltier Tech Chart Utility

Comments

  1. As they say on quiz shows, these things are only easy if you know the answer.
    All that’s missing now is an extra series plotting the horizontal and vertical lines to the axis’s. If you then add the label to the y-intercept points and fix the formatting, you can have a nice box where the user enters a number and the graph shows/tells the y-value

  2. Calvin -

    All you need to do is put error bars, with percentage of 100 in negative X and Y directions, to reach the axes. Then add a data label.

    Interpolating with droplines and labels

  3. What’s the difference between your interpolation formula and FORECAST. Using FORECAST, I get slightly different numbers, but I’m not sure why.

    =FORECAST(A2,OFFSET(INDEX(B6:B18,MATCH(A2,A6:A18)),0,0,2,1),OFFSET(INDEX(A6:A18,MATCH(A2,A6:A18)),0,0,2,1))

  4. Dick -

    I copied your formula exactly and pasted it into my sheet, then tried a bunch of X values. In all cases, the Y values were identical, whether I used your formula or mine. Of course, this is in Excel 2003, back when Excel was taught math.

  5. Is it possible to pull the equation from Trendlines out and use it to extrapolate? So if the best r squared is a polynomial function we could use the trendline equation to extrapolate instead of doing the linear assumption between two points?

    Just wondering…

  6. Oh, that makes me feel better. I hand typed your values and I’ll be there was some precision that was hidden.

  7. Dick – Guess you’re not a very accurate typist.

  8. Eric -

    You can use LINEST to compute the coefficients in the worksheet. See Bernard Liengme’s tutorial: Polynomial regression: How can I fit my X, Y data to a polynomial using LINEST?.

    That said, you shouldn’t just use R² blindly to decide which fit is best. Beyond order 2, a polynomial fit may have a great R², but it also might deviate wildly from the smooth curve you might draw by hand. I’ve written about these issues in Choosing a Trendline Type, Trendline Fitting Errors, and Polynomial Fit vs. Statistical Process Control.

  9. Well, yes, of course. I guess my point was badly made and my real question badly asked.

    If the data turns out to be just badly sampled, say it was some high frequency sine wave and you happen to sample at the right points, your data could look like a flat line would be the best fit line. So perhaps a linear estimation of the point you want to estimate, given the poorly sampled data, would not be the best way of going at it. 99 times out of a 100, though, it probably will.

    The nice thing about trendlines is you can see how good your fit is to the data fairly quickly. We could use all sorts of ways to more accurately figure out the function that the data actually seems to conform to, but Trendlines seems to do this quickly if not precisely.

    Is there some Trendline object that we can get at to get that nice little formula that you can have put onto your chart when you use Trendlines? I since have played with it and found that there is… ActiveChart.SeriesCollection(1).Trendlines(1).DataLabel.Text is what I was looking for. As it is text, you’d have to parse it and so it might be easier to do the other methods you mentioned, but it might be doable.

    Anyways, that’s what I was badly referring to.

  10. why not use trend?
    I use it excessively. use index(match()):index(match()+1) for the X and Y range and you’re done.

    the error bars are a new one – I usually set up a start and end point, but this can be a bother when your axis aren’t at zero.

  11. after posting the trend I wondered:

    I’ve spend many hours trying to find a nice, concise way to do quadratic or cubic interpolation.
    Of course it’s possible to use linest() to find the formula variables… but is there another way?

    btw, another nice feature of the trend function is that is allowas you to do multiple variable linear interpolation.
    When you have a graph with multiple lines for different conditions (say your flow/guage ratio is dependent on temperature. You can find the four points bracketing your search condition using the match() function, and then input a 2×2 matrix as you input X, and a 1×2 as your target X and watch the magic happen!

  12. Using minus error bars at 100% to show the intercept with the axis? That’s awesome, never thought of that!

  13. Ikkeman – Below are some Excel formulas for various types of fit (see Wikipedia interpolation article for more details). These return Y for a given X value, to find X given Y we can just swap the A’s and B’s in the formulas.

    As Jon has said before, despite their popularity polynomial trends are a poor choice for fitting many types of data. The other methods are quite close for the given data as the direction of the curve is not changing much around the estimated points.

    Interp Trend Linear Cubic Spline
    9.789877 0.53174 0.53000 0.53074 0.53025
    0.400368 0.34602 0.35000 0.35336 0.35096

    Trend – Cubic polynomial (equivalent to polynomial trend line order 3):
    =TREND(B6:B18,A6:A18^{1,2,3},A2^{1,2,3})

    Linear – For increasing data (or for decreasing put 1- after first comma):
    =PERCENTILE(B6:B18,PERCENTRANK(A6:A18,A2,20))

    Legendre – A cubic fit based on the surrounding 4 points:*
    =TREND(OFFSET(B6,MATCH(A2,A6:A18)-2,,4),OFFSET(A6,MATCH(A2,A6:A18)-2,,4)^{1,2,3},A2^{1,2,3})

    Spline – Catmull-Rom (equivalent to “smooth line” option):*
    =SUM((1+1/IRR(MMULT({0,0,2,0;0,1,0,-1;-1,4,-5,2;1,-3,3,-1},OFFSET(A6,MATCH(A2,A6:A18)-2,,4)-A2
    )))^-{0;1;2;3}*MMULT({0,2,0,0;-1,0,1,0;2,-5,4,-1;-1,3,-3,1},OFFSET(B6,MATCH(A2,A6:A18)-2,,4)))/2

    * for estimation in end intervals you can use extrapolated points at each end:
    A5:B5 = {-0.10022,0.29} A19:B19 = {21.82408,0.616791}

  14. Ihem – thanks for sharing that compact matrix formula for one type of cubic spline. That may be the most impressive non-array formula that I’ve ever seen in Excel. I’d never thought of using the Internal Rate of Return function for fitting non-financial data.

    If anyone ever wants a smoother interpolation than piecewise linear, cubic splines are usually the way to go. Fortunately there are good free add-ins that make cubic splines easy to calculate in Excel.

  15. DaleW – Happy to share, took a while to figure out! It’s a little known fact that you can convert nearly any array formula to a non-array formula with MMULT, eg
    Array =SUM(IF(A1:A5>2,B1:B5)) Non-Array =SUM(IF(MMULT(N(A1:A5>2),1),B1:B5))

    The straight/smooth chart line options have the property that they maintain the same shape if the configuration of points is rotated. This can be seen in the formulas since X on Y followed by Y on X returns the original result with the linear/spline options, but not for trend or Lagrange polynomials (not Legendre, always get those guys mixed up!) VBA code for chart data is here

    Splines have the intuitive appeal that they model the shape taken up by a smooth thin flexible strip passing through the set of points. In the linear case, the strip is pulled taught whereas in the cubic case there is no tension at the endpoints.

  16. Ihem – Interesting, I’d never seen MMULT() used to replace an array formula. In real life, we’d both use =SUMIF() for your example to have error checking, as otherwise a string entry in column A gives what we’d typically consider a bug. MMULT does duplicate even such … features.

    Natural cubic splines or Catmull-Rom cubic splines for exactly 4 points would follow your zero tension at the endpoints analogy. Other interpolating cubic splines have tension. Cardinal splines use piecewise cubic polynomials with a tension parameter ranging from 0 (for Catmull-Rom) to 1 (where they collapse to piecewise linear interpolation). Thus Cardinal splines can replicate straight/smooth lines, or anywhere in between.

  17. DaleW – You make excellent points. For the MMULT method, a better example would be to apply to: =MIN(IF(A1:A5=”a”,B1:B5)). I don’t know of another way to make this type of formula non-array entry, even using INDEX(A1:A5,0) fails.

  18. I find array formulae should be avoided ilke the plague if there’s even a remote possibility anyone else will be using the spreadsheet because you’ll end up tech-supporting it when they hit enter on a cell. Either redesign with extra columns if possible and if not then write a user defined function in one of the modules.

  19. what about if you have columns and rows. for example 6,8,10( in the column) and 8,10,12 (row). and under those is an array of set of values. like if i want to canculate

    8 and 10.4

  20. Luu -

    If the two values are between known values (say, 8.4 and 10.4), you have to interpolate first to get values at 8.4 for 10 and for 12 (two interpolations), then interpolate again between these results to get a value at 10.4.

  21. Hi,

    I’m no mathematician but i have tried my best to follow your logic………..

    I was wondering if there was a way to Interpolate a point on a line between 2 known points. Not knowing the X or Y value? Example of data:

    X Y
    1 6.3 2.12
    2 7.6 2.19
    3 8.6 2.19
    4 9.6 2.16

    On a Scatter with Smooth Lines & Markers graph the peak of the line will be between point 2 & 3. Is there a way to Interpolate that point? (manually graphing it would give me an interpolation point of X=8.0 & Y=2.20) (X rounded to 0.5)

    I appreciate your assistance in this matter!

    Daniel

  22. Daniel -

    You want to find where the smoothed drawn line has its maximum? I usually advise people not to use smoothed lines. They give a false indication of where data may go. They are not based on any physical model of the data, but only on artistically connecting all points in a pretty way, using curve-drawing parameters that “looked nice” to the same kind of designer that gave us Excel’s early color palette. Smoothed lines ignore any measurement error in the data which is actually present, they ignore all points in the data set other than the two being connected and the point on either side, and if markers are not displayed the viewer has no idea where the actual data lies.

    That said, for symmetric points, the usual approach is to fit a quadratic curve (2d order polynomial) to the points, and calculate a maximum using the fitted coefficients. A quadratic fit is probably reasonable, since physical processes often can be modeled by linear and quadratic relationships.

    Your points are not symmetric, so the fitted parabola does not pass through all the points; instead it lies in such a way as to minimize the total deviation of the parabola from the data points. You could fit a third order polynomial to your data and calculate your maximum, but this starts to get into the realm of overfitting.

  23. Daniel,

    Realistically, close to X=8 is your best guess for the maximum.

    If you had really accurate measurements, an approach known as cubic splines would be great for fitting a smooth curve exactly to your data. In fact, the smooth lines that Excel charts draw are one type of cubic spline.

    With four points, Excel can fit a cubic (3rd order) polynomial smoothly and exactly through the points. If you play with that equation, you’ll see it predicts the maximum at just over 8.08, and some people would consider that the best answer. (It’s really close to the natural cubic spline solution, also.)

    Jon’s right that anything past a linear fit without a physical model or lots of supporting data is risky or wishful thinking. But if we are convinced that the data is highly accurate and don’t have a model, cubic splines are commonly used to create the smooth interpolating curve. In Excel, if we have two nearest neighbor points on each side of the region of interest, a cubic polynomial tends to be a very close approximation to the cubic spline fit there.

  24. Hi Jon

    I have you example above to solve for X using data below and it does not work. I followed your example and the subtituted with my data and it gives me an error message on A2. Any idea why?

    X Y
    0.12 3
    0.15 2.67
    0.18 2.45
    0.22 2.24
    0.28 2.02
    0.3 1.94
    0.35 1.82
    0.38 1.74
    0.46 1.6
    0.54 1.48
    0.6 1.42
    0.68 1.34
    0.78 1.26
    0.9 1.19
    1.05 1.12
    1.2 1.06
    1.4 1

  25. Tshepo -

    Did you change the formulas to include your whole data range (mine stopped at row 18, yours continues to row 22)?

    Cell A2 is where you enter an X value for which you want to calculate a Y value. It should not contain an error. What X value were you entering?

    Was the error in cell A3? Was it because the requested X value was outside of the range of your X data?

    A more effective approach for your data might be a power law fit: such a fit with your data has an R² value of 0.9991.
    Power Law Fitted Data

  26. Tshepo Phutha says:

    Hi Jon
    I did expand data to row 22 and I am trying to solve X not Y. I get an error on B3 and A2 but when I use data from your example it work fine.

  27. Tshepo -

    So:
    B2 has the known Y
    B3 =MATCH(B2,B6:B22)
    A2 =INDEX(A6:A22,B3) + (B2-INDEX(B6:B22,B3)) * (INDEX(A6:A22,B3+1)-INDEX(A6:A22,B3)) / (INDEX(B6:B22,B3+1)-INDEX(B6:B22,B3))

    But my formula assumed the range being matched was monotonically increasing, which I never stated. If I include the default value of the optional third parameter, the formula is
    =MATCH(B2,B6:B22,1)
    which tells Excel the data is increasing and to return the index the closest value greater than the sought value. Your formula needs to be
    =MATCH(B2,B6:B22,-1)
    which tells Excel the data is decreasing and to return the index the closest value less than the sought value.

    I still suggest using a power law fit.

  28. Tshepo Phutha says:

    Jon

    Thank you very much for help. I am working on learning on how power fit works.

    Thank you very much

  29. Hi Jon,

    I’m also finding a very strange error using the spline interpolation formula. I am interpolating the following set of data:
    x y
    1.798179868 165.3955251
    1.804183302 165.0206621
    2.230427067 139.8702098
    2.428540367 113.9704961
    2.5966365 65.08009679
    2.752725766 21.02938029
    2.842777266 25.76285129
    2.932828766 20.8522448
    3.022880265 47.45262432
    3.148952365 30.61012109
    3.275024465 26.35386096
    3.443120598 15.5813369
    3.869364363 -31.93964513
    4.031457063 -44.2596589
    4.133515429 -59.06686391
    4.235573796 -77.81866744
    4.313618429 -109.5812402
    4.391663062 -124.4431029
    4.469707695 -185.7945248
    4.547752328 -209.4166568
    4.601783228 -224.5765144
    4.667820995 -267.6385983
    4.733858761 -341.6070167
    4.787889661 -402.0504547
    4.841920561 -424.5273038
    4.895951461 -484.5497172
    4.949982361 -513.475427
    5.376226126 -645.3444851
    5.394236426 -648.4536159
    5.40023986 -650.1201883

    When I interpolate, I find that the following x values generate a #NUM! error:
    4.901954894
    4.901954894
    4.901954894
    All the other values I am trying (less than and/or greater than these, and one in between the first two) seem to interpolate fine.

    The x values at which I am trying to interpolate are part of series of data. I am doing a calculation for some y values, and then interpolating the remaining x values. When I make small changes to the x value range, or use a different number of calculated points, there is often no problem at all. Like the example above, the y values follow a generally decreasing trend, but this is not always the case and y may go up and down. The x values are always monotonically increasing.

    Let me know if you want any additional info. Thanks for any help you can provide in diagnosing this. It is a very vexing problem!

    -Charlie

  30. I forgot to mention above that, when I switched to the “Legendre” interpolation formula, THE EXACT SAME POINTS produced the same #NUM! error.

    Maybe this is a hint about the source of the problem…

    -Charlie

  31. Ooops, a correction… the “Legendre” formula seems to work just fine. I discovered an error that I had made when entering it. The Spline still does not work, however.

    I have created a spreadsheet with the data mentioned above, and a correction to the “problem” data points to help diagnose the problem. It contains the SPLINE and LEGENDRE interpolation formulas, and shows that the SPLINE fails when one of the problem x values is entered.

    You can download it at:
    http://claub.net/temp/Spline_Error_Test.xls

    Thanks for any help!!!!

    -Charlie

  32. Hi CharlieL,

    Alas, this particular formula for this particular cubic spline indeed fails for part of your dataset.

    That segment of your Catmull-Rom spline happens not to be monotonic for predicting Y from X. This causes the IRR() formula to fail rather interestingly in at least 5 subintervals of that segment’s range in X, for X values that are monovalent instead of trivalent!?

    While I was able to calculate and chart the real Catmull-Rom spline directly from its defnition as a t-function-weighted sum of the two nearest points in each direction in X, you possibly don’t want a spline that behaves this way to fit your dataset. Perhaps a natural cubic spline or a monotonically constrained cubic hermite spline would be a better choice? The Newton Excel Bach blog has done a wonderful job setting up VBA code to let Excel users explore a variety of different cubic splines.

  33. PS — Far worse than not being monotonic, it’s not monovalent. This type of spline can reverse direction in a way that gives multiple Y values for the same X value, and the IRR formula doesn’t work in some subsections of segments with that behavior. Wrong spline if you need a unique Y=f(X) fit.

  34. Hi DaleW,

    Thanks for posting an explanation of the problem. I do need a worksheet-formula-only way to do the interpolation (e.g. no VBA or macros), and I would like a smooth curve a la spline or similar. For now I have switched to the Legendre formula, and that seems to work OK for the type of data I am generating. I am still learning about how to implement matrix multiplication via the MMULT function, and I am not up to coming up with something new myself at this point, although I would like to learn how to do it. Also, it’s not clear to me whether the problem that I described in my post was due to a limitation of the Catmull-Rom spline itself, or the implementation via the formula. Can you please elaborate on this?

    I need the interpolated line to go through the points in the data set. You mentioned that this is the “wrong” spline – how about a B-Spline? For instance, the Uniform Cubic B-Spline described here:
    http://en.wikipedia.org/wiki/B-spline#Uniform_cubic_B-splines
    has an explicit matrix form that seems to lend itself well to this approach (from what I can tell). Can you (or someone else reading this post) help me implement it using IIR and MMULT?

    Thanks!

    -Charlie

  35. Charlie,

    B-Splines don’t go through their control points, which is inconvenient when you want to interpolate actual data.

    Catmull-Rom splines do interpolate their control points, but they aren’t constrained by the concept that there should be one and only one Y value for each X value. I believe some IRR() problems have more than one rate of return that solves the constraints, but a “well-defined” mathematical function can only return one value for a given value of X. Thus IRR() fails to capture parts of some Catmull-Rom spline segments, which a weighted sum of the control points in terms of a t parameter going from 0 to 1 in each segment does reveal.

    Assuming you’d like a “well-defined” and smooth continuous interpolating function for Y on X, I would recommend the classic solution of natural cubic splines. Usually this is implemented in programming languages, and I’ve got more than one VBA solution squirreled away. I don’t imagine that solving the sparse symmetric matrix equations for under 50 points would be that hard to program using the cells of a spreadsheet, if one were so inclined.

    The Legendre formula is the cubic polynomial solution, or the simplest natural cubic spline case. The Legendre formula will not give you continuous derivatives if you use it for more than 4 total points. In one mathematical sense, natural cubic splines offer the smoothest possible interpolation. Natural cubic splines aren’t likely to be used for graphics because it isn’t a local solution like Legendre or Catmull-Rom, and every point instead of just 4 local points can have some impact every segment of the spline, making it more computationally intensive.

  36. Hi DaleW,

    Now that I checked into it further, you are correct about B-splines not passing though each data point. A cubic Spline would be the best but my data set can be up to 600 points, and doing the large matrix for that would be too much overhead. The Lagrange interpolation seems to be “good enough” for me, despite the occasional cusp in the interpolation where there is a derivative discontinuity.

    -Charlie

  37. Charlie,

    Good enough!

    Incidentally, your dataset can be used to show that Excel must modify the Catmull-Rom spline algorithm to draw its smoothed lines, because Excel smoothed lines usually overlap that spline yet avoid the pathology that you found. As Answers.Microsoft notes, “An adjustment to the basic method is applied if consecutive points are a factor of three or more apart so as to reduce the amount by which the curve overshoots points.” I interpret this as a sort of false position method, where any outer control point that further that 3x the distance between the inner control points is pulled closer. Thus on your troublesome (X,Y) interval between (4.895951461, -484.5497172) and (4.949982361, -513.475427), the next point is treated as if it were closer at (5.2305,-600.2523) for the purposes of the Catmull-Rom algorithm on that segment only, which removes the strange behavior and restores single-valued functionality and lets the IRR() formula work, as well as duplicates the Excel smoothed line exactly on my plot. This might be more trouble than it is worth for you to duplicate Excel’s native interpolating spline — but it would be a lot easier than solving the tridiagonal matrix equation for natural cubic splines. You’d just need to create a table of the 4 XY control points for each of N-3 inner segments where you say your full table is N=600, and the Catmull-Rom OFFSET logic would have to point to that table. This would provide C1 continuity (of 1st derivatives), but not the C2 continuity (of 2nd derivatives also) that is provided by natural cubic splines. Your Legendre solution only has C0 continuity (of the function itself) at the data points (but C2 continuity elsewhere like all cubic splines within each segment, not necessarily between segments).

    BTW, are you sure you want spline interpolation instead of LOESS smoothing?

  38. Dale -

    Brian Murphy of UT Austin worked out bezier algorithms that seem to have been used for smoothing series lines in Excel charts. I think the details can be found at xlrotor.com.

  39. Jon – yes, I saw that. I believe that the Bezier algorithms may be very similar, but MS Excel actually uses modified Catmull-Rom splines, per the July 4, 2011 post by Answers.Microsoft in response to Tushar Mehta. The VBA code is there, but I think that I did duplicate quickly the Excel charting smooth curve for this troublesome case by using a false position for a far outer control point as I tried to describe using both Catmull-Rom formulas (the quick IRR() one discussed on your blog, and the formal definition in terms of weighted sums of the 4 basis points using t-parameter polynomials).

  40. Charlie (and Jon),

    Those Bezier algorithms match Excel’s smoothing to within a few pixels for the troublesome segment. They are in VBA and parametric, so it might take quite a bit of work to implement them to evaluate a particular X value without VBA.

    Without using VBA, I picked up on your idea and implemented Ihem’s IRR() and MMULT() formula for Catmull Rom with more or less the modification suggested by Lori Miller for highly unevenly spaced data, and it looks good to within a pixel or two compared to Excel’s native curve fitting, even for your segment where simple Catmull Rom misbehaved. The key formula to find the adjusted spline Y for any X had to be switched to an array formula to work for me:

    {=SUM((1+1/IRR(MMULT({0,0,2,0;0,1,0,-1;-1,4,-5,2;1,-3,3,-1},(TRANSPOSE(OFFSET(xQuad,$Q5-1,0,1,4))-$O5))))^-{0;1;2;3}*MMULT({0,2,0,0;-1,0,1,0;2,-5,4,-1;-1,3,-3,1},TRANSPOSE(OFFSET(yQuad,$Q5-1,0,1,4))))/2}

    The transpose was necessary with matrix logic to accommodate a table (starting at xQuad and then yQuad) of possibly linear-interpolation adjusted outer influence points created by IF logic, with 4 XY points defined in 8 columns shortly after each inner interval as defined by the original XY data columns in A & B. None of this extra effort is needed if the X data is roughly evenly spaced, but your data wasn’t. The Q column holds the Index from Match(), again with slight modification for the endpoints. The O column held the arbitrary X points to evaluate.

  41. CharleyL – Interesting example and glad the suggested formulas have been useful to you. Strangely, in my Excel 2010 version the spline formula does actually return values for the cases highlighted but further investigation reveals this is due to improvements in the IRR implementation in this version. I find the formula matches the curve for the most part but there are a few spurious values where it is picking up another root. An alternative is to use the more natural version of the formula with 1/IRR(MMULT({0,0,2,0;0,1,0,-1;-1,4,-5,2;1,-3,3,-1} replaced by IRR(MMULT({0,2,0,0;-1,0,1,0;2,-5,4,-1;-1,3,-3,1} which does return the correct values for your example. I seem to remember the point of using 1/IRR of the reversed matrix was related to avoiding 0^0 errors when the formula was evaluated exactly at a data point but these can be easily dealt with by adding a negligible constant like 1E-99. Note also the formula is for a 2D spline but could also be modified to a 1D version which would ensure single-valued results. In many cases where data is fairly evenly spaced there won’t be a large difference between these two methods.

    DaleW – Nice analysis. The modification that the charting algorithm makes can be seen in various ways. Suppose there are four consecutive points A,B,C,D and you want to construct the curve segment between B and C. In terms of Hermite splines the tension parameter implemented by the chart algorithm is chosen to be t=max(3,|AC|/|BC| ,|BD|/|BC|)/6 (based on this definition of tension and where |.| denotes distance between points as measured on the chart ). In terms of Bezier curves this condition enforces that the control points are never more than a distance |BC|/2 from B or C so do not cross each other. For most normal cases where points are fairly evenly spaced this reduces to the Catmull-Rom case (t=1/2) . Doug Jenkins has many more useful posts and VBA functions on these topics which are well worth looking at as you say – there are a few formulas and comments related to the connection with natural splines are here.

    Jon – The xlrotor code is normally fairly close but can deviate quite substantially in some cases. For example entering these (x,y) values: (9,8) (10,12) (11,12) (14,6) in the Bezier example file link gives clearly incorrect values for 10 < x < 11. The vba code posted at answers.microsoft may be the first accurate algorithm available publicly.

  42. Hi Jon Peltier et al.-
    I’m not a math expert. But after trying to fit your equations for the last 2 hrs, finally I gave up! And really don’t understand why the interpolation is not working for my database (excel file link below)

    http://www.mediafire.com/view/?hdpdpu641qdpxak

    I would be very glad if I find any solution. Thanks- Rajib

  43. There was no data in the posted worksheet, so I couldn’t interpolate values either.

  44. Sorry Jon Peltier, There was something wrong. Please find the data in excel as well as in a tab delimited .txt file below:

    http://www.mediafire.com/view/?1bo9he3c3jz05u7
    http://www.mediafire.com/view/?tmyuta23ajjj27i

    Best/
    Rajib

  45. The simple linear interpolation methodology worked for me. I used the equations to linearly interpolate y values for a column of x values.
    Many Thanks

  46. I’ve been trying to get this to work with a very small set of yearly data so:
    X = year in the format 2001
    Y = a price
    I have this data for 12 years
    i want to estimate a monthly value in any of those twelve years.
    However, i’m struggling to do this with your spreadsheet.
    For example i tried 2001.083 which could predicate a month
    however i get a ref error in A2 and i dont think this is the best way of doing it.

    Ultimately i want to try and estimate each monthly figure based on a set of yearly final figure. So the value of months 1 through 12 should equal the current yearly figure.
    Hope this makes sens.e

  47. Cell A2 should have no error, because it’s the cell you enter the X value for which you want to determine a Y value. So with 2001.083 in A2 you should get a value that is 1/12 of the way from the 2001 value to the 2002 value in cell B2.

    Also, this sentence doesn’t make sense if you’re trying to estimate monthly values:
    “So the value of months 1 through 12 should equal the current yearly figure.”

  48. Pl indicate Excel program (MAC version) to use the Piece-wise Hermite Interpolation.

  49. Ravi -
    Try Google.

  50. Clear, concise, and exactly what I needed. Thanks a lot – HJ

  51. hi
    is this formula in excel uses for quadratic spline or another interpolation?

  52. Satria -
    The interpolation formulas I introduced in the post are simple linear interpolations.
    In the discussion, several commenters cited other kinds of fit, including splines.

Subscribe without commenting

Trackbacks

  1. [...] recent post at Jon Peltier’s Blog looks at an “on-sheet” method of performing linear interpolation on a set of tabular [...]

  2. […] More information about piecewise linear interpolation can be found here  and a solution using Excel formulas is available here. […]

Write a Comment

I welcome comments from my readers. If you have an opinion on this post, if you have a question, or if there is anything to add, I want to hear from you. Whether you agree or disagree, please join the discussion. If you want to include an image in your comment, post it on your own site or on one of the many free image sharing sites, and include a link in your comment. I'll download your image and insert the necessary html to display the image inline.

Subscribe without commenting

Peltier Tech Chart Utility

 

Create Excel dashboards quickly with Plug-N-Play reports.

 

Site by: Dawud Miracle, Business Coach & WordPress Websites