Excel Interpolation Formulas

In Getting value on Y axis by putting X axis value on the Mr Excel forum, someone wanted to know how to find in-between values of a function, given some known data points. The approach, of course, is to interpolate values given the known points on either side of the value you need.

Interpolation requires some simple algebra. The diagram below shows two points (blue diamonds connected by a blue line) with coordinates (X1, Y1) and (X2, Y2). We need to find the value of Y corresponding to a given X, represented by the red square at (X, Y).

The smaller triangle with hypotenuse (X1, Y1)-(X, Y) is “similar” to the larger triangle with hypotenuse (X1, Y1)-(X2, Y2), so the sides of the triangles are proportionally sized, leading to the first equation below the sketch. We rearrange this to solve for Y, in the second equation.

Algebraic construction for interpolation

We’ll set up our interpolation in the example below. Our data is in A5:B18, and the known values are plotted as blue diamonds connected by blue lines in the chart.

The analysis has two parts: first we need to determine which pair of points to interpolate between, second we need to do the interpolation. We will judge the validity of our interpolation by plotting the calculated point on the same chart.

Interpolating set-up in worksheet

Solving for Y

I’ve put the calculations above the data table. The yellow shaded cell, A2, holds the known X value, and a formula in cell B2 holds the calculated Y value. Cell A3 indicates which pair of points to interpolate between. The formulas are:

A3: =MATCH(A2,A6:A18)

B2: =INDEX(B6:B18,A3) + (A2-INDEX(A6:A18,A3)) * (INDEX(B6:B18,A3+1)-INDEX(B6:B18,A3)) / (INDEX(A6:A18,A3+1)-INDEX(A6:A18,A3))

We want the Gauge value (Y) when the Flow value (X) equals 3, so this is entered into the yellow shaded cell A2. The formula in A3 tells us that our computed point is between the 7th and 8th data point, and the formula in B2 calculate Y=0.444, and the calculated point (A2, B2) is the red square that lies along the plotted data points. Looks good.

Interpolating to solve for Y - example 1

Te determine the Gauge value for a Flow value of 12, we enter this into A2. The red square moves along the blue line past the 11th point, where Gauge=0.548.

Interpolating to solve for Y - example 2

Solving for X

With a minor rearrangement, we can instead solve for Flow, given a value for Gauge. The known Gauge value is entered into B2, shaded yellow. B3 indicates the pair of points to interpolate between, and A2 provides the value for Flow. The formulas are:

B3: =MATCH(B2,B6:B18)

A2: =INDEX(A6:A18,B3) + (B2-INDEX(B6:B18,B3)) * (INDEX(A6:A18,B3+1)-INDEX(A6:A18,B3)) / (INDEX(B6:B18,B3+1)-INDEX(B6:B18,B3))

For a Gauge (Y) of 0.53, we compute a Flow (X) of 9.790. The red square shows where the calculate point lies along our plotted data.

Interpolating to solve for X - example 1

Changing the Gauge value to 0.35 moves the red square way to the left, to a Flow value of 0.400.

Interpolating to solve for X - example 2

Peltier Tech Chart Utility

Comments

  1. As they say on quiz shows, these things are only easy if you know the answer.
    All that’s missing now is an extra series plotting the horizontal and vertical lines to the axis’s. If you then add the label to the y-intercept points and fix the formatting, you can have a nice box where the user enters a number and the graph shows/tells the y-value

  2. Calvin –

    All you need to do is put error bars, with percentage of 100 in negative X and Y directions, to reach the axes. Then add a data label.

    Interpolating with droplines and labels

  3. What’s the difference between your interpolation formula and FORECAST. Using FORECAST, I get slightly different numbers, but I’m not sure why.

    =FORECAST(A2,OFFSET(INDEX(B6:B18,MATCH(A2,A6:A18)),0,0,2,1),OFFSET(INDEX(A6:A18,MATCH(A2,A6:A18)),0,0,2,1))

  4. Dick –

    I copied your formula exactly and pasted it into my sheet, then tried a bunch of X values. In all cases, the Y values were identical, whether I used your formula or mine. Of course, this is in Excel 2003, back when Excel was taught math.

  5. Is it possible to pull the equation from Trendlines out and use it to extrapolate? So if the best r squared is a polynomial function we could use the trendline equation to extrapolate instead of doing the linear assumption between two points?

    Just wondering…

  6. Oh, that makes me feel better. I hand typed your values and I’ll be there was some precision that was hidden.

  7. Dick – Guess you’re not a very accurate typist.

  8. Eric –

    You can use LINEST to compute the coefficients in the worksheet. See Bernard Liengme’s tutorial: Polynomial regression: How can I fit my X, Y data to a polynomial using LINEST?.

    That said, you shouldn’t just use R² blindly to decide which fit is best. Beyond order 2, a polynomial fit may have a great R², but it also might deviate wildly from the smooth curve you might draw by hand. I’ve written about these issues in Choosing a Trendline Type, Trendline Fitting Errors, and Polynomial Fit vs. Statistical Process Control.

  9. Well, yes, of course. I guess my point was badly made and my real question badly asked.

    If the data turns out to be just badly sampled, say it was some high frequency sine wave and you happen to sample at the right points, your data could look like a flat line would be the best fit line. So perhaps a linear estimation of the point you want to estimate, given the poorly sampled data, would not be the best way of going at it. 99 times out of a 100, though, it probably will.

    The nice thing about trendlines is you can see how good your fit is to the data fairly quickly. We could use all sorts of ways to more accurately figure out the function that the data actually seems to conform to, but Trendlines seems to do this quickly if not precisely.

    Is there some Trendline object that we can get at to get that nice little formula that you can have put onto your chart when you use Trendlines? I since have played with it and found that there is… ActiveChart.SeriesCollection(1).Trendlines(1).DataLabel.Text is what I was looking for. As it is text, you’d have to parse it and so it might be easier to do the other methods you mentioned, but it might be doable.

    Anyways, that’s what I was badly referring to.

  10. why not use trend?
    I use it excessively. use index(match()):index(match()+1) for the X and Y range and you’re done.

    the error bars are a new one – I usually set up a start and end point, but this can be a bother when your axis aren’t at zero.

  11. after posting the trend I wondered:

    I’ve spend many hours trying to find a nice, concise way to do quadratic or cubic interpolation.
    Of course it’s possible to use linest() to find the formula variables… but is there another way?

    btw, another nice feature of the trend function is that is allowas you to do multiple variable linear interpolation.
    When you have a graph with multiple lines for different conditions (say your flow/guage ratio is dependent on temperature. You can find the four points bracketing your search condition using the match() function, and then input a 2×2 matrix as you input X, and a 1×2 as your target X and watch the magic happen!

  12. Using minus error bars at 100% to show the intercept with the axis? That’s awesome, never thought of that!

  13. Ikkeman – Below are some Excel formulas for various types of fit (see Wikipedia interpolation article for more details). These return Y for a given X value, to find X given Y we can just swap the A’s and B’s in the formulas.

    As Jon has said before, despite their popularity polynomial trends are a poor choice for fitting many types of data. The other methods are quite close for the given data as the direction of the curve is not changing much around the estimated points.

    Interp Trend Linear Cubic Spline
    9.789877 0.53174 0.53000 0.53074 0.53025
    0.400368 0.34602 0.35000 0.35336 0.35096

    Trend – Cubic polynomial (equivalent to polynomial trend line order 3):
    =TREND(B6:B18,A6:A18^{1,2,3},A2^{1,2,3})

    Linear – For increasing data (or for decreasing put 1- after first comma):
    =PERCENTILE(B6:B18,PERCENTRANK(A6:A18,A2,20))

    Legendre – A cubic fit based on the surrounding 4 points:*
    =TREND(OFFSET(B6,MATCH(A2,A6:A18)-2,,4),OFFSET(A6,MATCH(A2,A6:A18)-2,,4)^{1,2,3},A2^{1,2,3})

    Spline – Catmull-Rom (equivalent to “smooth line” option):*
    =SUM((1+1/IRR(MMULT({0,0,2,0;0,1,0,-1;-1,4,-5,2;1,-3,3,-1},OFFSET(A6,MATCH(A2,A6:A18)-2,,4)-A2
    )))^-{0;1;2;3}*MMULT({0,2,0,0;-1,0,1,0;2,-5,4,-1;-1,3,-3,1},OFFSET(B6,MATCH(A2,A6:A18)-2,,4)))/2

    * for estimation in end intervals you can use extrapolated points at each end:
    A5:B5 = {-0.10022,0.29} A19:B19 = {21.82408,0.616791}

  14. Ihem – thanks for sharing that compact matrix formula for one type of cubic spline. That may be the most impressive non-array formula that I’ve ever seen in Excel. I’d never thought of using the Internal Rate of Return function for fitting non-financial data.

    If anyone ever wants a smoother interpolation than piecewise linear, cubic splines are usually the way to go. Fortunately there are good free add-ins that make cubic splines easy to calculate in Excel.

  15. DaleW – Happy to share, took a while to figure out! It’s a little known fact that you can convert nearly any array formula to a non-array formula with MMULT, eg
    Array =SUM(IF(A1:A5>2,B1:B5)) Non-Array =SUM(IF(MMULT(N(A1:A5>2),1),B1:B5))

    The straight/smooth chart line options have the property that they maintain the same shape if the configuration of points is rotated. This can be seen in the formulas since X on Y followed by Y on X returns the original result with the linear/spline options, but not for trend or Lagrange polynomials (not Legendre, always get those guys mixed up!) VBA code for chart data is here

    Splines have the intuitive appeal that they model the shape taken up by a smooth thin flexible strip passing through the set of points. In the linear case, the strip is pulled taught whereas in the cubic case there is no tension at the endpoints.

  16. Ihem – Interesting, I’d never seen MMULT() used to replace an array formula. In real life, we’d both use =SUMIF() for your example to have error checking, as otherwise a string entry in column A gives what we’d typically consider a bug. MMULT does duplicate even such … features.

    Natural cubic splines or Catmull-Rom cubic splines for exactly 4 points would follow your zero tension at the endpoints analogy. Other interpolating cubic splines have tension. Cardinal splines use piecewise cubic polynomials with a tension parameter ranging from 0 (for Catmull-Rom) to 1 (where they collapse to piecewise linear interpolation). Thus Cardinal splines can replicate straight/smooth lines, or anywhere in between.

  17. DaleW – You make excellent points. For the MMULT method, a better example would be to apply to: =MIN(IF(A1:A5=”a”,B1:B5)). I don’t know of another way to make this type of formula non-array entry, even using INDEX(A1:A5,0) fails.

  18. I find array formulae should be avoided ilke the plague if there’s even a remote possibility anyone else will be using the spreadsheet because you’ll end up tech-supporting it when they hit enter on a cell. Either redesign with extra columns if possible and if not then write a user defined function in one of the modules.

  19. what about if you have columns and rows. for example 6,8,10( in the column) and 8,10,12 (row). and under those is an array of set of values. like if i want to canculate

    8 and 10.4

  20. Luu –

    If the two values are between known values (say, 8.4 and 10.4), you have to interpolate first to get values at 8.4 for 10 and for 12 (two interpolations), then interpolate again between these results to get a value at 10.4.

  21. Hi,

    I’m no mathematician but i have tried my best to follow your logic………..

    I was wondering if there was a way to Interpolate a point on a line between 2 known points. Not knowing the X or Y value? Example of data:

    X Y
    1 6.3 2.12
    2 7.6 2.19
    3 8.6 2.19
    4 9.6 2.16

    On a Scatter with Smooth Lines & Markers graph the peak of the line will be between point 2 & 3. Is there a way to Interpolate that point? (manually graphing it would give me an interpolation point of X=8.0 & Y=2.20) (X rounded to 0.5)

    I appreciate your assistance in this matter!

    Daniel

  22. Daniel –

    You want to find where the smoothed drawn line has its maximum? I usually advise people not to use smoothed lines. They give a false indication of where data may go. They are not based on any physical model of the data, but only on artistically connecting all points in a pretty way, using curve-drawing parameters that “looked nice” to the same kind of designer that gave us Excel’s early color palette. Smoothed lines ignore any measurement error in the data which is actually present, they ignore all points in the data set other than the two being connected and the point on either side, and if markers are not displayed the viewer has no idea where the actual data lies.

    That said, for symmetric points, the usual approach is to fit a quadratic curve (2d order polynomial) to the points, and calculate a maximum using the fitted coefficients. A quadratic fit is probably reasonable, since physical processes often can be modeled by linear and quadratic relationships.

    Your points are not symmetric, so the fitted parabola does not pass through all the points; instead it lies in such a way as to minimize the total deviation of the parabola from the data points. You could fit a third order polynomial to your data and calculate your maximum, but this starts to get into the realm of overfitting.

  23. Daniel,

    Realistically, close to X=8 is your best guess for the maximum.

    If you had really accurate measurements, an approach known as cubic splines would be great for fitting a smooth curve exactly to your data. In fact, the smooth lines that Excel charts draw are one type of cubic spline.

    With four points, Excel can fit a cubic (3rd order) polynomial smoothly and exactly through the points. If you play with that equation, you’ll see it predicts the maximum at just over 8.08, and some people would consider that the best answer. (It’s really close to the natural cubic spline solution, also.)

    Jon’s right that anything past a linear fit without a physical model or lots of supporting data is risky or wishful thinking. But if we are convinced that the data is highly accurate and don’t have a model, cubic splines are commonly used to create the smooth interpolating curve. In Excel, if we have two nearest neighbor points on each side of the region of interest, a cubic polynomial tends to be a very close approximation to the cubic spline fit there.

  24. Hi Jon

    I have you example above to solve for X using data below and it does not work. I followed your example and the subtituted with my data and it gives me an error message on A2. Any idea why?

    X Y
    0.12 3
    0.15 2.67
    0.18 2.45
    0.22 2.24
    0.28 2.02
    0.3 1.94
    0.35 1.82
    0.38 1.74
    0.46 1.6
    0.54 1.48
    0.6 1.42
    0.68 1.34
    0.78 1.26
    0.9 1.19
    1.05 1.12
    1.2 1.06
    1.4 1

  25. Tshepo –

    Did you change the formulas to include your whole data range (mine stopped at row 18, yours continues to row 22)?

    Cell A2 is where you enter an X value for which you want to calculate a Y value. It should not contain an error. What X value were you entering?

    Was the error in cell A3? Was it because the requested X value was outside of the range of your X data?

    A more effective approach for your data might be a power law fit: such a fit with your data has an R² value of 0.9991.
    Power Law Fitted Data

  26. Tshepo Phutha says:

    Hi Jon
    I did expand data to row 22 and I am trying to solve X not Y. I get an error on B3 and A2 but when I use data from your example it work fine.

  27. Tshepo –

    So:
    B2 has the known Y
    B3 =MATCH(B2,B6:B22)
    A2 =INDEX(A6:A22,B3) + (B2-INDEX(B6:B22,B3)) * (INDEX(A6:A22,B3+1)-INDEX(A6:A22,B3)) / (INDEX(B6:B22,B3+1)-INDEX(B6:B22,B3))

    But my formula assumed the range being matched was monotonically increasing, which I never stated. If I include the default value of the optional third parameter, the formula is
    =MATCH(B2,B6:B22,1)
    which tells Excel the data is increasing and to return the index the closest value greater than the sought value. Your formula needs to be
    =MATCH(B2,B6:B22,-1)
    which tells Excel the data is decreasing and to return the index the closest value less than the sought value.

    I still suggest using a power law fit.

  28. Tshepo Phutha says:

    Jon

    Thank you very much for help. I am working on learning on how power fit works.

    Thank you very much

  29. Hi Jon,

    I’m also finding a very strange error using the spline interpolation formula. I am interpolating the following set of data:
    x y
    1.798179868 165.3955251
    1.804183302 165.0206621
    2.230427067 139.8702098
    2.428540367 113.9704961
    2.5966365 65.08009679
    2.752725766 21.02938029
    2.842777266 25.76285129
    2.932828766 20.8522448
    3.022880265 47.45262432
    3.148952365 30.61012109
    3.275024465 26.35386096
    3.443120598 15.5813369
    3.869364363 -31.93964513
    4.031457063 -44.2596589
    4.133515429 -59.06686391
    4.235573796 -77.81866744
    4.313618429 -109.5812402
    4.391663062 -124.4431029
    4.469707695 -185.7945248
    4.547752328 -209.4166568
    4.601783228 -224.5765144
    4.667820995 -267.6385983
    4.733858761 -341.6070167
    4.787889661 -402.0504547
    4.841920561 -424.5273038
    4.895951461 -484.5497172
    4.949982361 -513.475427
    5.376226126 -645.3444851
    5.394236426 -648.4536159
    5.40023986 -650.1201883

    When I interpolate, I find that the following x values generate a #NUM! error:
    4.901954894
    4.901954894
    4.901954894
    All the other values I am trying (less than and/or greater than these, and one in between the first two) seem to interpolate fine.

    The x values at which I am trying to interpolate are part of series of data. I am doing a calculation for some y values, and then interpolating the remaining x values. When I make small changes to the x value range, or use a different number of calculated points, there is often no problem at all. Like the example above, the y values follow a generally decreasing trend, but this is not always the case and y may go up and down. The x values are always monotonically increasing.

    Let me know if you want any additional info. Thanks for any help you can provide in diagnosing this. It is a very vexing problem!

    -Charlie

  30. I forgot to mention above that, when I switched to the “Legendre” interpolation formula, THE EXACT SAME POINTS produced the same #NUM! error.

    Maybe this is a hint about the source of the problem…

    -Charlie

  31. Ooops, a correction… the “Legendre” formula seems to work just fine. I discovered an error that I had made when entering it. The Spline still does not work, however.

    I have created a spreadsheet with the data mentioned above, and a correction to the “problem” data points to help diagnose the problem. It contains the SPLINE and LEGENDRE interpolation formulas, and shows that the SPLINE fails when one of the problem x values is entered.

    You can download it at:
    http://claub.net/temp/Spline_Error_Test.xls

    Thanks for any help!!!!

    -Charlie

  32. Hi CharlieL,

    Alas, this particular formula for this particular cubic spline indeed fails for part of your dataset.

    That segment of your Catmull-Rom spline happens not to be monotonic for predicting Y from X. This causes the IRR() formula to fail rather interestingly in at least 5 subintervals of that segment’s range in X, for X values that are monovalent instead of trivalent!?

    While I was able to calculate and chart the real Catmull-Rom spline directly from its defnition as a t-function-weighted sum of the two nearest points in each direction in X, you possibly don’t want a spline that behaves this way to fit your dataset. Perhaps a natural cubic spline or a monotonically constrained cubic hermite spline would be a better choice? The Newton Excel Bach blog has done a wonderful job setting up VBA code to let Excel users explore a variety of different cubic splines.

  33. PS — Far worse than not being monotonic, it’s not monovalent. This type of spline can reverse direction in a way that gives multiple Y values for the same X value, and the IRR formula doesn’t work in some subsections of segments with that behavior. Wrong spline if you need a unique Y=f(X) fit.

  34. Hi DaleW,

    Thanks for posting an explanation of the problem. I do need a worksheet-formula-only way to do the interpolation (e.g. no VBA or macros), and I would like a smooth curve a la spline or similar. For now I have switched to the Legendre formula, and that seems to work OK for the type of data I am generating. I am still learning about how to implement matrix multiplication via the MMULT function, and I am not up to coming up with something new myself at this point, although I would like to learn how to do it. Also, it’s not clear to me whether the problem that I described in my post was due to a limitation of the Catmull-Rom spline itself, or the implementation via the formula. Can you please elaborate on this?

    I need the interpolated line to go through the points in the data set. You mentioned that this is the “wrong” spline – how about a B-Spline? For instance, the Uniform Cubic B-Spline described here:
    http://en.wikipedia.org/wiki/B-spline#Uniform_cubic_B-splines
    has an explicit matrix form that seems to lend itself well to this approach (from what I can tell). Can you (or someone else reading this post) help me implement it using IIR and MMULT?

    Thanks!

    -Charlie

  35. Charlie,

    B-Splines don’t go through their control points, which is inconvenient when you want to interpolate actual data.

    Catmull-Rom splines do interpolate their control points, but they aren’t constrained by the concept that there should be one and only one Y value for each X value. I believe some IRR() problems have more than one rate of return that solves the constraints, but a “well-defined” mathematical function can only return one value for a given value of X. Thus IRR() fails to capture parts of some Catmull-Rom spline segments, which a weighted sum of the control points in terms of a t parameter going from 0 to 1 in each segment does reveal.

    Assuming you’d like a “well-defined” and smooth continuous interpolating function for Y on X, I would recommend the classic solution of natural cubic splines. Usually this is implemented in programming languages, and I’ve got more than one VBA solution squirreled away. I don’t imagine that solving the sparse symmetric matrix equations for under 50 points would be that hard to program using the cells of a spreadsheet, if one were so inclined.

    The Legendre formula is the cubic polynomial solution, or the simplest natural cubic spline case. The Legendre formula will not give you continuous derivatives if you use it for more than 4 total points. In one mathematical sense, natural cubic splines offer the smoothest possible interpolation. Natural cubic splines aren’t likely to be used for graphics because it isn’t a local solution like Legendre or Catmull-Rom, and every point instead of just 4 local points can have some impact every segment of the spline, making it more computationally intensive.

  36. Hi DaleW,

    Now that I checked into it further, you are correct about B-splines not passing though each data point. A cubic Spline would be the best but my data set can be up to 600 points, and doing the large matrix for that would be too much overhead. The Lagrange interpolation seems to be “good enough” for me, despite the occasional cusp in the interpolation where there is a derivative discontinuity.

    -Charlie

  37. Charlie,

    Good enough!

    Incidentally, your dataset can be used to show that Excel must modify the Catmull-Rom spline algorithm to draw its smoothed lines, because Excel smoothed lines usually overlap that spline yet avoid the pathology that you found. As Answers.Microsoft notes, “An adjustment to the basic method is applied if consecutive points are a factor of three or more apart so as to reduce the amount by which the curve overshoots points.” I interpret this as a sort of false position method, where any outer control point that further that 3x the distance between the inner control points is pulled closer. Thus on your troublesome (X,Y) interval between (4.895951461, -484.5497172) and (4.949982361, -513.475427), the next point is treated as if it were closer at (5.2305,-600.2523) for the purposes of the Catmull-Rom algorithm on that segment only, which removes the strange behavior and restores single-valued functionality and lets the IRR() formula work, as well as duplicates the Excel smoothed line exactly on my plot. This might be more trouble than it is worth for you to duplicate Excel’s native interpolating spline — but it would be a lot easier than solving the tridiagonal matrix equation for natural cubic splines. You’d just need to create a table of the 4 XY control points for each of N-3 inner segments where you say your full table is N=600, and the Catmull-Rom OFFSET logic would have to point to that table. This would provide C1 continuity (of 1st derivatives), but not the C2 continuity (of 2nd derivatives also) that is provided by natural cubic splines. Your Legendre solution only has C0 continuity (of the function itself) at the data points (but C2 continuity elsewhere like all cubic splines within each segment, not necessarily between segments).

    BTW, are you sure you want spline interpolation instead of LOESS smoothing?

  38. Dale –

    Brian Murphy of UT Austin worked out bezier algorithms that seem to have been used for smoothing series lines in Excel charts. I think the details can be found at xlrotor.com.

  39. Jon – yes, I saw that. I believe that the Bezier algorithms may be very similar, but MS Excel actually uses modified Catmull-Rom splines, per the July 4, 2011 post by Answers.Microsoft in response to Tushar Mehta. The VBA code is there, but I think that I did duplicate quickly the Excel charting smooth curve for this troublesome case by using a false position for a far outer control point as I tried to describe using both Catmull-Rom formulas (the quick IRR() one discussed on your blog, and the formal definition in terms of weighted sums of the 4 basis points using t-parameter polynomials).

  40. Charlie (and Jon),

    Those Bezier algorithms match Excel’s smoothing to within a few pixels for the troublesome segment. They are in VBA and parametric, so it might take quite a bit of work to implement them to evaluate a particular X value without VBA.

    Without using VBA, I picked up on your idea and implemented Ihem’s IRR() and MMULT() formula for Catmull Rom with more or less the modification suggested by Lori Miller for highly unevenly spaced data, and it looks good to within a pixel or two compared to Excel’s native curve fitting, even for your segment where simple Catmull Rom misbehaved. The key formula to find the adjusted spline Y for any X had to be switched to an array formula to work for me:

    {=SUM((1+1/IRR(MMULT({0,0,2,0;0,1,0,-1;-1,4,-5,2;1,-3,3,-1},(TRANSPOSE(OFFSET(xQuad,$Q5-1,0,1,4))-$O5))))^-{0;1;2;3}*MMULT({0,2,0,0;-1,0,1,0;2,-5,4,-1;-1,3,-3,1},TRANSPOSE(OFFSET(yQuad,$Q5-1,0,1,4))))/2}

    The transpose was necessary with matrix logic to accommodate a table (starting at xQuad and then yQuad) of possibly linear-interpolation adjusted outer influence points created by IF logic, with 4 XY points defined in 8 columns shortly after each inner interval as defined by the original XY data columns in A & B. None of this extra effort is needed if the X data is roughly evenly spaced, but your data wasn’t. The Q column holds the Index from Match(), again with slight modification for the endpoints. The O column held the arbitrary X points to evaluate.

  41. CharleyL – Interesting example and glad the suggested formulas have been useful to you. Strangely, in my Excel 2010 version the spline formula does actually return values for the cases highlighted but further investigation reveals this is due to improvements in the IRR implementation in this version. I find the formula matches the curve for the most part but there are a few spurious values where it is picking up another root. An alternative is to use the more natural version of the formula with 1/IRR(MMULT({0,0,2,0;0,1,0,-1;-1,4,-5,2;1,-3,3,-1} replaced by IRR(MMULT({0,2,0,0;-1,0,1,0;2,-5,4,-1;-1,3,-3,1} which does return the correct values for your example. I seem to remember the point of using 1/IRR of the reversed matrix was related to avoiding 0^0 errors when the formula was evaluated exactly at a data point but these can be easily dealt with by adding a negligible constant like 1E-99. Note also the formula is for a 2D spline but could also be modified to a 1D version which would ensure single-valued results. In many cases where data is fairly evenly spaced there won’t be a large difference between these two methods.

    DaleW – Nice analysis. The modification that the charting algorithm makes can be seen in various ways. Suppose there are four consecutive points A,B,C,D and you want to construct the curve segment between B and C. In terms of Hermite splines the tension parameter implemented by the chart algorithm is chosen to be t=max(3,|AC|/|BC| ,|BD|/|BC|)/6 (based on this definition of tension and where |.| denotes distance between points as measured on the chart ). In terms of Bezier curves this condition enforces that the control points are never more than a distance |BC|/2 from B or C so do not cross each other. For most normal cases where points are fairly evenly spaced this reduces to the Catmull-Rom case (t=1/2) . Doug Jenkins has many more useful posts and VBA functions on these topics which are well worth looking at as you say – there are a few formulas and comments related to the connection with natural splines are here.

    Jon – The xlrotor code is normally fairly close but can deviate quite substantially in some cases. For example entering these (x,y) values: (9,8) (10,12) (11,12) (14,6) in the Bezier example file link gives clearly incorrect values for 10 < x < 11. The vba code posted at answers.microsoft may be the first accurate algorithm available publicly.

  42. Hi Jon Peltier et al.-
    I’m not a math expert. But after trying to fit your equations for the last 2 hrs, finally I gave up! And really don’t understand why the interpolation is not working for my database (excel file link below)

    http://www.mediafire.com/view/?hdpdpu641qdpxak

    I would be very glad if I find any solution. Thanks- Rajib

  43. There was no data in the posted worksheet, so I couldn’t interpolate values either.

  44. Sorry Jon Peltier, There was something wrong. Please find the data in excel as well as in a tab delimited .txt file below:

    http://www.mediafire.com/view/?1bo9he3c3jz05u7
    http://www.mediafire.com/view/?tmyuta23ajjj27i

    Best/
    Rajib

  45. The simple linear interpolation methodology worked for me. I used the equations to linearly interpolate y values for a column of x values.
    Many Thanks

  46. I’ve been trying to get this to work with a very small set of yearly data so:
    X = year in the format 2001
    Y = a price
    I have this data for 12 years
    i want to estimate a monthly value in any of those twelve years.
    However, i’m struggling to do this with your spreadsheet.
    For example i tried 2001.083 which could predicate a month
    however i get a ref error in A2 and i dont think this is the best way of doing it.

    Ultimately i want to try and estimate each monthly figure based on a set of yearly final figure. So the value of months 1 through 12 should equal the current yearly figure.
    Hope this makes sens.e

  47. Cell A2 should have no error, because it’s the cell you enter the X value for which you want to determine a Y value. So with 2001.083 in A2 you should get a value that is 1/12 of the way from the 2001 value to the 2002 value in cell B2.

    Also, this sentence doesn’t make sense if you’re trying to estimate monthly values:
    “So the value of months 1 through 12 should equal the current yearly figure.”

  48. Pl indicate Excel program (MAC version) to use the Piece-wise Hermite Interpolation.

  49. Ravi –
    Try Google.

  50. Clear, concise, and exactly what I needed. Thanks a lot – HJ

  51. hi
    is this formula in excel uses for quadratic spline or another interpolation?

  52. Satria –
    The interpolation formulas I introduced in the post are simple linear interpolations.
    In the discussion, several commenters cited other kinds of fit, including splines.

  53. Please any one can do it for me or tell me how to do this
    regards
    maak_88@yahoo.com

    y x X Values to be interpolated

    1 99.72183534 99.56709957
    2 99.4428926 99.13793103
    3 99.1631652 98.71244635
    4 98.88264649 98.29059829
    5 98.60132972 97.87234043
    6 98.31920803 97.45762712
    7 98.03627447 97.0464135
    8 97.75252199 96.63865546
    9 97.46794345 96.23430962
    10 97.18253158 95.83333333
    11 96.89627902 95.43568465
    12 96.60917831 95.04132231
    13 96.32122185 94.65020576
    14 96.03240194 94.26229508
    15 95.74271078 93.87755102
    16 95.45214042 93.49593496
    17 95.16068282 93.11740891
    18 94.86832981 92.74193548
    19 94.57507306 92.36947791
    20 94.28090416 92
    21 93.98581453 91.63346614
    22 93.68979548 91.26984127
    23 93.39283817 90.90909091
    24 93.09493363 90.5511811
    25 92.79607271 90.19607843
    26 92.49624617 89.84375
    27 92.19544457 89.49416342
    28 91.89365835 89.14728682
    29 91.59087776 88.8030888
    30 91.28709292 88.46153846
    31 90.98229376 88.12260536
    32 90.67647006 87.78625954
    33 90.36961141 87.45247148
    34 90.06170724 87.12121212
    35 89.75274679 86.79245283
    36 89.4427191 86.46616541
    37 89.13161305 86.1423221
    38 88.8194173 85.82089552
    39 88.50612032 85.50185874
    40 88.19171037 85.18518519
    41 87.87617551 84.87084871
    42 87.55950358 84.55882353
    43 87.24168219 84.24908425
    44 86.92269874 83.94160584
    45 86.60254038 83.63636364
    46 86.28119404 83.33333333
    47 85.95864639 83.03249097
    48 85.63488386 82.73381295
    49 85.30989261 82.43727599
    50 84.98365856 82.14285714
    51 84.65616733 81.85053381
    52 84.32740427 81.56028369
    53 83.99735446 81.27208481
    54 83.66600265 80.98591549
    55 83.33333333 80.70175439
    56 82.99933065 80.41958042
    57 82.66397845 80.13937282
    58 82.32726023 79.86111111
    59 81.98915917 79.58477509
    60 81.64965809 79.31034483
    61 81.30873945 79.03780069
    62 80.96638534 78.76712329
    63 80.62257748 78.49829352
    64 80.27729719 78.23129252
    65 79.93052539 77.96610169
    66 79.58224258 77.7027027
    67 79.23242883 77.44107744
    68 78.88106377 77.18120805
    69 78.5281266 76.92307692
    70 78.173596 76.66666667
    71 77.8174502 76.41196013
    72 77.45966692 76.1589404
    73 77.10022337 75.90759076
    74 76.73909622 75.65789474
    75 76.37626158 75.40983607
    76 76.01169501 75.16339869
    77 75.64537145 74.91856678
    78 75.27726527 74.67532468
    79 74.90735018 74.43365696
    80 74.53559925 74.19354839
    81 74.16198487 73.95498392
    82 73.78647874 73.71794872
    83 73.40905182 73.48242812
    84 73.02967433 73.24840764
    85 72.64831573 73.01587302
    86 72.26494463 72.78481013
    87 71.87952884 72.55520505
    88 71.4920353 72.32704403
    89 71.10243003 72.10031348
    90 70.71067812 71.875
    91 70.3167437 71.65109034
    92 69.92058988 71.42857143
    93 69.52217872 71.20743034
    94 69.12147118 70.98765432
    95 68.71842709 70.76923077
    96 68.31300511 70.55214724
    97 67.90516262 70.33639144
    98 67.49485577 70.12195122
    99 67.08203932 69.90881459
    100 66.66666667 69.6969697
    101 66.24868971 69.48640483
    102 65.82805886 69.27710843
    103 65.4047229 69.06906907
    104 64.97862897 68.86227545
    105 64.54972244 68.65671642
    106 64.11794687 68.45238095
    107 63.68324392 68.24925816
    108 63.2455532 68.04733728
    109 62.80481227 67.84660767
    110 62.36095645 67.64705882
    111 61.91391874 67.44868035
    112 61.46362972 67.25146199
    113 61.01001739 67.05539359
    114 60.55300708 66.86046512
    115 60.09252126 66.66666667
    116 59.6284794 66.47398844
    117 59.16079783 66.28242075
    118 58.68938954 66.09195402
    119 58.21416399 65.9025788
    120 57.73502692 65.71428571
    121 57.25188012 65.52706553
    122 56.76462122 65.34090909
    123 56.27314339 65.15580737
    124 55.7773351 64.97175141
    125 55.27707984 64.78873239
    126 54.77225575 64.60674157
    127 54.26273532 64.42577031
    128 53.74838499 64.24581006
    129 53.22906474 64.06685237
    130 52.70462767 63.88888889
    131 52.17491947 63.71191136
    132 51.63977795 63.5359116
    133 51.09903239 63.36088154
    134 50.55250296 63.18681319
    135 50 63.01369863
    136 49.44132325 62.84153005
    137 48.876261 62.67029973
    138 48.30458915 62.5
    139 47.72607021 62.33062331
    140 47.14045208 62.16216216
    141 46.54746681 61.99460916
    142 45.94682917 61.82795699
    143 45.33823503 61.66219839
    144 44.72135955 61.4973262
    145 44.09585518 61.33333333
    146 43.46134937 61.17021277
    147 42.81744193 61.00795756
    148 42.16370214 60.84656085
    149 41.49966533 60.68601583
    150 40.82482905 60.52631579
    151 40.1386486 60.36745407
    152 39.44053189 60.20942408
    153 38.72983346 60.05221932
    154 38.0058475 59.89583333
    155 37.26779962 59.74025974
    156 36.51483717 59.58549223
    157 35.74601765 59.43152455
    158 34.96029494 59.27835052
    159 34.15650255 59.12596401
    160 33.33333333 58.97435897
    161 32.48931448 58.82352941
    162 31.6227766 58.67346939
    163 30.73181486 58.52417303
    164 29.8142397 58.37563452
    165 28.86751346 58.2278481
    166 27.88866755 58.08080808
    167 26.87419249 57.93450882
    168 25.81988897 57.78894472
    169 24.72066162 57.64411028
    170 23.57022604 57.5
    171 22.36067977 57.35660848
    172 21.08185107 57.21393035
    173 19.72026594 57.0719603
    174 18.25741858 56.93069307
    175 16.66666667 56.79012346
    176 14.90711985 56.65024631
    177 12.90994449 56.51105651
    178 10.54092553 56.37254902
    179 7.453559925 56.23471883
    180 0 56.09756098
    55.96107056
    55.82524272
    55.69007264
    55.55555556
    55.42168675
    55.28846154
    55.1558753
    55.02392344
    54.89260143
    54.76190476
    54.63182898
    54.50236967
    54.37352246
    54.24528302
    54.11764706
    53.99061033
    53.86416862
    53.73831776
    53.61305361
    53.48837209
    53.36426914
    53.24074074
    53.11778291
    52.99539171
    52.87356322
    52.75229358
    52.63157895
    52.51141553
    52.39179954
    52.27272727
    52.15419501
    52.0361991
    51.91873589
    51.8018018
    51.68539326
    51.56950673
    51.4541387
    51.33928571
    51.22494432
    51.11111111
    50.99778271
    50.88495575
    50.77262693
    50.66079295
    50.54945055
    50.43859649
    50.32822757
    50.21834061
    50.10893246
    50
    49.89154013
    49.78354978
    49.67602592
    49.56896552
    49.46236559
    49.35622318
    49.25053533
    49.14529915
    49.04051173
    48.93617021
    48.83227176
    48.72881356
    48.62579281
    48.52320675
    48.42105263
    48.31932773
    48.21802935
    48.11715481
    48.01670146
    47.91666667
    47.81704782
    47.71784232
    47.61904762
    47.52066116
    47.42268041
    47.32510288
    47.22792608
    47.13114754
    47.03476483
    46.93877551
    46.84317719
    46.74796748
    46.65314402
    46.55870445
    46.46464646
    46.37096774
    46.277666
    46.18473896
    46.09218437
    46

  54. Maak –

    I presume you want a new Y value for each X value in your second list.

    I put the pairs of X and Y into X-Y order, not Y-X order (columns A:B, see screenshot below).

    I put the new X values into another column E.

    I used column D for an index, which points to the value in the column of input X values closest to but greater than the new X value. The formula in D2 is as follows, and copied down as far as there are values in column E:

    =MATCH(E2,$A$2:$A$181,-1)

    The third argument, -1, is needed because the input X values are in decreasing order.

    I used column F for the interpolated Y values. The formula in F2 is as follows, copied down as far as needed:

    =(E2-INDEX($A$2:$A$181,D2))/(INDEX($A$2:$A$181,D2+1)-INDEX($A$2:$A$181,D2))*(INDEX($B$2:$B$181,D2+1)-INDEX($B$2:$B$181,D2))+INDEX($B$2:$B$181,D2)

    It’s basically the same formula as in cell B2 of the example.

  55. Hi,
    I see that the graph never fall back (it ascends). But what if when the curve falls back (quadratic – so that each y value will have two x values). How can we interpolate for x for a given y value? Given: y=0.61357555
    The correct answer is x=867.5618709 with this formula: =A20+((L3-B20)/(B21-B20))*(A21-A20). But I don’t need to manually search for the lower and upper bounds of the given value in y and corresponding x. I need the first (smaller x value).

    Thanks,

    Here is the data.

    y x
    37.18656
    0.255633 103.88304
    0.250945 188.28446
    0.250528 262.19693
    0.259958 375.86513
    0.327046 492.14553
    0.429768 630.57233
    0.584134 782.98033
    0.645869 960.33653
    0.640969 1137.91573
    0.610922 1296.88933
    0.588545 1448.95363
    0.598201 1595.24113
    0.586454 1767.37863
    0.579821 1920.10453
    0.513784 2063.63463
    0.509153 2206.45073
    0.515956 2366.94803
    0.470963 2507.86673
    0.440226 2668.59793
    0.415335 2784.40273
    0.357285 2873.33485
    0.338649 2966.22146
    0.331126 3032.07235
    0.331747 3082.98812
    0.299145 3146.48242
    0.291289 3222.21814
    0.287335 3268.67082

  56. 37.18656 is under X while the corresponding Y value is missing.

  57. Hi can you help me in this problem, I need to determine the values between minimun x =1 and the maximum x = -1,1 y = 46,99 and maximum y =139…my result it is a linear chart and should be a curve…I tried to determine the values like a progression but, it is clear it ´snt the way…

    x_Coord1 = 1 and x_Coord2 = -1,1

    y_Coord1 = 46,99 and y_Coord2= 139

    x_Coord1= 2 and x_Coord2= 9,8

    y_Coord1 = 44,23 and y_Coord2= 218
    x_Coord1 3 and x_Coord2 10,4
    x_Coord1 4 and x_Coord2 9,2
    x_Coord1 5 and x_Coord2 7,5
    x_Coord1 6 and x_Coord2 8,8
    x_Coord1 7 and x_Coord2 8
    y_Coord1 33,96 and y_Coord2 174
    y_Coord1 30,36 and y_Coord2 192
    y_Coord1 27,69 and y_Coord2 184
    y_Coord1 23,84 and y_Coord2 177
    y_Coord1 18,86 and y_Coord2 165

  58. I found a solution in the comments, thanks

Subscribe without commenting

Trackbacks

  1. […] recent post at Jon Peltier’s Blog looks at an “on-sheet” method of performing linear interpolation on a set of tabular […]

  2. […] More information about piecewise linear interpolation can be found here  and a solution using Excel formulas is available here. […]

Write a Comment

I welcome comments from my readers. If you have an opinion on this post, if you have a question, or if there is anything to add, I want to hear from you. Whether you agree or disagree, please join the discussion. If you want to include an image in your comment, post it on your own site or on one of the many free image sharing sites, and include a link in your comment. I'll download your image and insert the necessary html to display the image inline.

Subscribe without commenting

Peltier Tech Chart Utility

 

Create Excel dashboards quickly with Plug-N-Play reports.