In Getting value on Y axis by putting X axis value on the Mr Excel forum, someone wanted to know how to find in-between values of a function, given some known data points. The approach, of course, is to interpolate values given the known points on either side of the value you need.

Interpolation requires some simple algebra. The diagram below shows two points (blue diamonds connected by a blue line) with coordinates (X1, Y1) and (X2, Y2). We need to find the value of Y corresponding to a given X, represented by the red square at (X, Y).

The smaller triangle with hypotenuse (X1, Y1)-(X, Y) is “similar” to the larger triangle with hypotenuse (X1, Y1)-(X2, Y2), so the sides of the triangles are proportionally sized, leading to the first equation below the sketch. We rearrange this to solve for Y, in the second equation.

We’ll set up our interpolation in the example below. Our data is in A5:B18, and the known values are plotted as blue diamonds connected by blue lines in the chart.

The analysis has two parts: first we need to determine which pair of points to interpolate between, second we need to do the interpolation. We will judge the validity of our interpolation by plotting the calculated point on the same chart.

## Solving for Y

I’ve put the calculations above the data table. The yellow shaded cell, A2, holds the known X value, and a formula in cell B2 holds the calculated Y value. Cell A3 indicates which pair of points to interpolate between. The formulas are:

`A3: =MATCH(A2,A6:A18)`

`B2: =INDEX(B6:B18,A3) + (A2-INDEX(A6:A18,A3)) * (INDEX(B6:B18,A3+1)-INDEX(B6:B18,A3)) / (INDEX(A6:A18,A3+1)-INDEX(A6:A18,A3))`

We want the Gauge value (Y) when the Flow value (X) equals 3, so this is entered into the yellow shaded cell A2. The formula in A3 tells us that our computed point is between the 7th and 8th data point, and the formula in B2 calculate Y=0.444, and the calculated point (A2, B2) is the red square that lies along the plotted data points. Looks good.

Te determine the Gauge value for a Flow value of 12, we enter this into A2. The red square moves along the blue line past the 11th point, where Gauge=0.548.

## Solving for X

With a minor rearrangement, we can instead solve for Flow, given a value for Gauge. The known Gauge value is entered into B2, shaded yellow. B3 indicates the pair of points to interpolate between, and A2 provides the value for Flow. The formulas are:

`B3: =MATCH(B2,B6:B18)`

`A2: =INDEX(A6:A18,B3) + (B2-INDEX(B6:B18,B3)) * (INDEX(A6:A18,B3+1)-INDEX(A6:A18,B3)) / (INDEX(B6:B18,B3+1)-INDEX(B6:B18,B3))`

For a Gauge (Y) of 0.53, we compute a Flow (X) of 9.790. The red square shows where the calculate point lies along our plotted data.

Changing the Gauge value to 0.35 moves the red square way to the left, to a Flow value of 0.400.

As they say on quiz shows, these things are only easy if you know the answer.

All that’s missing now is an extra series plotting the horizontal and vertical lines to the axis’s. If you then add the label to the y-intercept points and fix the formatting, you can have a nice box where the user enters a number and the graph shows/tells the y-value

Calvin –

All you need to do is put error bars, with percentage of 100 in negative X and Y directions, to reach the axes. Then add a data label.

What’s the difference between your interpolation formula and FORECAST. Using FORECAST, I get slightly different numbers, but I’m not sure why.

=FORECAST(A2,OFFSET(INDEX(B6:B18,MATCH(A2,A6:A18)),0,0,2,1),OFFSET(INDEX(A6:A18,MATCH(A2,A6:A18)),0,0,2,1))

Dick –

I copied your formula exactly and pasted it into my sheet, then tried a bunch of X values. In all cases, the Y values were identical, whether I used your formula or mine. Of course, this is in Excel 2003, back when Excel was taught math.

Is it possible to pull the equation from Trendlines out and use it to extrapolate? So if the best r squared is a polynomial function we could use the trendline equation to extrapolate instead of doing the linear assumption between two points?

Just wondering…

Oh, that makes me feel better. I hand typed your values and I’ll be there was some precision that was hidden.

Dick – Guess you’re not a very accurate typist.

Eric –

You can use LINEST to compute the coefficients in the worksheet. See Bernard Liengme’s tutorial: Polynomial regression: How can I fit my X, Y data to a polynomial using LINEST?.

That said, you shouldn’t just use R² blindly to decide which fit is best. Beyond order 2, a polynomial fit may have a great R², but it also might deviate wildly from the smooth curve you might draw by hand. I’ve written about these issues in Choosing a Trendline Type, Trendline Fitting Errors, and Polynomial Fit vs. Statistical Process Control.

Well, yes, of course. I guess my point was badly made and my real question badly asked.

If the data turns out to be just badly sampled, say it was some high frequency sine wave and you happen to sample at the right points, your data could look like a flat line would be the best fit line. So perhaps a linear estimation of the point you want to estimate, given the poorly sampled data, would not be the best way of going at it. 99 times out of a 100, though, it probably will.

The nice thing about trendlines is you can see how good your fit is to the data fairly quickly. We could use all sorts of ways to more accurately figure out the function that the data actually seems to conform to, but Trendlines seems to do this quickly if not precisely.

Is there some Trendline object that we can get at to get that nice little formula that you can have put onto your chart when you use Trendlines? I since have played with it and found that there is… ActiveChart.SeriesCollection(1).Trendlines(1).DataLabel.Text is what I was looking for. As it is text, you’d have to parse it and so it might be easier to do the other methods you mentioned, but it might be doable.

Anyways, that’s what I was badly referring to.

why not use trend?

I use it excessively. use index(match()):index(match()+1) for the X and Y range and you’re done.

the error bars are a new one – I usually set up a start and end point, but this can be a bother when your axis aren’t at zero.

after posting the trend I wondered:

I’ve spend many hours trying to find a nice, concise way to do quadratic or cubic interpolation.

Of course it’s possible to use linest() to find the formula variables… but is there another way?

btw, another nice feature of the trend function is that is allowas you to do multiple variable linear interpolation.

When you have a graph with multiple lines for different conditions (say your flow/guage ratio is dependent on temperature. You can find the four points bracketing your search condition using the match() function, and then input a 2×2 matrix as you input X, and a 1×2 as your target X and watch the magic happen!

Using minus error bars at 100% to show the intercept with the axis? That’s awesome, never thought of that!

Ikkeman – Below are some Excel formulas for various types of fit (see Wikipedia interpolation article for more details). These return Y for a given X value, to find X given Y we can just swap the A’s and B’s in the formulas.

As Jon has said before, despite their popularity polynomial trends are a poor choice for fitting many types of data. The other methods are quite close for the given data as the direction of the curve is not changing much around the estimated points.

Interp Trend Linear Cubic Spline

9.789877 0.53174 0.53000 0.53074 0.53025

0.400368 0.34602 0.35000 0.35336 0.35096

Trend – Cubic polynomial (equivalent to polynomial trend line order 3):

=TREND(B6:B18,A6:A18^{1,2,3},A2^{1,2,3})

Linear – For increasing data (or for decreasing put 1- after first comma):

=PERCENTILE(B6:B18,PERCENTRANK(A6:A18,A2,20))

Legendre – A cubic fit based on the surrounding 4 points:*

=TREND(OFFSET(B6,MATCH(A2,A6:A18)-2,,4),OFFSET(A6,MATCH(A2,A6:A18)-2,,4)^{1,2,3},A2^{1,2,3})

Spline – Catmull-Rom (equivalent to “smooth line” option):*

=SUM((1+1/IRR(MMULT({0,0,2,0;0,1,0,-1;-1,4,-5,2;1,-3,3,-1},OFFSET(A6,MATCH(A2,A6:A18)-2,,4)-A2

)))^-{0;1;2;3}*MMULT({0,2,0,0;-1,0,1,0;2,-5,4,-1;-1,3,-3,1},OFFSET(B6,MATCH(A2,A6:A18)-2,,4)))/2

* for estimation in end intervals you can use extrapolated points at each end:

A5:B5 = {-0.10022,0.29} A19:B19 = {21.82408,0.616791}

Ihem – thanks for sharing that compact matrix formula for one type of cubic spline. That may be the most impressive non-array formula that I’ve ever seen in Excel. I’d never thought of using the Internal Rate of Return function for fitting non-financial data.

If anyone ever wants a smoother interpolation than piecewise linear, cubic splines are usually the way to go. Fortunately there are good free add-ins that make cubic splines easy to calculate in Excel.

DaleW – Happy to share, took a while to figure out! It’s a little known fact that you can convert nearly any array formula to a non-array formula with MMULT, eg

Array =SUM(IF(A1:A5>2,B1:B5)) Non-Array =SUM(IF(MMULT(N(A1:A5>2),1),B1:B5))

The straight/smooth chart line options have the property that they maintain the same shape if the configuration of points is rotated. This can be seen in the formulas since X on Y followed by Y on X returns the original result with the linear/spline options, but not for trend or Lagrange polynomials (not Legendre, always get those guys mixed up!) VBA code for chart data is here

Splines have the intuitive appeal that they model the shape taken up by a smooth thin flexible strip passing through the set of points. In the linear case, the strip is pulled taught whereas in the cubic case there is no tension at the endpoints.

Ihem – Interesting, I’d never seen MMULT() used to replace an array formula. In real life, we’d both use =SUMIF() for your example to have error checking, as otherwise a string entry in column A gives what we’d typically consider a bug. MMULT does duplicate even such … features.

Natural cubic splines or Catmull-Rom cubic splines for exactly 4 points would follow your zero tension at the endpoints analogy. Other interpolating cubic splines have tension. Cardinal splines use piecewise cubic polynomials with a tension parameter ranging from 0 (for Catmull-Rom) to 1 (where they collapse to piecewise linear interpolation). Thus Cardinal splines can replicate straight/smooth lines, or anywhere in between.

DaleW – You make excellent points. For the MMULT method, a better example would be to apply to: =MIN(IF(A1:A5=”a”,B1:B5)). I don’t know of another way to make this type of formula non-array entry, even using INDEX(A1:A5,0) fails.

I find array formulae should be avoided ilke the plague if there’s even a remote possibility anyone else will be using the spreadsheet because you’ll end up tech-supporting it when they hit enter on a cell. Either redesign with extra columns if possible and if not then write a user defined function in one of the modules.

what about if you have columns and rows. for example 6,8,10( in the column) and 8,10,12 (row). and under those is an array of set of values. like if i want to canculate

8 and 10.4

Luu –

If the two values are between known values (say, 8.4 and 10.4), you have to interpolate first to get values at 8.4 for 10 and for 12 (two interpolations), then interpolate again between these results to get a value at 10.4.

Hi,

I’m no mathematician but i have tried my best to follow your logic………..

I was wondering if there was a way to Interpolate a point on a line between 2 known points. Not knowing the X or Y value? Example of data:

X Y

1 6.3 2.12

2 7.6 2.19

3 8.6 2.19

4 9.6 2.16

On a Scatter with Smooth Lines & Markers graph the peak of the line will be between point 2 & 3. Is there a way to Interpolate that point? (manually graphing it would give me an interpolation point of X=8.0 & Y=2.20) (X rounded to 0.5)

I appreciate your assistance in this matter!

Daniel

Daniel –

You want to find where the smoothed drawn line has its maximum? I usually advise people not to use smoothed lines. They give a false indication of where data may go. They are not based on any physical model of the data, but only on artistically connecting all points in a pretty way, using curve-drawing parameters that “looked nice” to the same kind of designer that gave us Excel’s early color palette. Smoothed lines ignore any measurement error in the data which is actually present, they ignore all points in the data set other than the two being connected and the point on either side, and if markers are not displayed the viewer has no idea where the actual data lies.

That said, for symmetric points, the usual approach is to fit a quadratic curve (2d order polynomial) to the points, and calculate a maximum using the fitted coefficients. A quadratic fit is probably reasonable, since physical processes often can be modeled by linear and quadratic relationships.

Your points are not symmetric, so the fitted parabola does not pass through all the points; instead it lies in such a way as to minimize the total deviation of the parabola from the data points. You could fit a third order polynomial to your data and calculate your maximum, but this starts to get into the realm of overfitting.

Daniel,

Realistically, close to X=8 is your best guess for the maximum.

If you had really accurate measurements, an approach known as cubic splines would be great for fitting a smooth curve exactly to your data. In fact, the smooth lines that Excel charts draw are one type of cubic spline.

With four points, Excel can fit a cubic (3rd order) polynomial smoothly and exactly through the points. If you play with that equation, you’ll see it predicts the maximum at just over 8.08, and some people would consider that the best answer. (It’s really close to the natural cubic spline solution, also.)

Jon’s right that anything past a linear fit without a physical model or lots of supporting data is risky or wishful thinking. But if we are convinced that the data is highly accurate and don’t have a model, cubic splines are commonly used to create the smooth interpolating curve. In Excel, if we have two nearest neighbor points on each side of the region of interest, a cubic polynomial tends to be a very close approximation to the cubic spline fit there.

Hi Jon

I have you example above to solve for X using data below and it does not work. I followed your example and the subtituted with my data and it gives me an error message on A2. Any idea why?

X Y

0.12 3

0.15 2.67

0.18 2.45

0.22 2.24

0.28 2.02

0.3 1.94

0.35 1.82

0.38 1.74

0.46 1.6

0.54 1.48

0.6 1.42

0.68 1.34

0.78 1.26

0.9 1.19

1.05 1.12

1.2 1.06

1.4 1

Tshepo –

Did you change the formulas to include your whole data range (mine stopped at row 18, yours continues to row 22)?

Cell A2 is where you enter an X value for which you want to calculate a Y value. It should not contain an error. What X value were you entering?

Was the error in cell A3? Was it because the requested X value was outside of the range of your X data?

A more effective approach for your data might be a power law fit: such a fit with your data has an R² value of 0.9991.

Hi Jon

I did expand data to row 22 and I am trying to solve X not Y. I get an error on B3 and A2 but when I use data from your example it work fine.

Tshepo –

So:

B2 has the known Y

B3 =MATCH(B2,B6:B22)

A2 =INDEX(A6:A22,B3) + (B2-INDEX(B6:B22,B3)) * (INDEX(A6:A22,B3+1)-INDEX(A6:A22,B3)) / (INDEX(B6:B22,B3+1)-INDEX(B6:B22,B3))

But my formula assumed the range being matched was monotonically increasing, which I never stated. If I include the default value of the optional third parameter, the formula is

=MATCH(B2,B6:B22,1)

which tells Excel the data is increasing and to return the index the closest value greater than the sought value. Your formula needs to be

=MATCH(B2,B6:B22,-1)

which tells Excel the data is decreasing and to return the index the closest value less than the sought value.

I still suggest using a power law fit.

Jon

Thank you very much for help. I am working on learning on how power fit works.

Thank you very much

Hi Jon,

I’m also finding a very strange error using the spline interpolation formula. I am interpolating the following set of data:

x y

1.798179868 165.3955251

1.804183302 165.0206621

2.230427067 139.8702098

2.428540367 113.9704961

2.5966365 65.08009679

2.752725766 21.02938029

2.842777266 25.76285129

2.932828766 20.8522448

3.022880265 47.45262432

3.148952365 30.61012109

3.275024465 26.35386096

3.443120598 15.5813369

3.869364363 -31.93964513

4.031457063 -44.2596589

4.133515429 -59.06686391

4.235573796 -77.81866744

4.313618429 -109.5812402

4.391663062 -124.4431029

4.469707695 -185.7945248

4.547752328 -209.4166568

4.601783228 -224.5765144

4.667820995 -267.6385983

4.733858761 -341.6070167

4.787889661 -402.0504547

4.841920561 -424.5273038

4.895951461 -484.5497172

4.949982361 -513.475427

5.376226126 -645.3444851

5.394236426 -648.4536159

5.40023986 -650.1201883

When I interpolate, I find that the following x values generate a #NUM! error:

4.901954894

4.901954894

4.901954894

All the other values I am trying (less than and/or greater than these, and one in between the first two) seem to interpolate fine.

The x values at which I am trying to interpolate are part of series of data. I am doing a calculation for some y values, and then interpolating the remaining x values. When I make small changes to the x value range, or use a different number of calculated points, there is often no problem at all. Like the example above, the y values follow a generally decreasing trend, but this is not always the case and y may go up and down. The x values are always monotonically increasing.

Let me know if you want any additional info. Thanks for any help you can provide in diagnosing this. It is a very vexing problem!

-Charlie

I forgot to mention above that, when I switched to the “Legendre” interpolation formula, THE EXACT SAME POINTS produced the same #NUM! error.

Maybe this is a hint about the source of the problem…

-Charlie

Ooops, a correction… the “Legendre” formula seems to work just fine. I discovered an error that I had made when entering it. The Spline still does not work, however.

I have created a spreadsheet with the data mentioned above, and a correction to the “problem” data points to help diagnose the problem. It contains the SPLINE and LEGENDRE interpolation formulas, and shows that the SPLINE fails when one of the problem x values is entered.

You can download it at:

Thanks for any help!!!!

-Charlie

Hi CharlieL,

Alas, this particular formula for this particular cubic spline indeed fails for part of your dataset.

That segment of your Catmull-Rom spline happens not to be monotonic for predicting Y from X. This causes the IRR() formula to fail rather interestingly in at least 5 subintervals of that segment’s range in X, for X values that are monovalent instead of trivalent!?

While I was able to calculate and chart the real Catmull-Rom spline directly from its defnition as a t-function-weighted sum of the two nearest points in each direction in X, you possibly don’t want a spline that behaves this way to fit your dataset. Perhaps a natural cubic spline or a monotonically constrained cubic hermite spline would be a better choice? The Newton Excel Bach blog has done a wonderful job setting up VBA code to let Excel users explore a variety of different cubic splines.

PS — Far worse than not being monotonic, it’s not monovalent. This type of spline can reverse direction in a way that gives multiple Y values for the same X value, and the IRR formula doesn’t work in some subsections of segments with that behavior. Wrong spline if you need a unique Y=f(X) fit.

Hi DaleW,

Thanks for posting an explanation of the problem. I do need a worksheet-formula-only way to do the interpolation (e.g. no VBA or macros), and I would like a smooth curve a la spline or similar. For now I have switched to the Legendre formula, and that seems to work OK for the type of data I am generating. I am still learning about how to implement matrix multiplication via the MMULT function, and I am not up to coming up with something new myself at this point, although I would like to learn how to do it. Also, it’s not clear to me whether the problem that I described in my post was due to a limitation of the Catmull-Rom spline itself, or the implementation via the formula. Can you please elaborate on this?

I need the interpolated line to go through the points in the data set. You mentioned that this is the “wrong” spline – how about a B-Spline? For instance, the Uniform Cubic B-Spline described here:

http://en.wikipedia.org/wiki/B-spline#Uniform_cubic_B-splines

has an explicit matrix form that seems to lend itself well to this approach (from what I can tell). Can you (or someone else reading this post) help me implement it using IIR and MMULT?

Thanks!

-Charlie

Charlie,

B-Splines don’t go through their control points, which is inconvenient when you want to interpolate actual data.

Catmull-Rom splines do interpolate their control points, but they aren’t constrained by the concept that there should be one and only one Y value for each X value. I believe some IRR() problems have more than one rate of return that solves the constraints, but a “well-defined” mathematical function can only return one value for a given value of X. Thus IRR() fails to capture parts of some Catmull-Rom spline segments, which a weighted sum of the control points in terms of a t parameter going from 0 to 1 in each segment does reveal.

Assuming you’d like a “well-defined” and smooth continuous interpolating function for Y on X, I would recommend the classic solution of natural cubic splines. Usually this is implemented in programming languages, and I’ve got more than one VBA solution squirreled away. I don’t imagine that solving the sparse symmetric matrix equations for under 50 points would be that hard to program using the cells of a spreadsheet, if one were so inclined.

The Legendre formula is the cubic polynomial solution, or the simplest natural cubic spline case. The Legendre formula will not give you continuous derivatives if you use it for more than 4 total points. In one mathematical sense, natural cubic splines offer the smoothest possible interpolation. Natural cubic splines aren’t likely to be used for graphics because it isn’t a local solution like Legendre or Catmull-Rom, and every point instead of just 4 local points can have some impact every segment of the spline, making it more computationally intensive.

Hi DaleW,

Now that I checked into it further, you are correct about B-splines not passing though each data point. A cubic Spline would be the best but my data set can be up to 600 points, and doing the large matrix for that would be too much overhead. The Lagrange interpolation seems to be “good enough” for me, despite the occasional cusp in the interpolation where there is a derivative discontinuity.

-Charlie

Charlie,

Good enough!

Incidentally, your dataset can be used to show that Excel must modify the Catmull-Rom spline algorithm to draw its smoothed lines, because Excel smoothed lines usually overlap that spline yet avoid the pathology that you found. As Answers.Microsoft notes, “An adjustment to the basic method is applied if consecutive points are a factor of three or more apart so as to reduce the amount by which the curve overshoots points.” I interpret this as a sort of false position method, where any outer control point that further that 3x the distance between the inner control points is pulled closer. Thus on your troublesome (X,Y) interval between (4.895951461, -484.5497172) and (4.949982361, -513.475427), the next point is treated as if it were closer at (5.2305,-600.2523) for the purposes of the Catmull-Rom algorithm on that segment only, which removes the strange behavior and restores single-valued functionality and lets the IRR() formula work, as well as duplicates the Excel smoothed line exactly on my plot. This might be more trouble than it is worth for you to duplicate Excel’s native interpolating spline — but it would be a lot easier than solving the tridiagonal matrix equation for natural cubic splines. You’d just need to create a table of the 4 XY control points for each of N-3 inner segments where you say your full table is N=600, and the Catmull-Rom OFFSET logic would have to point to that table. This would provide C1 continuity (of 1st derivatives), but not the C2 continuity (of 2nd derivatives also) that is provided by natural cubic splines. Your Legendre solution only has C0 continuity (of the function itself) at the data points (but C2 continuity elsewhere like all cubic splines within each segment, not necessarily between segments).

BTW, are you sure you want spline interpolation instead of LOESS smoothing?

Dale –

Brian Murphy of UT Austin worked out bezier algorithms that seem to have been used for smoothing series lines in Excel charts. I think the details can be found at xlrotor.com.

Jon – yes, I saw that. I believe that the Bezier algorithms may be very similar, but MS Excel actually uses modified Catmull-Rom splines, per the July 4, 2011 post by Answers.Microsoft in response to Tushar Mehta. The VBA code is there, but I think that I did duplicate quickly the Excel charting smooth curve for this troublesome case by using a false position for a far outer control point as I tried to describe using both Catmull-Rom formulas (the quick IRR() one discussed on your blog, and the formal definition in terms of weighted sums of the 4 basis points using t-parameter polynomials).

Charlie (and Jon),

Those Bezier algorithms match Excel’s smoothing to within a few pixels for the troublesome segment. They are in VBA and parametric, so it might take quite a bit of work to implement them to evaluate a particular X value without VBA.

Without using VBA, I picked up on your idea and implemented Ihem’s IRR() and MMULT() formula for Catmull Rom with more or less the modification suggested by Lori Miller for highly unevenly spaced data, and it looks good to within a pixel or two compared to Excel’s native curve fitting, even for your segment where simple Catmull Rom misbehaved. The key formula to find the adjusted spline Y for any X had to be switched to an array formula to work for me:

{=SUM((1+1/IRR(MMULT({0,0,2,0;0,1,0,-1;-1,4,-5,2;1,-3,3,-1},(TRANSPOSE(OFFSET(xQuad,$Q5-1,0,1,4))-$O5))))^-{0;1;2;3}*MMULT({0,2,0,0;-1,0,1,0;2,-5,4,-1;-1,3,-3,1},TRANSPOSE(OFFSET(yQuad,$Q5-1,0,1,4))))/2}

The transpose was necessary with matrix logic to accommodate a table (starting at xQuad and then yQuad) of possibly linear-interpolation adjusted outer influence points created by IF logic, with 4 XY points defined in 8 columns shortly after each inner interval as defined by the original XY data columns in A & B. None of this extra effort is needed if the X data is roughly evenly spaced, but your data wasn’t. The Q column holds the Index from Match(), again with slight modification for the endpoints. The O column held the arbitrary X points to evaluate.

CharleyL – Interesting example and glad the suggested formulas have been useful to you. Strangely, in my Excel 2010 version the spline formula does actually return values for the cases highlighted but further investigation reveals this is due to improvements in the IRR implementation in this version. I find the formula matches the curve for the most part but there are a few spurious values where it is picking up another root. An alternative is to use the more natural version of the formula with 1/IRR(MMULT({0,0,2,0;0,1,0,-1;-1,4,-5,2;1,-3,3,-1} replaced by IRR(MMULT({0,2,0,0;-1,0,1,0;2,-5,4,-1;-1,3,-3,1} which does return the correct values for your example. I seem to remember the point of using 1/IRR of the reversed matrix was related to avoiding 0^0 errors when the formula was evaluated exactly at a data point but these can be easily dealt with by adding a negligible constant like 1E-99. Note also the formula is for a 2D spline but could also be modified to a 1D version which would ensure single-valued results. In many cases where data is fairly evenly spaced there won’t be a large difference between these two methods.

DaleW – Nice analysis. The modification that the charting algorithm makes can be seen in various ways. Suppose there are four consecutive points A,B,C,D and you want to construct the curve segment between B and C. In terms of Hermite splines the tension parameter implemented by the chart algorithm is chosen to be t=max(3,|AC|/|BC| ,|BD|/|BC|)/6 (based on this definition of tension and where |.| denotes distance between points as measured on the chart ). In terms of Bezier curves this condition enforces that the control points are never more than a distance |BC|/2 from B or C so do not cross each other. For most normal cases where points are fairly evenly spaced this reduces to the Catmull-Rom case (t=1/2) . Doug Jenkins has many more useful posts and VBA functions on these topics which are well worth looking at as you say – there are a few formulas and comments related to the connection with natural splines are here.

Jon – The xlrotor code is normally fairly close but can deviate quite substantially in some cases. For example entering these (x,y) values: (9,8) (10,12) (11,12) (14,6) in the Bezier example file link gives clearly incorrect values for 10 < x < 11. The vba code posted at answers.microsoft may be the first accurate algorithm available publicly.

Hi Jon Peltier et al.-

I’m not a math expert. But after trying to fit your equations for the last 2 hrs, finally I gave up! And really don’t understand why the interpolation is not working for my database (excel file link below)

http://www.mediafire.com/view/?hdpdpu641qdpxak

I would be very glad if I find any solution. Thanks- Rajib

There was no data in the posted worksheet, so I couldn’t interpolate values either.

Sorry Jon Peltier, There was something wrong. Please find the data in excel as well as in a tab delimited .txt file below:

http://www.mediafire.com/view/?1bo9he3c3jz05u7

http://www.mediafire.com/view/?tmyuta23ajjj27i

Best/

Rajib

The simple linear interpolation methodology worked for me. I used the equations to linearly interpolate y values for a column of x values.

Many Thanks

I’ve been trying to get this to work with a very small set of yearly data so:

X = year in the format 2001

Y = a price

I have this data for 12 years

i want to estimate a monthly value in any of those twelve years.

However, i’m struggling to do this with your spreadsheet.

For example i tried 2001.083 which could predicate a month

however i get a ref error in A2 and i dont think this is the best way of doing it.

Ultimately i want to try and estimate each monthly figure based on a set of yearly final figure. So the value of months 1 through 12 should equal the current yearly figure.

Hope this makes sens.e

Cell A2 should have no error, because it’s the cell you enter the X value for which you want to determine a Y value. So with 2001.083 in A2 you should get a value that is 1/12 of the way from the 2001 value to the 2002 value in cell B2.

Also, this sentence doesn’t make sense if you’re trying to estimate monthly values:

“So the value of months 1 through 12 should equal the current yearly figure.”

Pl indicate Excel program (MAC version) to use the Piece-wise Hermite Interpolation.

Ravi –

Try Google.

Clear, concise, and exactly what I needed. Thanks a lot – HJ

hi

is this formula in excel uses for quadratic spline or another interpolation?

Satria –

The interpolation formulas I introduced in the post are simple linear interpolations.

In the discussion, several commenters cited other kinds of fit, including splines.

Please any one can do it for me or tell me how to do this

regards

maak_88@yahoo.com

y x X Values to be interpolated

1 99.72183534 99.56709957

2 99.4428926 99.13793103

3 99.1631652 98.71244635

4 98.88264649 98.29059829

5 98.60132972 97.87234043

6 98.31920803 97.45762712

7 98.03627447 97.0464135

8 97.75252199 96.63865546

9 97.46794345 96.23430962

10 97.18253158 95.83333333

11 96.89627902 95.43568465

12 96.60917831 95.04132231

13 96.32122185 94.65020576

14 96.03240194 94.26229508

15 95.74271078 93.87755102

16 95.45214042 93.49593496

17 95.16068282 93.11740891

18 94.86832981 92.74193548

19 94.57507306 92.36947791

20 94.28090416 92

21 93.98581453 91.63346614

22 93.68979548 91.26984127

23 93.39283817 90.90909091

24 93.09493363 90.5511811

25 92.79607271 90.19607843

26 92.49624617 89.84375

27 92.19544457 89.49416342

28 91.89365835 89.14728682

29 91.59087776 88.8030888

30 91.28709292 88.46153846

31 90.98229376 88.12260536

32 90.67647006 87.78625954

33 90.36961141 87.45247148

34 90.06170724 87.12121212

35 89.75274679 86.79245283

36 89.4427191 86.46616541

37 89.13161305 86.1423221

38 88.8194173 85.82089552

39 88.50612032 85.50185874

40 88.19171037 85.18518519

41 87.87617551 84.87084871

42 87.55950358 84.55882353

43 87.24168219 84.24908425

44 86.92269874 83.94160584

45 86.60254038 83.63636364

46 86.28119404 83.33333333

47 85.95864639 83.03249097

48 85.63488386 82.73381295

49 85.30989261 82.43727599

50 84.98365856 82.14285714

51 84.65616733 81.85053381

52 84.32740427 81.56028369

53 83.99735446 81.27208481

54 83.66600265 80.98591549

55 83.33333333 80.70175439

56 82.99933065 80.41958042

57 82.66397845 80.13937282

58 82.32726023 79.86111111

59 81.98915917 79.58477509

60 81.64965809 79.31034483

61 81.30873945 79.03780069

62 80.96638534 78.76712329

63 80.62257748 78.49829352

64 80.27729719 78.23129252

65 79.93052539 77.96610169

66 79.58224258 77.7027027

67 79.23242883 77.44107744

68 78.88106377 77.18120805

69 78.5281266 76.92307692

70 78.173596 76.66666667

71 77.8174502 76.41196013

72 77.45966692 76.1589404

73 77.10022337 75.90759076

74 76.73909622 75.65789474

75 76.37626158 75.40983607

76 76.01169501 75.16339869

77 75.64537145 74.91856678

78 75.27726527 74.67532468

79 74.90735018 74.43365696

80 74.53559925 74.19354839

81 74.16198487 73.95498392

82 73.78647874 73.71794872

83 73.40905182 73.48242812

84 73.02967433 73.24840764

85 72.64831573 73.01587302

86 72.26494463 72.78481013

87 71.87952884 72.55520505

88 71.4920353 72.32704403

89 71.10243003 72.10031348

90 70.71067812 71.875

91 70.3167437 71.65109034

92 69.92058988 71.42857143

93 69.52217872 71.20743034

94 69.12147118 70.98765432

95 68.71842709 70.76923077

96 68.31300511 70.55214724

97 67.90516262 70.33639144

98 67.49485577 70.12195122

99 67.08203932 69.90881459

100 66.66666667 69.6969697

101 66.24868971 69.48640483

102 65.82805886 69.27710843

103 65.4047229 69.06906907

104 64.97862897 68.86227545

105 64.54972244 68.65671642

106 64.11794687 68.45238095

107 63.68324392 68.24925816

108 63.2455532 68.04733728

109 62.80481227 67.84660767

110 62.36095645 67.64705882

111 61.91391874 67.44868035

112 61.46362972 67.25146199

113 61.01001739 67.05539359

114 60.55300708 66.86046512

115 60.09252126 66.66666667

116 59.6284794 66.47398844

117 59.16079783 66.28242075

118 58.68938954 66.09195402

119 58.21416399 65.9025788

120 57.73502692 65.71428571

121 57.25188012 65.52706553

122 56.76462122 65.34090909

123 56.27314339 65.15580737

124 55.7773351 64.97175141

125 55.27707984 64.78873239

126 54.77225575 64.60674157

127 54.26273532 64.42577031

128 53.74838499 64.24581006

129 53.22906474 64.06685237

130 52.70462767 63.88888889

131 52.17491947 63.71191136

132 51.63977795 63.5359116

133 51.09903239 63.36088154

134 50.55250296 63.18681319

135 50 63.01369863

136 49.44132325 62.84153005

137 48.876261 62.67029973

138 48.30458915 62.5

139 47.72607021 62.33062331

140 47.14045208 62.16216216

141 46.54746681 61.99460916

142 45.94682917 61.82795699

143 45.33823503 61.66219839

144 44.72135955 61.4973262

145 44.09585518 61.33333333

146 43.46134937 61.17021277

147 42.81744193 61.00795756

148 42.16370214 60.84656085

149 41.49966533 60.68601583

150 40.82482905 60.52631579

151 40.1386486 60.36745407

152 39.44053189 60.20942408

153 38.72983346 60.05221932

154 38.0058475 59.89583333

155 37.26779962 59.74025974

156 36.51483717 59.58549223

157 35.74601765 59.43152455

158 34.96029494 59.27835052

159 34.15650255 59.12596401

160 33.33333333 58.97435897

161 32.48931448 58.82352941

162 31.6227766 58.67346939

163 30.73181486 58.52417303

164 29.8142397 58.37563452

165 28.86751346 58.2278481

166 27.88866755 58.08080808

167 26.87419249 57.93450882

168 25.81988897 57.78894472

169 24.72066162 57.64411028

170 23.57022604 57.5

171 22.36067977 57.35660848

172 21.08185107 57.21393035

173 19.72026594 57.0719603

174 18.25741858 56.93069307

175 16.66666667 56.79012346

176 14.90711985 56.65024631

177 12.90994449 56.51105651

178 10.54092553 56.37254902

179 7.453559925 56.23471883

180 0 56.09756098

55.96107056

55.82524272

55.69007264

55.55555556

55.42168675

55.28846154

55.1558753

55.02392344

54.89260143

54.76190476

54.63182898

54.50236967

54.37352246

54.24528302

54.11764706

53.99061033

53.86416862

53.73831776

53.61305361

53.48837209

53.36426914

53.24074074

53.11778291

52.99539171

52.87356322

52.75229358

52.63157895

52.51141553

52.39179954

52.27272727

52.15419501

52.0361991

51.91873589

51.8018018

51.68539326

51.56950673

51.4541387

51.33928571

51.22494432

51.11111111

50.99778271

50.88495575

50.77262693

50.66079295

50.54945055

50.43859649

50.32822757

50.21834061

50.10893246

50

49.89154013

49.78354978

49.67602592

49.56896552

49.46236559

49.35622318

49.25053533

49.14529915

49.04051173

48.93617021

48.83227176

48.72881356

48.62579281

48.52320675

48.42105263

48.31932773

48.21802935

48.11715481

48.01670146

47.91666667

47.81704782

47.71784232

47.61904762

47.52066116

47.42268041

47.32510288

47.22792608

47.13114754

47.03476483

46.93877551

46.84317719

46.74796748

46.65314402

46.55870445

46.46464646

46.37096774

46.277666

46.18473896

46.09218437

46

Maak –

I presume you want a new Y value for each X value in your second list.

I put the pairs of X and Y into X-Y order, not Y-X order (columns A:B, see screenshot below).

I put the new X values into another column E.

I used column D for an index, which points to the value in the column of input X values closest to but greater than the new X value. The formula in D2 is as follows, and copied down as far as there are values in column E:

=MATCH(E2,$A$2:$A$181,-1)

The third argument, -1, is needed because the input X values are in decreasing order.

I used column F for the interpolated Y values. The formula in F2 is as follows, copied down as far as needed:

=(E2-INDEX($A$2:$A$181,D2))/(INDEX($A$2:$A$181,D2+1)-INDEX($A$2:$A$181,D2))*(INDEX($B$2:$B$181,D2+1)-INDEX($B$2:$B$181,D2))+INDEX($B$2:$B$181,D2)

It’s basically the same formula as in cell B2 of the example.

Hi,

I see that the graph never fall back (it ascends). But what if when the curve falls back (quadratic – so that each y value will have two x values). How can we interpolate for x for a given y value? Given: y=0.61357555

The correct answer is x=867.5618709 with this formula: =A20+((L3-B20)/(B21-B20))*(A21-A20). But I don’t need to manually search for the lower and upper bounds of the given value in y and corresponding x. I need the first (smaller x value).

Thanks,

Here is the data.

y x

37.18656

0.255633 103.88304

0.250945 188.28446

0.250528 262.19693

0.259958 375.86513

0.327046 492.14553

0.429768 630.57233

0.584134 782.98033

0.645869 960.33653

0.640969 1137.91573

0.610922 1296.88933

0.588545 1448.95363

0.598201 1595.24113

0.586454 1767.37863

0.579821 1920.10453

0.513784 2063.63463

0.509153 2206.45073

0.515956 2366.94803

0.470963 2507.86673

0.440226 2668.59793

0.415335 2784.40273

0.357285 2873.33485

0.338649 2966.22146

0.331126 3032.07235

0.331747 3082.98812

0.299145 3146.48242

0.291289 3222.21814

0.287335 3268.67082

37.18656 is under X while the corresponding Y value is missing.

Hi can you help me in this problem, I need to determine the values between minimun x =1 and the maximum x = -1,1 y = 46,99 and maximum y =139…my result it is a linear chart and should be a curve…I tried to determine the values like a progression but, it is clear it ´snt the way…

x_Coord1 = 1 and x_Coord2 = -1,1

y_Coord1 = 46,99 and y_Coord2= 139

x_Coord1= 2 and x_Coord2= 9,8

y_Coord1 = 44,23 and y_Coord2= 218

x_Coord1 3 and x_Coord2 10,4

x_Coord1 4 and x_Coord2 9,2

x_Coord1 5 and x_Coord2 7,5

x_Coord1 6 and x_Coord2 8,8

x_Coord1 7 and x_Coord2 8

y_Coord1 33,96 and y_Coord2 174

y_Coord1 30,36 and y_Coord2 192

y_Coord1 27,69 and y_Coord2 184

y_Coord1 23,84 and y_Coord2 177

y_Coord1 18,86 and y_Coord2 165

I found a solution in the comments, thanks

I understand the given problem. I’m trying to run this operation in an x-y grid and cannot seem to figure it out… Any tips?

Elaboration…

This is a stream function on a grid showing a numerical value at each (x,y) coordinate. The blue line represents a streamline that starts at 0.1 and I’m tracing it above the numbers in that row as the value goes down 0.1’s relationship is higher. Now, this is just me eyeballing it. I want to interpolate at 0.1 to find the exact values of this line…

I’m looking for some kind of contours of equal value, which is what interpolation could help with, but that’s not apparently what you’re trying to draw.

Actually, the contour of equal value would be the same thing as the streamline.

But that’s not a line of equal value, at least of the values in the grid. In the first column, the line is at a value between 0.02 and 0.04. In the second it’s between 0.01 and 0.02. In the third it’s between -0.01 and -0.01. But there can be no value between these three pairs of limits, since they do not overlap. This is why I don’t understand what you are trying to plot.

Hi Jon

Does this formula work, if one the axis in logarithemnic scale to base 10? I have Particle size on X axis with logaritamic scale and %Fine on Y axis. I want to interpolate value on x-axis for a know point on Y axis (say 60).

Particle Size %Finer

4.75 65.52

2 50.98

1.18 36.2

0.6 30.34

0.425 19.78

0.3 9.3

0.15 4.02

0.075 0.2

Would be greatful, if you could help.

Than you

Anant –

Good point. Interpolation gives us the coordinates of a point on a straight line between two known points. But we get a different straight line depending on our coordinate system. The straight line we get using linear X and Y (blue in the charts below) is not the same as the straight line we get when our X axis is logarithmic (orange). These are the last two points of your input data, sorted in increasing order or particle size.

We need to use a modification to our formula if we want to capture the logarithmic straight line. Below are three sets of data: Your input data sorted by increasing particle size (top), three particle sizes and their % finer values calculated using linear formulas (blue), and the same particle sizes and their % finer values calculated using logarithmic formulas (orange).

Formulas in the blue (linear) table are as follows, filled down to compute all three values:

B14:

`=MATCH(C14,$C$3:$C$10)`

D14:

`=INDEX(D$3:D$10,$B14)+(C14-INDEX($C$3:$C$10,B14))*`

(INDEX(D$3:D$10,B14+1)-INDEX(D$3:D$10,B14))/

(INDEX(C$3:C$10,B14+1)-INDEX(C$3:C$10,B14))

Formulas in the orange (logarithmic) table are:

B20:

`=MATCH(C20,$C$3:$C$10)`

D20:

`=INDEX(D$3:D$10,$B20)+(LOG($C20)-LOG(INDEX($C$3:$C$10,$B20)))*`

(INDEX(D$3:D$10,$B20+1)-INDEX(D$3:D$10,$B20))/

(LOG(INDEX($C$3:$C$10,$B20+1))-LOG(INDEX($C$3:$C$10,$B20)))

All of the particle size measurements are converted to their logarithms in the interpolation formula.

The charts below show how the blue (linear) points fall on the straight lines of the linear chart (left) while the orange (log) points fall on the straight lines of the logarithmic plot (right).

Has anyone gotten this formula to work in the latest version of excel? I have copied and pasted the exact formula and the exact data into the same cells and I keep getting an error message that long story short, tells me that the formula is messed up in some way.

CG –

There’s no reason any valid formula in one version of Excel will not work in a later version.

Which formula is it? How is it entered? What error message are you getting? In which earlier version of Excel was it working.

THANK YOU