In Get a Maximum and Minimum Value from Certain Charts, John Mansfield showed a simple routine he uses to find the maximum and minimum Y values in a chart. In the ensuing discussion, alternatives were suggested with various justifications. I wanted to say that this one was faster than that one, but I realized that I didn’t know what was fastest. So I decided a series of time trials were called for.
Experimental Design
There are two ways to find the minimum and maximum of a set of values:
- Loop through the values, and keep track of the largest and smallest you’ve encountered
- Use the worksheet functions MIN() and MAX() (i.e., WorksheetFunction.Min and .Max)
There are three ways to obtain the values for each series:
- Use the .Values property of the series, which returns a 1-D array
- Parse the series formula to identify the range, then get the .Value of this range object, which returns a 2-D array
- Parse the series formula to identify the range, and analyze the range itself
This suggests a 2×3 matrix, but I eliminated looping cell-by-cell through the range to find the largest and smallest values, since experience shows that looping through the cells of a range in this way is much slower than using the .Value property of the range to generate an array, and working with this array.
The five-block test matrix becomes:
- Looping Over Series Values
- Worksheet Function on Series Values
- Looping Over Range Values
- Worksheet Function on Range Values
- Worksheet Function on Range Object
I wrote VBA procedures to perform each of these approaches. Then I wrote a master procedure that called each of these procedures 1000 times, 50 times each in 20 blocks, so I’d have the mean time for each block. The overall mean and standard deviation of the block means can be used to compare the different approaches.
I used a simple XY chart with 1000 points. Both X and Y values were taken from the standard normal distribution. The procedures below could be adapted to check the Y values of any chart and the X values of an XY chart or a line/column/bar/area chart with a date scale X axis. Here the X values don’t matter since I was merely sampling Y.
I ran the tests in Excel 2003 SP3 on a reasonably modern laptop, running Windows 7 Ultimate with 4GB of memory on a 2.20 GHz dual-core AMD processor.
VBA Procedures
Looping Over Series Values
Sub Max_Min_Loop_Values()
Dim Cht As Chart
Dim Srs As Series
Dim i As Long
Dim A As Variant
Dim MinVal As Double
Dim MaxVal As Double
Set Cht = ActiveSheet.ChartObjects(1).Chart
Set Srs = Cht.SeriesCollection(1)
A = Srs.Values
MaxVal = A(LBound(A, 1))
MinVal = A(LBound(A, 1))
For i = LBound(A, 1) + 1 To UBound(A, 1)
If A(i) > MaxVal Then MaxVal = A(i)
If A(i) < MinVal Then MinVal = A(i)
Next i
End Sub
Assuming there is one chart on the active worksheet, and one series in the chart, we set the chart and the series simply. We assign the Y values of the series to an array, and initially set the minimum and maximum values to the first value in the array. Then we loop from the second to the last values in the array, and if a given value is larger than the maximum value so far, we set the new maximum to the current value, and likewise for the minimum. I didn’t output the minimum and maximum values, all I wanted was the time to run the core of the procedure.
Worksheet Function on Series Values
Sub Max_Min_Wkfn_Values()
Dim Cht As Chart
Dim Srs As Series
Dim A As Variant
Dim MinVal As Double
Dim MaxVal As Double
Set Cht = ActiveSheet.ChartObjects(1).Chart
Set Srs = Cht.SeriesCollection(1)
A = Srs.Values
With WorksheetFunction
MaxVal = .Max(A)
MinVal = .Min(A)
End With
End Sub
We set the chart and the series simply as above. We assign the Y values of the series to an array, and use the worksheet functions Min() and Max() to determine the minimum and maximum values.
Looping Over Range Values
Sub Max_Min_Loop_RangeArray()
Dim Cht As Chart
Dim Srs As Series
Dim i As Long
Dim A As Variant
Dim MinVal As Double
Dim MaxVal As Double
Dim sSeriesFmla As String
Set Cht = ActiveSheet.ChartObjects(1).Chart
Set Srs = Cht.SeriesCollection(1)
sSeriesFmla = Split(Srs.Formula, ",")(2)
A = Range(sSeriesFmla).Value
MaxVal = A(LBound(A, 1), 1)
MinVal = A(LBound(A, 1), 1)
For i = LBound(A, 1) + 1 To UBound(A, 1)
If A(i, 1) > MaxVal Then MaxVal = A(i, 1)
If A(i, 1) < MinVal Then MinVal = A(i, 1)
Next i
End Sub
We set the chart and the series simply as above. We parse the formula of the series to find the range, use the .Value property of the range to define an array, and initially set the minimum and maximum values to the first value in the array. Then we loop from the second to the last values in the array, and if a given value is larger than the maximum value so far, we set the new maximum to the current value, and likewise for the minimum.
Worksheet Function on Range Values
Sub Max_Min_Wkfn_RangeArray()
Dim Cht As Chart
Dim Srs As Series
Dim A As Variant
Dim MinVal As Double
Dim MaxVal As Double
Dim sSeriesFmla As String
Set Cht = ActiveSheet.ChartObjects(1).Chart
Set Srs = Cht.SeriesCollection(1)
sSeriesFmla = Split(Srs.Formula, ",")(2)
A = Range(sSeriesFmla).Value
With WorksheetFunction
MaxVal = .Max(A)
MinVal = .Min(A)
End With
End Sub
We set the chart and the series simply as above. We parse the formula of the series to find the range, use the .Value property of the range to define an array, and use the worksheet functions Min() and Max() to determine the minimum and maximum values.
Worksheet Function on Range Object
Sub Max_Min_Wkfn_Range()
Dim Cht As Chart
Dim Srs As Series
Dim rng As Range
Dim MinVal As Double
Dim MaxVal As Double
Dim sSeriesFmla As String
Set Cht = ActiveSheet.ChartObjects(1).Chart
Set Srs = Cht.SeriesCollection(1)
sSeriesFmla = Split(Srs.Formula, ",")(2)
Set rng = Range(sSeriesFmla)
With WorksheetFunction
MaxVal = .Max(rng)
MinVal = .Min(rng)
End With
End Sub
We set the chart and the series simply as above. We parse the formula of the series to find the range, and use the Worksheet Functions Min() and Max() to determine the minimum and maximum values of the range object directly.
Timing Procedure
Sub TestTimes()
Dim Looper As Long
Dim Blocker As Long
Dim tStart As Double
Dim tTotal As Double
Const LooperMax As Long = 50
Const BlockerMax As Long = 20
For Blocker = 1 To BlockerMax
tStart = Timer
For Looper = 1 To LooperMax
Max_Min_Loop_Values
Next
tTotal = Timer - tStart
WriteToLog "Loop - Values: " & tTotal & " sec for " & LooperMax & " iterations"
Next
For Blocker = 1 To BlockerMax
tStart = Timer
For Looper = 1 To LooperMax
Max_Min_Wkfn_Values
Next
tTotal = Timer - tStart
WriteToLog "Func - Values: " & tTotal & " sec for " & LooperMax & " iterations"
Next
For Blocker = 1 To BlockerMax
tStart = Timer
For Looper = 1 To LooperMax
Max_Min_Loop_RangeArray
Next
tTotal = Timer - tStart
WriteToLog "Loop - RngArr: " & tTotal & " sec for " & LooperMax & " iterations"
Next
For Blocker = 1 To BlockerMax
tStart = Timer
For Looper = 1 To LooperMax
Max_Min_Wkfn_RangeArray
Next
tTotal = Timer - tStart
WriteToLog "Func - RngArr: " & tTotal & " sec for " & LooperMax & " iterations"
Next
For Blocker = 1 To BlockerMax
tStart = Timer
For Looper = 1 To LooperMax
Max_Min_Wkfn_Range
Next
tTotal = Timer - tStart
WriteToLog "Func - Range: " & tTotal & " sec for " & LooperMax & " iterations"
Next
End Sub
This procedure calls each of the Min_Max procedures LooperMax (50) times, records the total time for 50 iterations, and repeats this BlockerMax (20) times for each procedure, before moving on to the next. The times are recorded in a text file using the WriteToLog procedure below.
Logging Procedure
Public Sub WriteToLog(sLogEntry As String)
' write information to a log file
Dim iFile As Integer
Dim sFileName As String
Const sLogFile As String = "TimeLogger"
sFileName = ThisWorkbook.Path & "\" & sLogFile & Format$(Now, "YYMMDD") & ".txt"
iFile = FreeFile
Open sFileName For Append As iFile
Print #iFile, Now; " "; sLogEntry
Close iFile
End Sub
This is a very handy procedure I use all the time for logging debug information or for saving settings. It uses good old VB I/O protocols, which are very fast, to populate a text file.
Test Results
The results of these experiments are shown in the following table. The approaches are sorted from slowest to fastest. Processing an array extracted from the source data range takes about 1.2 milliseconds, whether looping or using the worksheet functions. Processing the series values array takes about 0.7 milliseconds by either technique. Measuring the range directly takes less than 0.5 milliseconds.
50-iteration means of the 20 blocks for each approach are plotted in this chart, with means sorted from fastest to slowest to give a pseudo-distribution of procedure times.
This chart is stepped, because there seems to be a discrete quantum of time, that is, a resolution of about 0.078 milliseconds in the measurement of time using VBA’s Timer function on this microprocessor.
What These Results Mean
One school of thought is that looping an array would be slow compared to a single function call, while another is that any call from VBA to Excel (i.e., WorksheetFunction) would incur a performance hit. An unexpected result is that on this machine, it makes essentially no difference whether I use a loop or worksheet functions to find the minimum and maximum values in a series of 1000 points. I guess that’s why we occasionally bother to run these time trials.
We see that using the array of series values is faster than using the array of values from the source data range, by not quite a factor of two. Thinking of the VBA-Excel interface, we have to cross it once to get the series values, or twice to get the series formula and again to get the range values. This makes sense, and I have to admit, I would never have thought of getting the values from the source data range, since they are so easy to extract using series .Values.
The fastest method of all, which was proposed by Daniel Ferry (Excel Hero), is to parse the series formula to find the range, and directly measure the min and max of this range. Apparently the performance hit to cross the VBA-Excel barrier is offset by the much faster performance of worksheet functions operating on Excel objects directly within Excel.
Although the direct range object measurement approach is fastest, I will stick to my series values approach. The main reason is that not all series values come from a single range in a worksheet, or even from a worksheet range at all: some series use multiple area ranges, while others have values hard-coded as a literal array directly in the series formula.
This is a “normal” series formula (based on the first ten points of experimental data from our chart above):
=SERIES(Sheet1!$B$1,Sheet1!$A$2:$A$11,Sheet1!$B$2:$B$11,1)
In this formula, the series name is the argument in green, X values are in purple, Y values are in blue, and plot order is in red. These are the same colors Excel uses to highlight the data ranges of a selected series in the underlying worksheet (illustrated below).
This formula is for a series that uses multiple area ranges for X and Y values, the areas for X and Y enclosed in parentheses:
=SERIES(Sheet1!$B$1,(Sheet1!$A$2:$A$501,Sheet1!$A$502:$A$1001),
(Sheet1!$B$2:$B$501,Sheet1!$B$502:$B$1001),1)
Multiple area formulas are not too uncommon. Less common is a series whose formula has hard-coded arrays as series values (first ten points of experimental data converted to values using F9 key):
=SERIES("Y",{-0.172902780162854;0.0788276881579948;0.752718316150293;-1.15534740638389;
-0.651271029427549;1.61522977181724;0.218894823127643;-1.38264191435207;
0.107605524088065;0.281018951052876},
{-0.640448122430912;-0.355817827507841;-0.318480200209528;-0.0116013833433011;
-1.15185971606638;0.183605522997797;-0.376897171002227;-0.488810695788606;
-2.14375416629929;0.154502474308393},1)
There is a practical limit of around 250 characters (you’d guess 255, but it’s slightly less) each for the X values and Y values of a series formula. In the formula above, the ten X values consume 183 characters, the ten Y values 188. You can squeeze in more values by reducing the resolution of the values (e.g., -0.173 in place of -0.172902780162854), but that’s just nibbling around the margins. A practical limit is around 17 or 18 values before the formula crashes.
The arrays (within {curly braces} above) are delimited by semicolons in this case, because the original data came from a column, but data in rows would be delimited by commas.
These complications in the series formula lead to errors when using simple approaches to parse such formulas. John Walkenbach has developed A Class Module To Manipulate A Chart Series which can deal with such difficult situations.
Since accessing values using the .Values property is so easy, and since this approach is only about 50% slower than the risky approach of parsing the formula and measuring the range directly, that’s the approach I will stick with.
dermotb says
It’s worth looking at how you can improve things if you have a single data range, which is by far the most common situation.
You can look up a single data range directly, without the need to parse the chart formula, and this takes quite a bit of time off the last three options. It doesn’t change the order of results, but on my figures, Func-Range wins even more convincingly, and the array-based options are only 10% more than the series-based options – which means there’s one exceptional option, and then the rest. Surprising!
Jon Peltier says
Dermot –
In my case I didn’t presume to know the range ahead of time. Perhaps this is in a clean-up routine used to make existing charts ready for a report.
If your code is creating the chart, then you know what the range is, and can measure it during the generation of the chart. I decided there were two options for this: one if you know the address (“B2:B1001”), the other if you know the name assigned to the range (“dataY”).
The procedure to find min and max of a range with a known address is:
The procedure to find min and max of a range with a known name is:
The resulting table of statistics:
The corresponding chart:
As Dermot points out, testing the range directly without parsing the series formula is much faster. If you test the range by name, it’s almost five times faster than parsing the formula and testing the range. If you test by address, it’s more than eight times faster than parsing and testing, and almost twice as fast as testing the name.
Bob says
Hi Jon,
If you put on your chart busters hat, what would you say about the vertical axis?
Could you not put the units in milliseconds and drop the 0.00?
There is OCD and then there is OCD.
Great post and so informative.
Cheers,
B
Jon Peltier says
Bob – Thanks for pointing that out. I’m sure Naomi will be glad someone beat her to it :-)
J. Mansfield says
Jon,
I really enjoyed this post. In addition to the time trial, I found it very useful that you illustrated five VBA techniques to get at the same result. Illustrating multiple solutions always helps with the learning process. Thanks again for putting this together.
John Mansfield
Jon Peltier says
John –
Trying all these methods is also a good learning experience for me. Learning the results of the time trials is an added bonus.
mikezang says
Hi
I got a strange error when I use code as below.
The error is not on all time, if run again there is no any error.
I have no idea, can you give some suggestion?
Thanks a lot.
With .SeriesCollection(CHART_DATA_INDEX_TB)
.HasDataLabels = True
.ApplyDataLabels ShowValue:=False
For iRow = 1 To giPeriod
if (iRow mod 2 =0) then
With .Points(iRow)
.ApplyDataLabels ShowValue:=True ‘ here error
With .DataLabel
.Text = iRow
End With
End With
end if
Next
End With
Jon Peltier says
Mike –
If it is a line chart or XY chart, and a point is not plotted because the cell is blank or contains the #N/A error, then the code fails because it cannot apply a data label to a nonexistent point.
I don’t know what would only intermittently cause the error. The code is fine, except the two lines after With .SeriesCollection() are unnecessary.
mikezang says
Jon
Thanks for your answer, I got the reason as you said. I set a point value to empty because I don’t want plot 0.
I want to how to if there is any way to not plot a value 0 except to set empty?
mikezang says
Jon
I have another question about DataLabel position. if I use code as below, the position will look lilke
[code]
If (aRatio(iRow, 1) > 0) Then
.Position = xlLabelPositionAbove
Else
.Position = xlLabelPositionBelow
End If
[/code]
I tried to reset most right data label so that I add code as below, but data label is not at right position.
[code]
iTop = .Top
.Position = xlLabelPositionLeft
.Top = iTop
.Left = .Left + 10
[/code]
But if I run code above again, I got what I need as
Do you know any reason for this case?
Jon Peltier says
“set point value to empty” Do you mean you deleted the value in the cell?
If it’s a calculation, use
=IF(formula0,formula,NA())
which returns #N/A in the cell, which line and XY plots will not plot. It looks like bar and column charts also do not plot #N/A, but they plot them as zero.
Jon Peltier says
Mike –
What version of Excel?
Sometimes formatting secondary properties in Excel charts doesn’t work exactly as you’d expect. From Excel 97 through 2003 there were small improvements made each version. Excel 2007 has its own set of idiosyncrasies, which I deal with on a case by case basis (since I haven’t internalized them all yet).
mikezang says
Jon
I am using 2003.
My data is as below, because there are “#N/A(empty)” so that I couldn’t set data label.
Is it possible to set a value except “#N/A” and not plot it?
ChartTB ={#N/A;#N/A;10845;10894.5454559326;10944.0909118652;10993.6363677978;11043.1818237304;11092.727279663;11142.2727355957;11191.8181915283;11241.3636474609;11290.9091033935;11340.4545593261;11390;11335.4166679382;11280.8333358764;11226.2500038146};
Jon Peltier says
Mike –
You have two choices if you want to plot nothing instead of zero, you can use #N/A or you can completely clear all contents of the cell. To avoid an error you could test the value of a point:
mikezang says
Hi, Jon
Thanks for your advise, I will try them.
By the way, I am using a defined NAME range for SeriesCollection, I set Application.Calculation = xlCalculationManual for speed up, but the range will not be updated, so I can’t check the SeriesCollection value correctly, do you have any idea?
Jon Peltier says
Mike –
In order to check the value, you have to calculate the range. Otherwise, how will you know if you have to skip over a point? If you have screen updating turned off, the calculation will not be much of a performance hit.
Faheem says
I am struggling in findng max and min value in charts and format that particular datalabel with the help of condition or loop.
Please provide me code for this..!
Thanks
Jon Peltier says
Faheem – Try this article: Special Format for Minimum and Maximum.