In Get a Maximum and Minimum Value from Certain Charts, John Mansfield showed a simple routine he uses to find the maximum and minimum Y values in a chart. In the ensuing discussion, alternatives were suggested with various justifications. I wanted to say that this one was faster than that one, but I realized that I didn’t know what was fastest. So I decided a series of time trials were called for.
Experimental Design
There are two ways to find the minimum and maximum of a set of values:
- Loop through the values, and keep track of the largest and smallest you’ve encountered
- Use the worksheet functions MIN() and MAX() (i.e., WorksheetFunction.Min and .Max)
There are three ways to obtain the values for each series:
- Use the .Values property of the series, which returns a 1-D array
- Parse the series formula to identify the range, then get the .Value of this range object, which returns a 2-D array
- Parse the series formula to identify the range, and analyze the range itself
This suggests a 2×3 matrix, but I eliminated looping cell-by-cell through the range to find the largest and smallest values, since experience shows that looping through the cells of a range in this way is much slower than using the .Value property of the range to generate an array, and working with this array.
The five-block test matrix becomes:
- Looping Over Series Values
- Worksheet Function on Series Values
- Looping Over Range Values
- Worksheet Function on Range Values
- Worksheet Function on Range Object
I wrote VBA procedures to perform each of these approaches. Then I wrote a master procedure that called each of these procedures 1000 times, 50 times each in 20 blocks, so I’d have the mean time for each block. The overall mean and standard deviation of the block means can be used to compare the different approaches.
I used a simple XY chart with 1000 points. Both X and Y values were taken from the standard normal distribution. The procedures below could be adapted to check the Y values of any chart and the X values of an XY chart or a line/column/bar/area chart with a date scale X axis. Here the X values don’t matter since I was merely sampling Y.
I ran the tests in Excel 2003 SP3 on a reasonably modern laptop, running Windows 7 Ultimate with 4GB of memory on a 2.20 GHz dual-core AMD processor.
VBA Procedures
Looping Over Series Values
Sub Max_Min_Loop_Values()
Dim Cht As Chart
Dim Srs As Series
Dim i As Long
Dim A As Variant
Dim MinVal As Double
Dim MaxVal As Double
Set Cht = ActiveSheet.ChartObjects(1).Chart
Set Srs = Cht.SeriesCollection(1)
A = Srs.Values
MaxVal = A(LBound(A, 1))
MinVal = A(LBound(A, 1))
For i = LBound(A, 1) + 1 To UBound(A, 1)
If A(i) > MaxVal Then MaxVal = A(i)
If A(i) < MinVal Then MinVal = A(i)
Next i
End Sub
Assuming there is one chart on the active worksheet, and one series in the chart, we set the chart and the series simply. We assign the Y values of the series to an array, and initially set the minimum and maximum values to the first value in the array. Then we loop from the second to the last values in the array, and if a given value is larger than the maximum value so far, we set the new maximum to the current value, and likewise for the minimum. I didn’t output the minimum and maximum values, all I wanted was the time to run the core of the procedure.
Worksheet Function on Series Values
Sub Max_Min_Wkfn_Values()
Dim Cht As Chart
Dim Srs As Series
Dim A As Variant
Dim MinVal As Double
Dim MaxVal As Double
Set Cht = ActiveSheet.ChartObjects(1).Chart
Set Srs = Cht.SeriesCollection(1)
A = Srs.Values
With WorksheetFunction
MaxVal = .Max(A)
MinVal = .Min(A)
End With
End Sub
We set the chart and the series simply as above. We assign the Y values of the series to an array, and use the worksheet functions Min() and Max() to determine the minimum and maximum values.
Looping Over Range Values
Sub Max_Min_Loop_RangeArray()
Dim Cht As Chart
Dim Srs As Series
Dim i As Long
Dim A As Variant
Dim MinVal As Double
Dim MaxVal As Double
Dim sSeriesFmla As String
Set Cht = ActiveSheet.ChartObjects(1).Chart
Set Srs = Cht.SeriesCollection(1)
sSeriesFmla = Split(Srs.Formula, ",")(2)
A = Range(sSeriesFmla).Value
MaxVal = A(LBound(A, 1), 1)
MinVal = A(LBound(A, 1), 1)
For i = LBound(A, 1) + 1 To UBound(A, 1)
If A(i, 1) > MaxVal Then MaxVal = A(i, 1)
If A(i, 1) < MinVal Then MinVal = A(i, 1)
Next i
End Sub
We set the chart and the series simply as above. We parse the formula of the series to find the range, use the .Value property of the range to define an array, and initially set the minimum and maximum values to the first value in the array. Then we loop from the second to the last values in the array, and if a given value is larger than the maximum value so far, we set the new maximum to the current value, and likewise for the minimum.
Worksheet Function on Range Values
Sub Max_Min_Wkfn_RangeArray()
Dim Cht As Chart
Dim Srs As Series
Dim A As Variant
Dim MinVal As Double
Dim MaxVal As Double
Dim sSeriesFmla As String
Set Cht = ActiveSheet.ChartObjects(1).Chart
Set Srs = Cht.SeriesCollection(1)
sSeriesFmla = Split(Srs.Formula, ",")(2)
A = Range(sSeriesFmla).Value
With WorksheetFunction
MaxVal = .Max(A)
MinVal = .Min(A)
End With
End Sub
We set the chart and the series simply as above. We parse the formula of the series to find the range, use the .Value property of the range to define an array, and use the worksheet functions Min() and Max() to determine the minimum and maximum values.
Worksheet Function on Range Object
Sub Max_Min_Wkfn_Range()
Dim Cht As Chart
Dim Srs As Series
Dim rng As Range
Dim MinVal As Double
Dim MaxVal As Double
Dim sSeriesFmla As String
Set Cht = ActiveSheet.ChartObjects(1).Chart
Set Srs = Cht.SeriesCollection(1)
sSeriesFmla = Split(Srs.Formula, ",")(2)
Set rng = Range(sSeriesFmla)
With WorksheetFunction
MaxVal = .Max(rng)
MinVal = .Min(rng)
End With
End Sub
We set the chart and the series simply as above. We parse the formula of the series to find the range, and use the Worksheet Functions Min() and Max() to determine the minimum and maximum values of the range object directly.
Timing Procedure
Sub TestTimes()
Dim Looper As Long
Dim Blocker As Long
Dim tStart As Double
Dim tTotal As Double
Const LooperMax As Long = 50
Const BlockerMax As Long = 20
For Blocker = 1 To BlockerMax
tStart = Timer
For Looper = 1 To LooperMax
Max_Min_Loop_Values
Next
tTotal = Timer - tStart
WriteToLog "Loop - Values: " & tTotal & " sec for " & LooperMax & " iterations"
Next
For Blocker = 1 To BlockerMax
tStart = Timer
For Looper = 1 To LooperMax
Max_Min_Wkfn_Values
Next
tTotal = Timer - tStart
WriteToLog "Func - Values: " & tTotal & " sec for " & LooperMax & " iterations"
Next
For Blocker = 1 To BlockerMax
tStart = Timer
For Looper = 1 To LooperMax
Max_Min_Loop_RangeArray
Next
tTotal = Timer - tStart
WriteToLog "Loop - RngArr: " & tTotal & " sec for " & LooperMax & " iterations"
Next
For Blocker = 1 To BlockerMax
tStart = Timer
For Looper = 1 To LooperMax
Max_Min_Wkfn_RangeArray
Next
tTotal = Timer - tStart
WriteToLog "Func - RngArr: " & tTotal & " sec for " & LooperMax & " iterations"
Next
For Blocker = 1 To BlockerMax
tStart = Timer
For Looper = 1 To LooperMax
Max_Min_Wkfn_Range
Next
tTotal = Timer - tStart
WriteToLog "Func - Range: " & tTotal & " sec for " & LooperMax & " iterations"
Next
End Sub
This procedure calls each of the Min_Max procedures LooperMax (50) times, records the total time for 50 iterations, and repeats this BlockerMax (20) times for each procedure, before moving on to the next. The times are recorded in a text file using the WriteToLog procedure below.
Logging Procedure
Public Sub WriteToLog(sLogEntry As String)
' write information to a log file
Dim iFile As Integer
Dim sFileName As String
Const sLogFile As String = "TimeLogger"
sFileName = ThisWorkbook.Path & "\" & sLogFile & Format$(Now, "YYMMDD") & ".txt"
iFile = FreeFile
Open sFileName For Append As iFile
Print #iFile, Now; " "; sLogEntry
Close iFile
End Sub
This is a very handy procedure I use all the time for logging debug information or for saving settings. It uses good old VB I/O protocols, which are very fast, to populate a text file.
Test Results
The results of these experiments are shown in the following table. The approaches are sorted from slowest to fastest. Processing an array extracted from the source data range takes about 1.2 milliseconds, whether looping or using the worksheet functions. Processing the series values array takes about 0.7 milliseconds by either technique. Measuring the range directly takes less than 0.5 milliseconds.
50-iteration means of the 20 blocks for each approach are plotted in this chart, with means sorted from fastest to slowest to give a pseudo-distribution of procedure times.
This chart is stepped, because there seems to be a discrete quantum of time, that is, a resolution of about 0.078 milliseconds in the measurement of time using VBA’s Timer function on this microprocessor.
What These Results Mean
One school of thought is that looping an array would be slow compared to a single function call, while another is that any call from VBA to Excel (i.e., WorksheetFunction) would incur a performance hit. An unexpected result is that on this machine, it makes essentially no difference whether I use a loop or worksheet functions to find the minimum and maximum values in a series of 1000 points. I guess that’s why we occasionally bother to run these time trials.
We see that using the array of series values is faster than using the array of values from the source data range, by not quite a factor of two. Thinking of the VBA-Excel interface, we have to cross it once to get the series values, or twice to get the series formula and again to get the range values. This makes sense, and I have to admit, I would never have thought of getting the values from the source data range, since they are so easy to extract using series .Values.
The fastest method of all, which was proposed by Daniel Ferry (Excel Hero), is to parse the series formula to find the range, and directly measure the min and max of this range. Apparently the performance hit to cross the VBA-Excel barrier is offset by the much faster performance of worksheet functions operating on Excel objects directly within Excel.
Although the direct range object measurement approach is fastest, I will stick to my series values approach. The main reason is that not all series values come from a single range in a worksheet, or even from a worksheet range at all: some series use multiple area ranges, while others have values hard-coded as a literal array directly in the series formula.
This is a “normal” series formula (based on the first ten points of experimental data from our chart above):
=SERIES(Sheet1!$B$1,Sheet1!$A$2:$A$11,Sheet1!$B$2:$B$11,1)
In this formula, the series name is the argument in green, X values are in purple, Y values are in blue, and plot order is in red. These are the same colors Excel uses to highlight the data ranges of a selected series in the underlying worksheet (illustrated below).
This formula is for a series that uses multiple area ranges for X and Y values, the areas for X and Y enclosed in parentheses:
=SERIES(Sheet1!$B$1,(Sheet1!$A$2:$A$501,Sheet1!$A$502:$A$1001),
(Sheet1!$B$2:$B$501,Sheet1!$B$502:$B$1001),1)
Multiple area formulas are not too uncommon. Less common is a series whose formula has hard-coded arrays as series values (first ten points of experimental data converted to values using F9 key):
=SERIES("Y",{-0.172902780162854;0.0788276881579948;0.752718316150293;-1.15534740638389;
-0.651271029427549;1.61522977181724;0.218894823127643;-1.38264191435207;
0.107605524088065;0.281018951052876},
{-0.640448122430912;-0.355817827507841;-0.318480200209528;-0.0116013833433011;
-1.15185971606638;0.183605522997797;-0.376897171002227;-0.488810695788606;
-2.14375416629929;0.154502474308393},1)
There is a practical limit of around 250 characters (you’d guess 255, but it’s slightly less) each for the X values and Y values of a series formula. In the formula above, the ten X values consume 183 characters, the ten Y values 188. You can squeeze in more values by reducing the resolution of the values (e.g., -0.173 in place of -0.172902780162854), but that’s just nibbling around the margins. A practical limit is around 17 or 18 values before the formula crashes.
The arrays (within {curly braces} above) are delimited by semicolons in this case, because the original data came from a column, but data in rows would be delimited by commas.
These complications in the series formula lead to errors when using simple approaches to parse such formulas. John Walkenbach has developed A Class Module To Manipulate A Chart Series which can deal with such difficult situations.
Since accessing values using the .Values property is so easy, and since this approach is only about 50% slower than the risky approach of parsing the formula and measuring the range directly, that’s the approach I will stick with.