Chart a Wide Range of Values

How do you chart a wide range of values? There are numerous solutions to this, each with pros and cons. I’m showing these in Excel charts, of course, but they apply no matter what charting package you may be using.

It’s easy enough to think of an example. The following two charts show monthly sales data. One month had very high sales, while the rest of the months had low sales. (I’ll be illustrating the concepts in this article using both line and column charts, because each has special considerations.

Chart A Wide Range Of Values

In both charts, the January sales value stands proud, while the other months are hiding in the weeds along the bottom of the chart.

Use a Logarithmic Scale

The first approach to chart a wide range of values was suggested in Logarithmic Scale In An Excel Chart, a tutorial on the MyExcelOnline Excel Blog. The My Excel Online web site is run by my colleague John Michaloudis, and it features lots of great tutorials, podcasts, free training, and paid courses. John’s point was well taken, but it was incomplete. It was a short and sweet review of the technique, but it left out the thought process that should accompany any charting effort.

It’s easy enough to apply a logarithmic scale to a numerical axis in an Excel chart. Select the axis, then press Ctrl+1 to open the Format Axis task pane (or dialog in Excel 2010 and earlier). In the same view that allows formatting of scale parameters like minimum and maximum, you should find a checkbox that says Logarithmic Scale. Check the box, and the scale is adjusted. You can also change the base of the log scale, but it’s usually best to keep it at 10.

Here are my two charts with log scales. The data I’m using here is very similar to the data in the MyExcelOnline post cited above.

Logarithmic Scales

Your first thought might be satisfaction that the data are now all shown with a similar magnitude. That’s wrong, of course, because they aren’t of similar magnitude. But that’s secondary, because we are really only comparing extreme values with very little in between.

The other point about the charts above is the scale limits. Excel likes to use zero as one endpoint of an axis, and 1 corresponds to zero on a log scale. However, this choice of scale minimum means that the bottom two-thirds of the chart is blank, all of the data occurs in the top.

We can reduce the white space by entering a better value for axis minimum.

Adjusted Logarithmic Scales

There is still a problem, especially with the bar chart. In a bar chart, our precognitive sense of the values is based on the lengths of the bars, and it’s hard to overcome that. When I look at the bar chart above, the short bars all look about half as tall as the long ones, so I get the mistaken impression that the smaller values are around half of the large value. (The first logarithmic column chart is even worse, since the shorter bars are 80% as tall as the long one.) The actual ratio of values is less than one-tenth.

The log scale also tends to wash out the variability in the shorter values. In the charts below, I show the previous log scale charts, and beside them a linear scale chart with the scale blown up so the mean of the linear scale and log scale are at about the same height. Of course the large value is way off scale, which I’ve tried to indicate by fading the top of the visible indication of the large value.

The variability in the small values is shown best using the linear scale.

Linear vs. Logarithmic Scales

I should mention a couple more points about logarithmic scales.

First, as should be clear above, there is no place for log scales in bar or column charts. Bar and column charts need to include zero in the axis scale, but you can’t include zero on a log scale, since log(0) is negative infinity.

Second, most audiences will not be able to properly appreciate a logarithmic scale. Sure, engineers, scientists, and quants may have a strong enough numerical sense to interpret them accurately. But general audiences, and even smart managers and executives, are likely to be misled by such numerical transformations. It’s best to stick to the linear scale that most people are comfortable with.

Show Part of the Data

The charts used above aren’t bad, showing that linear scale charts preserve variation in the data better than logarithmic charts. Maybe we don’t need to actually show the extreme values on the axis scale, if we just indicate that they’re way off scale.

I can adjust the scale a bit more, to center the smaller values in the chart, keeping the scale minimum at zero, and fading the top of the largest value. The large value is so much bigger anyway, maybe we don’t need to actually show it to indicate that it is much larger than the others. This fading is a bit tricky, but it’s important to indicate that the data point isn’t just at the top of the chart, but instead extends far beyond.

Linear Scales Better Than Logarithmic Scales

 

Break the Axis Scale

A common approach to chart a wide range of values is to break the axis, plotting small numbers below the break and large numbers above the break. An advantage here is that it generally uses a linear scale. A disadvantage is that it distorts data, and doesn’t really give a sense for the differences in value on either side of the break.

Here are the two original charts, with a break in the vertical axis scale.

Broken Axis Scales

Bravo, you got in all the data points. But despite the visual cues that the axis has broken, there is still a strong tendency to mentally interpolate the values: those short bars still look half as tall as the large one, and the faded center and gap in the axis tell my conscious mind but not my precognitive mind that my first impression is wrong.

Sure, it’s easy enough to read the value off the axis scale, corresponding to the data point. But if you have to do that much work, what’s the point of a chart? You might as well just read the data from the cells in the worksheet.

Another drawback to breaking an axis is that it’s hard. You need to hide the real axis, construct two parts of a fake axis with a combination chart and data labels, and change at least some of the data you’re plotting. Nobody understands how to make these charts, and nobody understands the output anyway. So why bother.

The charts in the Show Part of the Data section are more effective.

Use Multiple Charts, or Make a Panel Chart

A lot of people are obsessed with getting all of their data into exactly one chart. Sometimes this is fine, but as we’ve seen above, sometimes charts with all of the data are not very easy to interpret without distorting the relationships within the data.

What’s wrong if we use two charts? We can show the whole range of data while highlighting the larger values (see the first chart of this article), then add the chart from the Show Part of the Data section which highlights the smaller values.

Two Charts to Show The Data

There are some redundant chart elements, so let’s hide the category labels in the top charts and the titles in the bottom charts.

Two Charts Beat as One

From here it’s not a big stretch to combine both charts. Plot the original data twice, once each on the primary and secondary axes, do some heavy axis formatting, and voila. Here are panel charts, where one panel shows the full extent of the data, and the other zooms in on the smaller data.

Panel Charts

In my tutorial Broken Y Axis in an Excel Chart, I explain some of the shortcomings of a broken axis scale, and I give step-by-step instructions for creating exactly this kind of panel chart. Panel charts are a little more complicated to build and maintain, but using a single chart helps with alignment and other formatting.

Panel charts are among my favorite ways to show this kind of data. But we aren’t done yet with alternatives.

Pareto Charts

A Pareto chart is a combination chart that combines a column chart (sorted from largest value to smallest), like the one that led off this article, with a line chart showing the cumulative total.

Here are two versions of a Pareto chart for this data. The Pareto one on the left is sorted by value. The one on the right is sorted by month, so the cumulative line doubles as the cumulative YTD sales.

Pareto Charts

Below is a combination of Pareto chart and waterfall chart, which I call a “Floating Pareto” chart. The bars show the incremental values and the increasing additive value. The one on the left is sorted by value, while the one on the right is sorted by month, again showing YTD sales.

Floating Pareto Charts

First and only shameless plug of the entire article: the Advanced Edition of Peltier Tech Charts for Excel 3.0 includes Pareto and Floating Pareto charts. Check it out.

Think Further Out of the Box

You always have to keep in mind why you need to plot this data. If you want to say that our sales team kicked ass in January, then the sales director left for a competitor and took with him his account book and his entire staff, you can even get away with a pie chart.

A Pie Chart?

Sure, you’re thinking, “A PIE chart? Wut? Has he lost his mind? Pie charts suck.” They’re not good at showing numerical data, and blah blah, yada yada, etc etc etc.

Well, I’ve softened a bit in my stance on pie charts, in part because of some actual research into their effectiveness by Robert Kosara of EagarEyes and Tableau (go read An Illustrated Tour of the Pie Chart Study Results, and read the papers that post links to). Seriously, the chart above practically shouts, “What the hell changed between January 31 and February 1?” There really seems little point in examining the month to month variation from February on.

If you want something slightly more quantitative, you can make yourself a nice little stacked column chart. The chart below clearly says that January sales were enormous, more than the rest of the year combined.

A Pie Chart?

Why do you want to chart a wide range of values?

As with any charting exercise, you need to ask some questions. What am I trying to learn from this chart? Who am I making this chart for? What message am I trying to give them? What’s on Netflix tonight?

If your purpose is a quick overview, then something like the Pie Chart or Stacked Column Chart might be all you need. If you’re trying to show something in more detail, the Pareto Chart, the Panel Chart, or a chart from the Show Part of the Data section may be more appropriate. If it’s detailed engineering data or scientific model predictions and your audience is highly literate in mathematics, then consider Logarithmic Scale charts.

 

Peltier Tech Chart Utility

Excel Plotted My Bar Chart Upside-Down

Here’s a problem that I’ve heard people ask (and complain) about. They’ll have data in their workbook, neatly sorted from top to bottom. When they make a bar chart, the sorting is reversed, with the Bottom data appearing at the top of the chart. Not only that, but the series are in backwards order too. If they sort the data in reverse order, the chart’s categories look right, but now the worksheet is upside-down, and the series are still in the wrong order.

Problem: Bar Chart's Data is in Reverse Order

The reason for this arrangement is logical, once you figure it out. To illustrate, let’s start with this data. Yes, the data in column A is mixed up. Instead of 1-2-3-4, it goes 1-3-2-4, to help illustrate differences between XY and Line charts at the same time.

Sample Data

Here is an XY chart made from this data. Looking at Series 1, we see it starts at X=1, proceeds to X=3, then X=2, and finally X=4. No surprise, since it’s following the data, which as I pointed out above, is in mixed up order.

Series and Point Position for XY Chart

Big deal, no surprises. But let’s look at the chart axes for a moment, and review 4th grade math.

By default, the two axes will be located along the bottom and left edges of the chart. Where the axes intersect is called the origin, and it is where both axes have their minimum values (we’re dealing with non-negative data). Lower values are plotted closer to the origin, and higher values further away.

Let’s make a Line chart with the same data. It looks different from the XY chart. No, the series are formatted the same, with markers and lines. But check out the horizontal axis. The numbers are not sorted numerically, they are listed in the order they appear in the worksheet, as if they have no numerical value. In fact, in line, column, area, and bar charts, Excel treats X values as non-numeric labels. (Unless the X values are dates, and I’ll cover that another time.)

Series and Point Position for Line Chart

Note that the axes still cross at the origin in the lower left of the chart. Lower Y values are closer to the origin, and X values encountered earlier in the worksheet are closer to the origin.

Okay, easy enough. In a line chart, Excel lists the X axis labels in the same order as in the worksheet, and the origin is at the bottom left of the chart.

Same with a column chart. Note in the column chart that the order that series are clustered is also the same order as in the worksheet.

Series and Point Position for Column Chart

Now let’s make a problematic bar chart.  1 is in the top of the worksheet range, but at the bottom of the axis. But remember, the origin is at the bottom left, and the lowest values and first labels are located closest to the origin. So the “1” label on the vertical axis is closest to the origin, that is, lowest in the chart, even though it was highest in the worksheet. The series are also in the order they are because series 1 is closest to the origin.

Series and Point Position for Bar Chart

If you really need the chart to be arranged the other way, it’s a simple two-step fix. First, format the vertical axis, and check the box for “Categories in reverse order”.

Format Axis - Categories in Reverse Order

See, now the top label in the worksheet is also the top label in the chart. Reversing the categories has moved the origin to the top left of the chart, so the bottom axis is now at the top of the chart. Note also that the series are now listed in the expected order.

Series and Point Position for Bar Chart with Reversed Category Order

If you want the horizontal axis back at its customary bottom position, format the vertical axis again. (You can do this step at the same time as the earlier Format Axis step.)  Select the option button for “Horizontal axis crosses: At maximum position.”

Format Axis - Axis Crosses at Maximum Category

Since the vertical axis maximum is now at the bottom, that’s where the horizontal axis appears.

Series and Point Position for Bar Chart with Reversed Category Order and Axis Crossing at Max

Now we know enough to go back and fix the original chart. Format the vertical axis, reverse the category order and make the horizontal axis cross at the maximum.

Problem Solved: Bar Chart's Data is in Expected Order

Top is Top, Bottom is Bottom, series are plotted in 1-2-3 order, and we can finally rest easy.

 

Peltier Tech Chart Utility

John Walkenbach’s Lime Crop

Excel author extroardinaire and amateur banjoist John Walkenbach describes this year’s Lime Crop this weekend in The J-Walk Blog. I’ve reproduced John’s chart below.

Bar Chart of John Walkenbach's 2009 Lime Crop

In a completely unwarranted statement, John snidely wondered how long until I criticized his chart.

Man. How did I deserve that… Oh, yeah.

Well, he asked for it.

Here’s what I did like about John’s chart:

  • He used PNG format for the image file.
  • The chart clearly shows most of the data.
  • The use of lime graphics was not overboard, in fact, one lime per lime is an appropriate  scaling factor.

Here’s where John’s chart needs improvement:

  • His ink to data ratio is way too high: look at that distracting shadow.
  • His vertical axis title is tilted 90°, making it hard to read.
  • The zero limes harvested in 2007 and 2008 are not clearly shown.
  • Time series data is usually more effective on a line chart.

I’ve made the appropriate changes in the following chart.

Line Chart of John Walkenbach's 2009 Lime Crop

I guess we should call this a Lime Chart.

 

Peltier Tech Chart Utility

9 Steps to Simpler Chart Formatting

A complaint about many charts in general, and Excel charts specifically, is that they look awful and are hard to understand. However, you have the power to make your charts clear and clean and easy to read. The way to improve your chart is to remove clutter and reduce the amount of ink used to print the chart. To do this you must keep asking: does the chart need this feature, and if so, does it need to stick out so strongly?

By “Simpler Chart Formatting”, I don’t necessarily mean easier ways to format the chart. Instead, I mean ways to make the formatting of a chart simpler, less cluttered, and easier to understand. The techniques are also pretty easy. At first, the hard part is just remembering to carry them out.

As we all know, the Excel defaults are pretty ugly. They are somewhat better in Excel 2007, but I still never use the defaults. Aside from this set of images, I will stick to “Classic” Excel, that is, Excel 2003 and earlier, but the general concepts hold true for any version of Excel, and any graphics package.

Default Excel Column Chart     Default Excel XY Chart
Default chart formats in Classic Excel (2003 and earlier).

Default Excel Column Chart     Default Excel XY Chart
Default chart formatting in Excel 2007.

1. Backgrounds

The default charts in Excel 2003 and earlier feature a dull gray background (British and Canadian versions use dull grey instead). The first step to cleaning up your chart is to change this ugly background to white.

Excel Column Chart     Excel XY Chart
Removing the muddy background is a major improvement.

2. Borders

How important is the border around the plot area, around the legend, or around the entire chart? The legend never needs a border. Other borders can often be removed entirely, or at least lightened.

Excel Column Chart     Excel XY Chart
Removing the outer border and the legend border cleans up the charts.

Excel Column Chart     Excel XY Chart
Removing the plot area border and gridlines makes the charts simpler.

Sometimes I use a light gray chart area border on embedded charts, so the chart border matches the gridlines between cells in the worksheet.

Excel XY Chart

Sometimes a light border is okay.

3. Gridlines

If your chart uses gridlines, they should be the lightest features on your chart. The custom palette I use for my own work has a special gridline gray several shades lighter than the 25% gray on the default palette. But often gridlines are not even needed.

Excel Column Chart with Dark Gridlines     Excel Column Chart with Light Gridlines
Above: Dark gridlines (left) and light gridlines (right).
Below: No gridlines and no plot area border.

Excel Column Chart with No Gridlines

4. Axes

De-emphasize the axis lines by using a light or medium gray instead of black. You could even use medium gray for the axis tick text, but I usually stick with black text. For a category X axis, you can eliminate tick marks.

Excel Column Chart     Excel Column Chart

If you are using gridlines, you may be able to remove the corresponding axis line, leaving only the tick labels.

Excel XY Chart

You can often unclutter a chart by using fewer tick mark labels. As pointed out in the comments, keeping the unlabeled tick marks is useful.

Excel XY Chart     Excel XY Chart

5. Number Formatting

Use custom number formats to reduce the complexity of axis labels and other text in the chart.

Excel XY Chart     Excel XY Chart

Above left: original cluttered axis labels.
Above right: remove unnecessary ‘cents’.
Below: replace thousands by ‘K’ suffix.

Excel XY Chart

6. Chart Types

Replace every 3D chart with the corresponding 2D type.

Excel 3D Column Chart     Excel Column Chart

Only use pie charts if there are three or fewer wedges. Use column or bar charts for more categories.

Excel Pie Chart     Excel Pie Chart
Cluttered pie (above left) and simpler pie (above right).

Excel Column Chart     Excel Bar Chart
Unclutter a pie chart’s data by using a column or bar chart.

Use column charts to show values for discrete categories. Use line charts to trends over time.

Excel Column Chart to Show Values for Categories     Excel Line Chart to Show Trends in Values

7. Series Formatting

For column, bar, and area charts, remove the black outline. Use light to medium colors for fills, and don’t use patterns or gradients. Use darker, saturated colors for lines and markers. Use colors with appropriate contrast against the background and compared to other series colors. Avoid combinations that react with each other.

Too gaudy (left) and too faint (right).

Excel Column Chart     Excel Column Chart

Better color combinations.

Excel Column Chart

Excel Area Chart     Excel X& Chart

8. Horizontal Text

Vertical text is difficult to read, and inclined text on a monitor is bad because of distortion of the characters.

Excel Column Chart     Excel Column Chart

In a column chart, Excel will leave out some tick labels if they would overlap, but this leaves the readers guessing about the missing labels.

Excel Column Chart     Excel Column Chart

Turn the whole chart on its side.

Excel Bar Chart

9. Labels, not Legends

It is easier for a reader to identify series if they are directly labeled. Using a font color that matches the series formatting is also helpful. A legend takes up valuable space and makes the reader divert attention back and forth.

Excel Pie Chart     Excel Pie Chart

Top left: Ineffective Legend. Top right, bottom left and right: Effective Labels.

Excel Column Chart     Excel XY Chart

 

Peltier Tech Chart Utility

Peltier Tech Charts for Excel 3.0

 

Create Excel dashboards quickly with Plug-N-Play reports.