## Heat Map Donut Charts

My colleague Debra Dalgleish steered me toward an article about a ‘Hot doughnut’ chart in Excel.

Hmm, very interesting. Eye-catching.

You can take the above and combine it with target values in another concentric ring, add a few labels, and make it really pretty. This is from a companion article, How to create a heatmap doughnut chart.

Despite its attractiveness, at first glance I didn’t think it was very effective. You know, donut charts being even less effective than pie charts. But I sat down and went through at least the preliminary steps of recreating the chart.

Note: I apologize for the use of jpeg images. On one of my monitors they look absolutely horrendous, with terrible artifacts everywhere, but on the new monitor they’re okay. I normally use png images for my charts, but some of the images in this article were only available as jpegs.

## How to Make a Heat Map Donut Chart

Here is the data. It looks unsorted, but I’ll describe the unusual sorting order shortly.

Make a nice donut chart (as if there ever were such a thing).

Recolor the wedges based on value (red at the large end of the values, through orange, yellow, and green, to blue at the small end).
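A minimal sketch of this recoloring step in Python, assuming equal-width bins between the smallest and largest value. The palette names, the `color_for` helper, and the sample values are all hypothetical; the original article may well have assigned its colors differently.

```python
# Hypothetical five-step palette, ordered from small values to large ones.
PALETTE = ["blue", "green", "yellow", "orange", "red"]

def color_for(value, lo, hi, palette=PALETTE):
    """Return the palette entry for value on an equal-width scale from lo to hi."""
    if hi == lo:
        return palette[0]
    fraction = (value - lo) / (hi - lo)
    # Scale to a bin index and clamp so that value == hi lands in the last bin.
    index = min(int(fraction * len(palette)), len(palette) - 1)
    return palette[index]

values = [12, 47, 30, 88, 65]  # made-up data, not the article's
lo, hi = min(values), max(values)
colors = [color_for(v, lo, hi) for v in values]
```

With these made-up values the smallest point comes out blue and the largest red, matching the scale described above.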

Perhaps we need another legend to clarify the sequence of the color codes?

Remove any size data for the slices, using 1 for each data point’s value. Let’s assume we don’t need sizes, since the colors are encoding the values.

In the previous recolored charts I kept a thin white border on the wedges, so adjacent wedges of the same color don’t just look like one larger wedge. In this chart, such adjacent wedges merge into a single wedge.

Now smudge the colors between the centers of adjacent wedges. I didn’t actually do this; below is a screen shot from the original article. The approach I’d take is to divide the wedge into a number of smaller wedges, and gradually change each mini-wedge’s color to simulate a gradient from the center of one wedge to the center of the next. Start with all blue, change to mostly blue plus a little green, then to still mostly blue plus more green, to mixed blue-green, to green with some blue, to green with just a little blue, to all green.
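The mini-wedge idea can be sketched as a linear blend between two RGB colors. This is only an illustration of the approach described above, not the method used in the original article; the function names and the pure blue and green RGB values are assumptions.

```python
def lerp_color(start, end, t):
    """Blend two RGB tuples; t=0 gives start, t=1 gives end."""
    return tuple(round(s + (e - s) * t) for s, e in zip(start, end))

def gradient_steps(start, end, steps):
    """Colors for `steps` mini-wedges running from start to end inclusive."""
    if steps < 2:
        return [start]
    return [lerp_color(start, end, i / (steps - 1)) for i in range(steps)]

BLUE = (0, 0, 255)   # assumed RGB for the blue end
GREEN = (0, 255, 0)  # assumed RGB for the green end

# Seven mini-wedges: all blue, progressively more green, then all green.
mini_wedges = gradient_steps(BLUE, GREEN, 7)
```

Each mini-wedge would get a size of 1, like the full wedges, and the colors in `mini_wedges` would be applied in order from the center of one wedge to the center of the next.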

This chart still needs labels for the wedges, and probably a data table so you can see the values which are obscured by the artistic effects.

The last few charts illustrate the unusual sorting order, which took me a while to notice. The smallest point (blue) is at the top and the largest (red) at the bottom. Some of the points go clockwise from the smallest to the largest, and the others go counterclockwise. If you start in one place, the values go from small to large and back to small, like a sine wave. This provides two “continuous” color paths, so that smearing of colors between one wedge and another doesn’t introduce an intermediate color from the scale.
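One way to produce a rise-and-fall order like this, sketched in Python. Splitting the sorted values alternately between the two sides is an assumption made for illustration; the original chart may have divided its points differently.

```python
def rise_and_fall(values):
    """Order values so they climb to the maximum and then descend again."""
    ordered = sorted(values)
    clockwise = ordered[0::2]         # smallest value plus every second one after it
    counterclockwise = ordered[1::2]  # the values skipped above
    # Walk up one side of the ring, then back down the other side.
    return clockwise + counterclockwise[::-1]

ring = rise_and_fall([5, 1, 9, 3, 7, 2, 8])  # made-up values
```

Plotted clockwise from the top, `ring` starts at the smallest value, reaches the largest about halfway around, and returns to small values next to the starting wedge, so every pair of neighbors is close on the color scale.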

## What Makes the Heat Map Donut Chart Ineffective?

Before diving into this critique, I want to point out that it is important to experiment with visual techniques. We should display our data using a variety of existing approaches to tease insights from the data. We should also apply new methods that may make it easier to find certain patterns or make the data more approachable by a wider audience.

However, we also need to review our attempts honestly, so we can concentrate on approaches that work and shelve those that do not.

There are a number of features of the heat map donut chart that make it ineffective as a data display method.

First, the value of each wedge is only encoded by the color in the very center of the wedge, that is, along the spoke that would connect it to the hub of the chart. Gradients in color generally indicate variations in data, but in this case the gradients are gratuitous artwork. Worse than that, the reader may be fooled into thinking there is real data in the spaces between spokes.

### Effectiveness of Encoding Techniques

A more fundamental problem is illustrated by Figure 2 of Presentation Graphics by Leland Wilkinson. This figure shows William Cleveland’s ranking of different graphical features in terms of how effective they are for encoding and decoding data.

The heat map uses color to encode values. Cleveland’s hierarchy of graphical elements lists color as the least effective encoding means. Color can be effective to indicate different categories (for example, different lines in a chart), but it is not a good choice for displaying continuously variable numerical data.

### Color Vision

Another reason color is a poor choice is that an estimated 8% of the male population (and only about 0.4% of the female population) find it difficult or impossible to distinguish between certain colors. A companion article, Color Vision Issues with Heat Map Donut Charts, uses these heat map donut charts to investigate how color vision deficiencies interfere with color-based data encoding.

## Chart Busters: Fix the Heat Map Donut Chart

No critique of a graphical display is complete without a description of one or more improved ways to display the same data. My improvements are shown in Chart Busters: Fix the Heat Map Donut Chart.

## Chart Busters: The Economist Doesn’t Read Forbes

The Economist showed changing pre-tax profits among banks from 2007 to 2011 in Bank Profits Head East. They chose to use a pair of donut charts for this. The main weakness of this approach is the separation of the pairs of values into distinct donuts. This forces the reader to jump from side to side, and ultimately to skip the charts and read the values in the labels. The combined chart has leader lines to help steer the reader’s eyes from side to side, but these add clutter, and the labels push the donuts further apart, making visual comparisons more difficult.

Who you gonna call? Chart Busters!

In Arrow Charts and Other Alternatives to Multiple Pie Charts on the Forbes magazine web site, Naomi Robbins introduced Arrow Charts as a replacement for double pie charts (and double donuts are at least as bad). I wrote a tutorial on my blog that showed How to Make Arrow Charts in Excel. The technique takes a bit of work, but once you’ve made one arrow chart, you can use it as a template for new values.

I took the example from my arrow chart tutorial and swapped in the Economist’s values:

The first thing I learned from this arrow chart, which I missed in the double donut, is that most regions showed little change, but two regions showed major changes: Asia Pacific gained a huge percentage while Western Europe lost a similar amount. This is a great example of the effectiveness of arrow charts.

## Charting NBC Olympic Coverage

My friend Jimmy of Code for Excel and Outlook twittered about a chart on TechCrunch that showed the general online assessment of the Olympic coverage by NBC. How We Hate NBC’s Olympics Coverage: A Statistical Breakdown shows an analysis of nearly 20,000 tweets and 5,700 blog posts. The highlight of this analysis is the following chart:

Actually, this isn’t the original chart; theirs was substantially larger. Since theirs was obviously constructed in Excel 2007, I transcribed their data and built my own. Though smaller, this chart lacks none of the features of the original.

Jimmy made some remark in his tweet about the “worst chart ever”, but I have to say, this chart is not even close to the worst ever. It may be in the bottom quartile, but we’re talking about a long, long tail.

## So what’s wrong with the chart, anyway?

We’ve heard many times about how people just aren’t very good at judging angles or areas, and that makes pies ineffective for all but the simplest parts-of-a-whole displays. Donuts take this one step further, cutting out the central bit of the pie, so we’re relying on areas alone, without any help from the angles where all the wedges meet.

Of course, the donut resembles the big fat zero most viewers would give NBC as their grade.

Let’s improve this chart in steps. First, if we sort the data points, we only have to try comparing adjacent points. This also puts the most biting criticism of a sports event, “Not Enough Sports”, right up front.

Now we can put the munchkin back into the donut. We have the angles to help us judge the areas, which may or may not help.

Finally, we can convert the pie pieces into candy bars.

Now the labels are right next to the data points, not off in some distant legend, and the bars are easily ranked by length. The above chart may cause confusion with its multicolored bars, and we don’t want any viewers hunting around for the key to find meaning in the bar colors where none exists, so we use a single color for the bars. Or in this case, two colors. I’ve highlighted “Happily Watching” in a distinct color to set this category off from all the negative ones.

The above chart has two sets of labels. There is the horizontal axis at the top of the chart and the labels at the end of each bar. Is it redundant to have both in the same chart?

We can choose to leave data labels off the points and rely solely on the axis labels.

Or we can keep the data labels on the points, remove the axis, and close up the space between the chart and the title.

Which labeling option do you prefer? Axis labels, data labels, or both?

## UPDATE 3 March 2010

Steve Fleming suggested in a comment below that I move all of the data labels between the category labels and the category axis. Good idea.

Similarly, the labels can be moved from the ends to the bases of the bars.

## Leave the Donuts for the Cops, and Stick with the Bars

Some time ago, I showed how a column or bar chart could display a table of data more effectively than four pie charts (or a donut chart) in Column Chart to Replace Multiple Pie Charts. I showed how to build a panel chart to plot the same data in How to Build a 2×2 Panel Chart. In this post I’ll demonstrate why donut charts are such an awful way to try to present data.

### Donut Charts

This is the data used to make the donut chart above, and it’s served us for several other exercises. Rows add up to 100%.

Here again is the donut chart. You can compare values by comparing the included angles, except that only Engr1 and Mktg2 have a common baseline for all points in those categories. Consider the Mktg1 (blue) sections: three of the four have values between 19.7% and 20.0%. Without consulting the table above, it’s mighty hard to tell which one doesn’t fall within that range.

Of course, we could apply data labels that display the values, but to make them fit, some have to be rotated. In this chart I was lucky, because I could stick to horizontal or vertical orientations, and not any of the pixel-squashing inclinations in between.

It’s still hard to remember which concentric arc refers to which series. Let’s change the labels to show series names. Okay, that’s better, but now I can’t remember the percentages.

The dialog lets us add both to our label; in fact, we could also add the categories from the legend. But the labels are already overcrowded. I suppose if we wanted to, we could find a package that lets us wrap the labels around an arc, but I’m glad Excel doesn’t offer that option. At least we have all the data visible in the chart that was in the original table of data, only not as easy to read.

### Donuts, an Exploded View

I’ve pivoted the data so all values are in one column. I’ve also calculated the area of each segment, with a scaling factor that conveniently equals 100 for the hole in the center. The innermost arc has a total area of 300 (that is, three times the area of the central hole), the second 500, the third 700, and the outermost 900. If you’ve been listening to me for a while (I mean months, not just this post), you can guess where this is going.
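The ring areas quoted above follow from simple geometry if each ring is as wide as the radius of the hole, which is an assumption consistent with the 300, 500, 700, 900 sequence. A short Python sketch, with hypothetical function names:

```python
HOLE_AREA = 100  # scaling factor: the central hole counts as 100

def ring_area(ring_index, hole_area=HOLE_AREA):
    """Area of ring number ring_index (1 = innermost).

    Assumes each ring is exactly one hole-radius wide, so the ring spans
    radius ring_index to ring_index + 1 in units of the hole radius.
    """
    outer = ring_index + 1
    inner = ring_index
    return hole_area * (outer ** 2 - inner ** 2)

def segment_area(ring_index, percent):
    """Area of a segment that covers `percent` of its ring."""
    return ring_area(ring_index) * percent / 100

ring_areas = [ring_area(k) for k in range(1, 5)]
inner_segment = segment_area(1, 20)  # a 20% slice of the innermost ring
outer_segment = segment_area(4, 20)  # a 20% slice of the outermost ring
```

The same 20% value covers three times as much area in the outermost ring as in the innermost one, which is the distortion examined in the rest of this section.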

To start this analysis I had to explode the donut chart. Quick, call Dundas, a new chart type! This was easier than I had expected: I copied the donut as a picture in Excel, pasted onto a PowerPoint slide, and ungrouped twice. Then I dragged the pieces into position.

Then I arranged the segments in order of ascending value from top to bottom. As you can see by eyeballing the pieces, or by looking at the Area column or dot plot in the accompanying sorted table, the area jumps around a lot as the value monotonically increases. If a measure such as area is to be a reliable indicator of value, both measure and value should increase monotonically and proportionally.

Here is the same set of building blocks, this time sorted by area. We can tell from the table, but not from the arcs themselves, that value jumps up and down a lot as area increases.

### Bar Charts

Let’s compare the donut’s area-value lack of correlation above with the area-value characteristics of a bar chart. This is the data and chart from Column Chart to Replace Multiple Pie Charts.

Here I’ve removed the gaps between clusters and inserted a narrow white strip between adjacent bars. I’ve put labels in the bars so the legend isn’t necessary to identify anything.

Then I sorted the values and rearranged the bars accordingly. The bars are sorted by value and by length. Since each bar has the same thickness, they are also sorted by area.
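The difference between the two chart types can be checked numerically. The sketch below uses made-up segment values (not the table from this post) and assumes donut rings with areas of 300, 500, 700, and 900 and bars of constant thickness; `area_tracks_value` is a hypothetical helper.

```python
def area_tracks_value(pairs):
    """True if area never decreases when the (value, area) pairs are sorted by value."""
    areas = [area for _, area in sorted(pairs)]
    return all(a <= b for a, b in zip(areas, areas[1:]))

# Hypothetical segments: (value in percent, ring index with 1 = innermost).
segments = [(19.7, 1), (19.9, 4), (20.0, 2), (25.0, 3)]

# Donut: area is the value times a ring-dependent factor (3, 5, 7, or 9).
donut = [(value, (2 * ring + 1) * value) for value, ring in segments]

# Bars: every bar has the same thickness, so area is the value times a constant.
BAR_THICKNESS = 1.0
bars = [(value, BAR_THICKNESS * value) for value, _ in segments]

donut_ok = area_tracks_value(donut)
bars_ok = area_tracks_value(bars)
```

Here `donut_ok` is false because the 19.9% segment in the outermost ring has more area than the 25.0% segment in the third ring, while `bars_ok` is true for any set of values.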

### Comparison

Which shows the better correlation between value and measure of that value?