The Misleading Chart
My friend and colleague Patrick Matthews, a former Excel MVP, posted a screenshot of an unusual bar chart on his Facebook page. The chart was taken from “What does the public say about impeaching Trump?”, the last section of a Washington Post article titled “What happens next in the impeachment of President Trump?” Patrick’s comment says it all: “Bar lengths on a chart, what do they even mean?”
At the risk of opening a torrent of political comments, I’ve reproduced the chart here.
Take a close look at the bar lengths in the first chart. The 12% bar is over half as long as the 85% bar, whereas in a bar chart with proportional bars, the 12% bar should be about 1/7 as long. But at least the 49% bar is slightly longer than the 47% bar, and they are in between the 12% and 85% bars. The same holds true of the bar lengths in the second chart.
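As a quick check on that arithmetic, here is a minimal Python sketch of how long the bars should be relative to each other when they start at zero. The function name is made up for illustration; the four values are the ones quoted above.

```python
def proportional_ratio(small, large):
    """Length of the shorter bar relative to the longer one when both start at 0%."""
    return small / large

# Values read off the first chart, in percent.
rep_vs_dem = proportional_ratio(12, 85)
middle_pair = proportional_ratio(47, 49)

print(f"12% vs 85%: {rep_vs_dem:.3f}  (1/7 = {1 / 7:.3f})")
print(f"47% vs 49%: {middle_pair:.3f}")
```

With proportional bars, the 12% bar comes out at roughly 0.14 of the 85% bar, nowhere near one half.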
Someone responded to Patrick’s post, wondering how they came up with those bar lengths. After the analysis in the previous paragraph, I replotted the data, set the axis scale to run from -100% to +100%, and set the vertical axis to cross at -100% on the horizontal axis. Nailed it!
Well, not exactly. As I sometimes do, I overanalyzed the charts. I’ve stripped most of the text from the WaPo graphic, replaced the outlines of my charts with red lines, and stretched my charts so they overlaid the WaPo plot.
It turns out that the axis minimum was really -92%, so my wild guess of -100% was pretty good. I’ve set the gridline spacing so that 0% and +92% are shown on the chart, and the far right edge of the plot area is at +100%.
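The comparison above was done by overlaying charts, but the same relationship can be sketched numerically. In the minimal Python example below, a bar's drawn length is taken to be its value minus the axis minimum, and the second function inverts that to find the axis minimum implied by a given length ratio. Both function names are hypothetical, and the "measured" ratio is simulated from the -92% figure rather than taken from the actual graphic.

```python
def length_ratio(small, large, axis_min=0.0):
    """Ratio of drawn bar lengths when every bar starts at axis_min (percent)."""
    return (small - axis_min) / (large - axis_min)

def implied_axis_min(small, large, ratio):
    """Solve ratio = (small - m) / (large - m) for the axis minimum m."""
    if ratio == 1:
        raise ValueError("a ratio of 1 does not determine an axis minimum")
    return (small - ratio * large) / (1 - ratio)

# The 12% and 85% bars under three different axis minimums.
for axis_min in (0, -100, -92):
    print(f"axis min {axis_min:>4}%: ratio = {length_ratio(12, 85, axis_min):.3f}")

# Round trip: simulate a ratio at -92%, then recover the axis minimum from it.
simulated_ratio = length_ratio(12, 85, axis_min=-92)
recovered = implied_axis_min(12, 85, simulated_ratio)
print(f"recovered axis minimum: {recovered:.1f}%")
```

With the axis minimum at -100% or -92%, the 12% bar is about 0.6 times the length of the 85% bar, which matches the "over half as long" impression from the original graphic.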
I don’t think the graphic artist really used an axis minimum of -92%. I’m sure they started with 0%, then decided to fill in some white space by dragging the left edges of each bar while keeping the right edge in place. They filled in the space, all right. But by doing so, they obscured the differences between the values.
It’s the same kind of distortion that occurs when people start their axis at a value greater than zero, which exaggerates the differences between values. Here the axis and the bars start well below zero, so the differences are minimized instead.
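To put rough numbers on both effects, here is a small Python sketch comparing the 49% and 47% bars under three baselines. The 40% baseline is a hypothetical example of a truncated axis, not something from the WaPo graphic, and the function name is made up for illustration.

```python
def apparent_ratio(longer, shorter, axis_min):
    """How many times longer one bar looks than another when both start at axis_min."""
    return (longer - axis_min) / (shorter - axis_min)

honest = apparent_ratio(49, 47, axis_min=0)       # bars start at zero
truncated = apparent_ratio(49, 47, axis_min=40)   # hypothetical axis starting at 40%
stretched = apparent_ratio(49, 47, axis_min=-92)  # bars start at -92%

print(f"baseline   0%: {honest:.3f}")
print(f"baseline  40%: {truncated:.3f}")
print(f"baseline -92%: {stretched:.3f}")
```

A baseline above zero makes the two-point gap look much larger than it is, while a baseline far below zero shrinks it to almost nothing.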
Fixing the Chart
My next step was to take my two charts, and set their axis minimum to 0%. These two charts now accurately show the relative percentages.
Improving the Chart
Those last two charts were a big improvement. But if we’re expected to compare the values, shouldn’t the bars all be in a single chart? Below I plotted the negative of one set of data, so the bars stretch in opposite directions, the way they do in population pyramids. Let’s call this a diverging bar chart.
Then I remembered why I dislike population pyramids, as I discussed ages ago in Tornado Charts and Dot Plots. It’s hard to compare bars that reach away from each other. It would be easier to compare the values of any two bars if they start at one horizontal position (the vertical axis) and stretch in the same direction (to the right). So I created this clustered bar chart:
An alternative is to plot one set of bars from left to right, and the other from right to left. It’s a converging rather than a diverging bar chart. This makes individual bars more difficult to compare, as in the population pyramid lookalike above. But the white spaces clustered between the colored bars represent the percentages of each category who have no opinion.
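The geometry of the converging layout can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the 60% and 30% figures are hypothetical, not values from the poll, and the function name is invented here. One bar runs rightward from 0%, the other runs leftward from 100%, and whatever is left in the middle is the "no opinion" share.

```python
def converging_layout(left_value, right_value, total=100.0):
    """Return (start, end) spans for two bars that grow toward each other, plus the gap."""
    gap = total - left_value - right_value
    if gap < 0:
        raise ValueError("the two values add up to more than the total")
    return {
        "left_bar": (0.0, left_value),               # grows left to right from 0%
        "gap": gap,                                  # white space = no opinion
        "right_bar": (total - right_value, total),   # grows right to left from 100%
    }

# Hypothetical category: 60% on one side, 30% on the other.
layout = converging_layout(60, 30)
print(layout)
```

In a spreadsheet, the same layout is a 100% wide stacked bar with three series, where the middle series has no fill.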
What do you think? Not about the topic of the chart, but about the construction of the chart. Do you prefer the diverging bar chart, the clustered bar chart, the converging (stacked) bar chart, or something else entirely?
Patrick Matthews says
I feel seen :)
Patrick Matthews says
I’d go with the clustered bar chart. Easily enables both intra-group and inter-group comparisons. Next-best is the converging stacked chart, which is the only one which gives a graphical representation to the “no opinion/don’t know” option (although that could be readily added to the clustered bar chart).
Jon Peltier says
Patrick –
Thanks for the inspiration. I think my preferences and their reasoning match yours.
Jon Peltier says
FWIW, here’s a Diverging Stacked Bar Chart I made using Peltier Tech Charts for Excel.
JUSTIN BENTLEY says
Prefer the converging bars in this instance, otherwise the clustered bar.
Tornadoes still have their place I believe. Have found these useful for visualising two metrics (area series chart type, semi transparent fills)
Jon Peltier says
Justin –
In general I would prefer the clustered bars, but given the gaps in the converging bar charts that represent “no opinion”, I can go either way.
worm says
I think the diverging bar chart shows the polarisation across the Rep/Dem divide more clearly than the clustered chart, but as with all things, it depends what story you are trying to tell. The different charts give clarity in slightly different areas, so it depends which aspect you want to highlight.
Peter Bartholomew says
I preferred your final chart but
1. The relationship between ‘overall’ and partisan got lost
2. The importance of ‘Independent’ in terms of voter numbers is not apparent.
3. The use of blue causes confusion as it is associated with dems.
You will absolutely hate my solution though.
Four pie charts linked by block arrows to show that the 3 smaller represent a breakdown of the overall pie.
Areas proportional to the voter numbers
Colours orange/green for impeach/not
It takes a lot of screen real estate but I will worry about that if I have to start paying for it!
Hilary says
In this example, when all the no opinion proportions are under 10% and not an important part of the message, I’d go for the clustered bar chart. But I can imagine other cases where I’d choose the converging bars to flag up those with no opinion.
Andy says
I like the diverging bars because it quickly shows the big difference of opinion between the parties.