Use Dot Plots for Better Categorical Comparisons

In Yikes! Another Pie Horror Show, I pointed out problems with yet another pie chart. I know the criticism of pie charts is getting tiresome, but I wanted to respond to a comment by Jeff Weir:

Q: What’s worse than comparing categories within a pie chart?
A: Comparing categories between 2 pie charts.

I will use an article Jeff cited to show a better way to make comparisons.

The Analysis

Jeff pointed to Why are men more affected by labour market downturns?, a study by the NZ Department of Labour. The article stated that of the 34,000 jobs lost by the NZ economy in the eight quarters ending September 2009, 80% were lost by men.

This disparity in job losses can be seen through an analysis of the jobs held by men and women. Many more men hold jobs in manufacturing and construction, sectors which are hard hit during recessions. Women, on the other hand, are more likely to be employed in fields such as education and health, which are largely government funded and tend to be less sensitive to economic conditions.

The Visualization

This is a well-thought-out analysis, but as Jeff pointed out, comparing wedge sizes in two different pie charts is not an effective way to support the argument. The two original pie charts were huge, and not even posted on the same page, though barely legible thumbnails were shown adjacent to the text of the article. The following two charts show the breakdown of jobs held by men and by women. I’ve enlarged the pie thumbnails somewhat, to improve legibility and to compare with an alternate display I’ve created below using Dot Plots.

Pie Chart: Seasonally adjusted male employment by industry, Q3 2009

Seasonally adjusted male employment by industry September 2009 quarter

Pie Chart: Seasonally adjusted female employment by industry, Q3 2009

Seasonally adjusted female employment by industry September 2009 quarter

The Alternative Visualization

I’ve made two Dot Plots to show the same data. Within a couple of pixels, the Dot Plots (including title) are the same size as these pie charts (without title). Yet each Dot Plot shows twice as much information, its text is much more legible, and comparisons are much easier to make within a Dot Plot than between multiple pie charts.

Dot Plot: Seasonally adjusted male and female employment: number of jobs by industry, Q3 2009

Dot Plot: Seasonally adjusted male and female employment: percentage of jobs by industry, Q3 2009

It would be so easy to make a few simple annotations on one of these dot plots to drive home the point of the analysis.

Dot Plot: Number of jobs by industry, Q3 2009, showing susceptibility of men and women to economic conditions

Dot Plots

Dot Plots have been developed as a more effective tool for categorical comparisons than bar charts, pie charts, or (gasp!) donut charts. You can read about them in several places:

The last three links show various techniques for creating Dot Plots in Excel. Excel doesn’t make it easy to create Dot Plots: you need to use one helper series technique or another to generate the text labels along the vertical axis. The easiest approach is probably to use a horizontal bar chart with hidden bars to supply the labels, and an XY chart to supply the data points.
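To make the helper-series idea concrete, here is a minimal Python sketch of the coordinates such a chart needs. It is an illustration under my own assumptions, not the exact method from any of the linked tutorials: I assume the hidden bar chart lists N categories from the bottom up, and that the XY series is plotted against a vertical axis scaled from 0 to N, so that category number i (counting from zero) is centered at y = i + 0.5. The function name dot_plot_helper_rows and the sample data are hypothetical placeholders, not the NZ employment figures.

```python
def dot_plot_helper_rows(categories, values):
    """Pair each category with an XY point aligned to its bar-chart slot.

    Assumption: the hidden horizontal bar chart lists N categories from
    the bottom up, and the XY series uses a vertical axis scaled from
    0 to N, so the i-th category (0-based) is centered at y = i + 0.5.
    """
    if len(categories) != len(values):
        raise ValueError("categories and values must have the same length")
    # Each row is (label for the hidden bar, x value of the dot, y position of the dot).
    return [(cat, val, i + 0.5)
            for i, (cat, val) in enumerate(zip(categories, values))]


# Hypothetical placeholder data, for illustration only.
categories = ["Industry A", "Industry B", "Industry C"]
values = [120, 45, 80]

rows = dot_plot_helper_rows(categories, values)
for label, x, y in rows:
    print(f"{label}\tx={x}\ty={y}")
```

The printed rows could be pasted into a worksheet: the labels feed the hidden bar series, and the x and y columns feed the XY series that draws the dots.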

To simplify the creation of Dot Plots in Microsoft Excel, I’ve developed a utility that generates a Dot Plot with a single button click. The utility is a standard Excel add-in, and it adds a Dot Plot button to the Excel menu (versions 2000 through 2003) or to the Excel ribbon (versions 2007 and later). The button pops up a simple dialog for the user to select formatting options.

PTS Dot Plot Utility Dialog

Excel detects a preselected data range, and remembers the Chart Lines options from the previous time you used the utility. It produces a very simple Dot Plot adjacent to the data range.

PTS Dot Plot Utility Output

Two minutes of formatting produces charts like those I used above to support the findings of the NZ Department of Labour.

The PTS Dot Plot Utility is available from the Peltier Tech web site for a nominal fee. It can be downloaded as soon as the purchase is approved, it installs itself using the downloaded setup file, and the button appears the next time Excel is started.

Visit PTS Dot Plot Utility to check it out.

To learn about Peltier Tech’s other utilities, visit PTS Excel Charting Utilities.

Some Comments on Dot Plots (Guest Post) » Peltier Tech Blog: http://peltiertech.com/some-comments-on-dot-plots-guest-post/

Peltier Tech Chart Utility

Comments

  1. Hi Jon,

    Much as I like your Dot Plot utility, I think that in this case the data lends itself to a simple bar chart.

    It is easy to make the female numbers negative and draw a bar chart that looks like a tornado chart.

    Note that I have used a custom format on the x-axis to remove the minus from negative numbers.

    And adding dummy series on the secondary axis (and a little formatting) will highlight the relevant details nicely.

    To my mind, Dot Plots are better at showing detailed values, but tend to look more cluttered than bar charts. In this case we don’t really care if the male participation rate in Construction is 13.5% or 14%, we just want to see that it’s much bigger than the female rate. So a bar chart should suffice.

  2. Funnily enough, after being one of those New Zealand males affected adversely by the recession (albeit just for a short period), I just landed a job as a senior analyst at none other than the New Zealand Labour Department. I love irony.

    Here’s a screenshot of how the same data works in a bar chart

    While the dot plot and bar chart are definitely an improvement, the problem remains that we are looking at a static photo of a particular year, when we could be looking at how things have changed over a year. So I whipped up a graph in Excel then tweaked in Word that shows the absolute changes by sector and gender over a year, and gives the net change at the top.

    This shows that female jobs in manufacturing actually fell more than male jobs: -15,000 female (-17%) as opposed to -13,000 male (-7%). So men actually suffered less overall than women here. In fact, the biggest fall for male employment over the last year was in retailing and accommodation – 14,000 jobs; while female employment grew here by 2000 jobs (1%). I wonder what is going on here?

  3. The dot plot could use some Bertin-style reordering, from Men Most Affected to Women Most Affected, say.

  4. Jon,

    is your DotPlot Utility capable of handling PivotTable as ChartSource or do you know any workaround for PivotData?

    Thy

  5. Ed,

    In general bar charts get more cluttered than dot plots. It’s the lines connecting the points that clutter these dot plots. We should be comparing men to women in the same industry rather than men to men in different industries. The lines move our eyes from men in manufacturing to men in construction rather than to women in manufacturing.

  6. Ed –

    The problem with bar charts is when you have more than about two series, they become too cluttered to interpret.

    The problem I have with left-right tornado charts is that the comparison between the left bars and right bars is made difficult by not aligning them next to each other. I wrote about this in Tornado Charts and Dot Plots and I offered a bunch of alternatives (including Dot Plots). But some readers stepped up to defend the left-right charts, saying “everyone [in this field] uses these ‘population pyramids’, so it’s okay”. Fine, but I still don’t care for this approach. Anyway, for this reason, I prefer Jeff’s bar chart.

    Some people say Thou Shalt Not Use Line Charts Along A Categorical Axis. Thus, a Dot Plot with connecting lines is unacceptable. I understand Naomi’s point that the lines add clutter, and perhaps she’s right. However, I use the line as a way to connect points in a series. I tried using lines which were lighter than the markers, in an attempt to de-emphasize any implied trend from point to point and reduce the clutter.

  7. Derek –

    The data is already in Men Most Affected ordering. Do you mean we need two charts, or am I missing something?

  8. Chris –

    The Dot Plot utility can handle pivot tables as the data source range. It will not automatically update when the pivot table is refreshed. If the cells in the chart’s data range change, so does the chart, but if the pivot table changes its shape, the chart does not reflect this change. The utility is not smart enough to deal with discontiguous ranges… maybe some day when I have time to program enhancements.

  9. I don’t say never use line charts along a categorical axis because I have seen some cases where they are very useful. Here’s an example: patients are screened for a condition by being given five tests. Perhaps patients with high scores on test 1, 3 and 5 and low scores on test 2 and 4 need further testing. Plotting the results and connecting the points lets the researchers look for patients with a W pattern. Connecting the points makes looking for patterns easier. But in general I don’t like connecting points when it doesn’t make sense to interpolate between them and when you’d see a completely different pattern if you ordered differently.

  10. Sorry, I meant instead of ordering from industries men are more likely to be in to industries men are less likely to be in, I’d order from industries men are more likely to be in compared to women to industries men are less likely to be in compared to women (as determined by subtracting the two percentages).

    In practice, I see this actually only means switching Professional and Retail in this case. I was also a bit confused by the “Other” category, which I didn’t find useful since it’s just the bin that makes the whole thing up to 100% for each sex. I’d drop that category and just present the sequence C, M, P, R, E, H.

  11. Naomi –

    I was not thinking of you when I cited the No Lines commandment: your approach has always seemed to be utilitarian, not totalitarian. Your guidelines for when to use lines are helpful.

    One of the most useful cases of lines in a categorical chart is in a parallel coordinates plot. Without the lines, the chart would be useless, yet I’ve had the discussion that I shouldn’t use lines.

    Here are my dot plots, stripped of their lines. Does anyone have a comment?

    Dot Plot: Seasonally adjusted male and female employment: percentage of jobs by industry, Q3 2009

    Dot Plot: Number of jobs by industry, Q3 2009, showing susceptibility of men and women to economic conditions

  12. Interesting discussion in the comments. Naomi made the comment that I came here to make: that the lines are connecting unrelated data and lead the eye to follow a pattern that’s not really there. I think your revised charts work perfectly well. I like them better than Jeff’s bar charts, because I think the bars make me more likely to look at the patterns of the blue bars in relation to each other rather than in relation to the red bar next to them. With the dot-plot I can also quickly discern how many categories have men “leading” vs. women, while the bars seem to make my brain work harder at getting to that.

  13. I am with Naomi on not serially connecting items that are randomly ordered. Yes, it is nice to see the two series, but the angles created by the lines between categories are NOT data; in fact they are artifacts. The segmentation of male/female and by category provides dimensions that you feel are important to understand, but the within-gender or between-gender hypothesis could drive two different sets of plots. Your last dot plot introduces exogenous data (the change over time) to the graph, but we do not see this dynamic in the graph. A table sorted by the % of men in the industry then showing the change in unemployment over the period would help establish the claim better than the current dot plot.

  14. I think Chip’s right that the bars tend to work for blue-on-blue comparisons at the expense of blue-on-red, and that it’s harder to see at a glance who is leading who across all categories.

    So technically I think dot plots are a better medium for people that understand charts well. But are they the best medium for the majority of people more familiar with bar charts? So something we should factor in is the ‘reading age’ of the publication. This source material is public interest, not an article in a scientific periodical. So perhaps we need to err on the side of ‘understanding of the masses’.

    While the lines on bar charts add ‘clutter’, I think that the upside is that this clutter helps draw the eye back to an axis. Whereas with the dot plots, you look at the dots and then have to decide ‘what do I have to connect these things to?’ or ‘where should I look next?’

    Bar charts may be for sissies, but if that’s what most of our readers are, then perhaps bar charts it is, then.

    Regardless, even more important than chart style is chart content. If this publication is about the effect of the recession across sexes and sectors, then plotting how things have changed over a year is a better choice than plotting how things look right now… as I’ve done in my 2nd bar chart above.

    Content is still king. Or to paraphrase Tufte, is this the best data we can present to illustrate this story?

  15. Here is what I had in mind. I thought of another advantage of ordering by increasing difference between two series: the two series are by definition guaranteed to have at most one crossing point, minimizing confusing clutter.

    I’m a convert to the value of lines between the points in nominal category series, thanks to the work of Alfred Inselberg.

    BTW, this is my lazy dot plot method: I just make an Excel line chart and rotate it 90 degrees when it’s time to save the image using my favorite photo app.

  16. Yikes…labor pies everywhere. http://www.bls.gov/opub/ted/

  17. I agree with Ed Ferarro and especially Jeff Weir here. While either a dot plot or a bar chart is a clear improvement over the original pie chart, none of them convey the full story in this data.

    The oft-discussed loss of jobs in male-dominated manufacturing actually seems to be hurting women more than men in this NZ data, as Jeff pointed out by his supplemental chart of jobs lost/gained by sector and gender. It is only the big gain of jobs by women (but not men!) in the Health field that allows the recession overall to impact the employment of women less than men in NZ. A bar/tornado chart is a very effective way to convey such nuances.

    When zero matters, and there are only a few (1-3?) series per cluster, a bar chart seems easier to grasp than a dot plot — even for those familiar with dot plots.

  18. Just found another blog that covers some of this ground: Eagereyes. Check out

    http://eagereyes.org/criticism/cost-of-a-sick-chart



