This is a small entry in the Chart Busters series. I received a request for something which was mechanically possible, but which was not visually desirable.
I was asked whether a certain type of chart could be done. The data was weekly sales of 14 products, but the user wanted to show only the top ten products each week, in a clustered column chart, with the bars within each cluster arranged in descending order. Here is a representative three-week period of data:
Well, long story short, this kind of chart is certainly possible. All it takes is a couple sets of formulas, one set to rank the weekly figures, the next to determine which product goes into which position for each week. The chart is reproduced below. No matter how much I’d like to show off how clever I am when it comes to writing formulas that define an intricate chart source data range, I won’t show how I did this. It’s really an ineffective way to display this kind of data, and I don’t want to encourage anyone by publishing the recipe.
What makes this chart so ineffective? There are too many colors (as Wayne or Garth once said, “That looks like a plate of hurl!”), and the colors are critical to understanding the information. The order of the colors (and the items they represent) changes from one cluster to the next. Because of the excessive colors and their variable positions, the user spends so much time bouncing from one cluster, to the legend, to the next cluster, back to the first cluster, then to the legend, then back to the cluster, then to the legend, ad nauseum. It’s very hard to see where a product falls in on any given week, and tracking trends from week to week is nigh impossible.
As we’ve discussed, the best way to show data over time is with line charts. Showing more than a few differently formatted lines will confuse the data, but I’ve shown how to work this out in other articles on this blog. If we treat the weekly line chart like a parallel coordinates chart, and provide a means for the user to selectively highlight a small subset of the data in the chart, it will speed understanding. For this exercise I’ve decided that three series will be highlighted.
Construction of this chart entails plotting of weekly data for all products, and formatting this data in the same dull color. This gives the user an overview of the data. Then a few more ranges are populated. Three columns are used for the weekly data of three highlighted series, then this data is plotted and given distinctive formatting. Three cells are used to identify each highlighted series so this weekly data can be easily extracted from the full data range using Excel’s INDEX function. Finally, the list of products is placed in a column so we can populate dropdown controls to enable the user to make his selections.
The resulting chart has all the data in a dull blue, with three highlighted series corresponding to the series which were best sellers in each of the weeks.
Can you even follow these three products in the column chart above? Probably not, and certainly nowhere near as easily as this line chart makes it.
I’ve described the dropdown-based chart highlighting techniques in Sample Parallel Coordinate Chart, Easier Interactive Multiple Line Chart, and Interactive Parallel Coordinates Chart, so I will not go over the construction of this chart. I have posted my workbook in a handy zip file, for any interested readers to peruse.
Chandoo says
Very well treated Jon…
Another idea (that I once implemented for a customer) is, have a regular bar chart, one per each week arranged in a table (customers in rows, weeks in columns and the bars in the cells)
I have then added a sort option for each week. So when you click the sort button on week3, it would sort the values in table such a way that week3 is in descending order.
I think when there is more data, using form controls or some variations of dynamic charts is important to keep the charts clean in order to get the right message out.
DMurphy says
Jon: Another topic which could start many “ping-pong” comments.
If a picture tells a thousand words, what do these pictures tell us? Firstly, that the wrong question is being asked, I think, so I’m glad you didn’t ask us to try to come up with alternative solutions – although I must confess to having given some thought to a “heat chart” option!
Charting of Product Sales is probably one of the most common requests to analysts – sometimes the requestor knows what (s)he wants, but many times they do not – and this is probably as good an example of the latter as we will see.
From this data, some of the obvious questions I would ask (and doubtless there are many more) would be:
* Rank the Top-10 selling items for this week and show me their relative position to Last Week/Month/Year?
* How did Sales of ItemX compare to ItemY over time?
* How often has ItemX been in the Top-3?
*etc.
Thanks again for another thought-provoking article.
Colin Banfield says
Jon, this is indeed a challenging analysis. One issue with the line chart is that it highlights the top item per week, without a view of the other top nine (given that this was the stated requirement). Indeed, item L, which is in the top ten only in the first week, is highlighted in the other weeks as well. However, a more fundamental problem is this: Is this an ongoing concern or are the three weeks shown the only weeks in the entire year that the top ten items are required? It’s hard to believe that in any business, the latter would be the case.
As an onging concern (up to 52 weeks in a year), how should the problem be tackled? One possibility is to break down the analysis into four-week intervals, and display the top ten items per week in four bar graphs. A user control would allow selection of the week range, updating the charts accordingly. No doubt, there are other ways the tackle the problem.
Jon Peltier says
Colin –
I was provided with six weeks of dummy data, and it is intended to be extended to 52 weeks. I think top ten was something they thought of because whatever amount of products per week times 52 weeks is just too much for a column chart. The fact that Item L dropped to zero the week after it placed first is more important (to me) than seeing that it fell out of some arbitrary Top N list. As if I’d even notice this in the column chart.
There are many ways to make the output more flexible than I’ve shown in this simple demo:
– Chandoo’s suggestion to sort by a given week
– The ability to highlight (“brush”) a group of lines
– The ability to adjust the time scale (start, stop, duration)
– Plotting of cumulative totals, not just weekly values
– Aggregating items by category (e.g., A and B are peanut butter, C through F are jams, G to I are bread, etc.)
Chandoo says
@Jon:
“There are many ways to make the output more flexible than I’ve shown in this simple demo:”
one more thing that comes to mind: using sparklines to show weekly sales. This takes less space and provides quick glance. That leaves the space for important analytical questions like: “which products are top 10 on week n?”, “what are the top n products?”; “which products performed badly?”
Oliver Jennings says
Jon,
Is it possible to do a chart like this one here from NYTimes, using any type of data metrics?
New York Times: How Different Groups Spend Their Day
http://www.nytimes.com/interactive/2009/07/31/business/20080801-metrics-graphic.html
OJ
Jon Peltier says
Oliver –
Sure, you would make a stacked area chart (or stacked 100% area chart). You can’t use real times since the X axis of an area chart only recognizes integer numbers. Instead you would use convenient X axis units (say 1 = 1 minute) and apply a dummy axis, or just use 1440 rows of data (1440 minutes/day) and in the first column leave blanks everywhere except where you want labels to appear.
You could even handle some sort of interactive features usig chart events as I’ve described in a number of posts.
David Gerbino says
Jon,
call me dense but….
– What is the purpose of this ranking/charting exercise?
– What insite will one get from charts of the top ten sales of 14 products by week?
– Are any products related?
– For example, Does a sale of Product A lead to sales of Product B?
I know you were asked to do this but I can not see the value in it.
If I had all this sales data, I know I would like to know what is the ratio of sales to new customers vs existing customers and a whole bunch of other metrics but without knowing the business need of this data I find it difficult to think of visual means to display the data.
@dmgerbino
Jon Peltier says
David –
I think the purpose that was originally put forth was to do something a manager asked for; making a manager happy has some value, not necessarily business value. I did come up with a methodology, but then dismissed it as confusing and unenlightening.
I proposed another method to track sales of all products, so at least one could find trends across the time periods. Sometimes giving your manager something different but better, especially if the manager imagines they helped guide its development, also has value.
I have to admit, since someone emailed me out of left field asking for free advice on how to make the chart their manager wanted, I was more interested in steering them toward something more useful. I was not expecting to turn it into a larger project with more flexible exploration capabilities. This was simply a quick and dirty example of an improved approach.
If there had been more information (and greater interest), including a breakdown of products by type or other factor or details on the sales, etc., I may have been inclined to set up a worksheet to feed a pivot table, to allow grouping, sorting, and other analyses.
David Gerbino says
Jon,
I thought that might be the case, something a manager asked for. I always find it difficult to work on these types of projects.
My favorite example request was trending data over time using pie charts. There was no business value but a boss wanted exactly what he requested and his peers all thought it was good. I do not do that anymore.
Anyway, I was curious about how all this came about. Thanks for the detailed answer.
– Dave
Jon Peltier says
I think I was lucky as a corporate drone, in that my managers would respect the way I displayed information. Looking back, it wasn’t always done well. But it was better than the norm, and it was recognized as such.
In fact, the worst manager I ever had (who repeatedly asked me to define standard deviation, and he had a PhD) appreciated my displays the most.