Recovery by Sector of S&P 500 Companies (Guest Post)

Edouard Buning is a Dutch marketing and communications consultant who has recently started up a Dutch blog in the spirit of Chart Busters. Edouard has provided this article for my Peltier Tech Blog readers.

Recovery by Sector of S&P 500 Companies by Edouard Buning

In Bloomberg Business Week of March 8, 2010 a graph on page 17 representing a breakdown by sector of net income in 2008 and 2009 of S&P 500 companies drew my attention. At first the paired waterfall chart seemed organized, but on second thought I realized that it was in fact confusing and could tell us some more than the conclusions drawn by Business Week.

Original Chart - Net Income Breakdown by Sector

Business Week’s conclusion that financials and consumer discretionary rebounded in 2009 after the severe losses in 2008 is difficult to see in the graph although the red bars in 2008 are clearly the first thing that you see, followed by the increase in total net income from $288B in 2008 to $512B in 2009.

The data by sector and by year:

Tabulated Net Income Breakdown by Sector

A short look at these data shows us that there are more interesting conclusions that can be drawn, but first we’ll take a look at the chart to see how we can improve it.

The problem with the published chart is that, in trying to show the addition of a sector to the net income, they placed the addition on top of the net income of the preceding sectors. As long as the addition is positive there is no interpretation problem, but in the case of the sectors financials and consumer discretionary the addition is negative. Although the addition is shown in red, it looks weird. How lucky we are they did not use a pie chart!

A vertical bar chart is already an improvement:

Column Chart - Net Income Breakdown by Sector

To improve the readability of the sector names, let’s flip the graph and reverse the order of the Y-axis:

Bar Chart - Net Income Breakdown by Sector

With a little more formatting, such as adding a title and major gridlines, this panel chart is the result:

Panel Chart - Net Income Breakdown by Sector

Now for the interesting conclusions:

  1. It is clear that financials and consumer discretionary rebounded from a combined loss in 2008 of $147 to a combined profit of $106 in 2009.
  2. But on the other hand, industrials (- $23) and energy (-$68) realized a lower net income in 2009. This is in line with the conclusion at the top of the article on page17, “But some, including Exxon Mobile, actually posted lower fourth-quarter profits than a year earlier.” In another graph (not shown in this post) we see that fourth-quarter earnings per share for Exxon Mobile dropped from $1.58 to $1.33 and for Chevron dropped from $2.58 to $ 1.58 from 2008 Q4 to 2009 Q4.
  3. Telecom and utilities realized almost the same net income in both years.
  4. The other sectors materials, consumer staples, technology, and health care increased their net income in 2009.

About the Author

Edouard Buning is the founder of BuningAdvies, a consulting company in the Netherlands that focuses on how to use market research results in marketing and communication actions. After a 20+ year career in market research and presenting the results of market research studies to his clients he realized that on the one hand client companies were always eager to know all ins and outs of the study, but on the other hand it often happened that the same client company did not use the recommendations nor took any actions based on the study. In many instances this was due to a lack of time, capacity, knowledge, or priority for the follow-up. BuningAdvies offers this follow-up service.

Edouard has recently started a blog on the story behind a chart, Grafiekadvies. This is the first blog in the Netherlands about charting.

 

Peltier Tech Charts for Excel

Sparklines For Excel vs. Excel 2010 Sparklines (Guest Post)

Back in November I came across a new blog, called Data Driven Consulting. On this blog I found articles covering some of my favorite topics, including graphics and data visualization. I snooped around and discovered that the author of the blog, Alex Kerin, lives only 20 miles away from me here in the snow belt of Massachusetts. We got together over lunch to discuss our mutual interests, and the first outcome of our meeting is this post which Alex has prepared for the Peltier Tech blog.

Alex is founder of Data Driven Consulting, an independent consulting company that focuses on obtaining high quality data to facilitate effective business decisions. Data Driven Consulting specializes in a number of areas, including market research, dashboard and data presentation, and authoring of white papers and e-content.

One of the more anticipated functions of Excel 2010 is Microsoft’s implementation of sparklines (as defined by Edward Tufte – data intense, design-simple, word-sized graphics). Although the recent patent application made by Microsoft would seem to imply that they are the inventors of in-cell sparklines, both Edward Tufte and the software vendors who have offered Excel sparkline solutions would beg to differ.

One of these solutions, Sparklines for Excel (SFE), is favored by many, is free and open-source, and works well. There are versions for Excel 2003 (and before), and Excel 2007, with the latter version adding a new section to the ribbon. Below are screenshots of the implementations – click on these reduced images to open a new window showing the full size sparkline tabs.

Excel 14 Native Sparkline ribbon tab
Excel 14 Native Sparkline ribbon tab

Sparklines for Excel add-in ribbon tab
Sparkline For Excel Add-In Ribbon Tab

Given the anticipation of Excel 2010’s sparklines it makes sense to compare the two offerings. This table compares the functionality of the two options.

Sparklines for Excel
Excel 2010
Types of cell charts 14 and growing: Line, Pie, Pareto, Bullet, XY, Scatter, Bar, Column, Stripe…
Three: Line, Column, Win/Loss
Overall look (sparklines)
Some blocking, not noticeable at normal zoom levels

Smoother
Features on individual chart types Excellent, whole range of markers like hi/low lines, max/min points
Limited to some basic additional data indicators, but they work well
Ease of deployment Have to distribute macros with file, some VBA crashes, mostly during creation. Win today, as no built-in option No extra package, but Excel 2010 needed. Win in 2-3 years when 2010 is ‘standard’.
Ease of use More complex, for  example looking up meaning of formula variable, color chart Easier, native to the product, so formatting is quick
Code Customization and Stability VBA macro based (non-locked) so customization possible (and legal) Not VBA based – more stable, but no customization
Grouping Can set max and min so that multiple charts
scale the same
Adjacent charts are grouped automatically making for easy changes of formatting/scales
Ease of editing Charts are shapes created on the basis of formula in the cell Effectively separate charts, edited with own section on ribbon
Future additions Constant addition of new chart types/upgrades to existing
Maybe some fixes, but will likely have to wait until Excel 20XX for new chart types
Cost Free, but you need Excel of course
Free, at least after you’ve paid any upgrade costs

While there are equal numbers of ticks in both columns, there are clear winners depending on your required functionality and how you intend to share the information. For creating dashboards that benefit from different types of in-cell charts and line markers, SFE is still a clear winner – the range of chart types, the continued development, along with the fact that the sparklines are formula based makes for a powerful tool. Distribution is complicated by the need to load the add-in file, but for dashboards that shouldn’t be a barrier.

On the other hand, for a quick summary of data that is emailed around, the native version in Excel 2010 is clearly simpler. In the near term (until 2012 I would guess), SFE is still really the only option anyway as sparklines created in Excel 2010 simply won’t show on earlier versions.

It’s great that Microsoft has offered sparklines – hopefully end-users will appreciate their value more now they are a native part of the program, and become used to sparklines’ inclusion in their data lives.

 

Peltier Tech Charts for Excel

Two Color XY-Area Combo Chart – Guest Post

Today’s post is written by David Montgomery, who has a new blog called David @ Work. David read two of my recent posts, Fill Below an XY Chart Series (XY-Area Combo Chart) and Fill Between XY Chart Series (XY-Area Combo Chart), and noted the lack of an explanation for different colors between XY series that cross.

David attacked the problem in his blog post Two Color XY -Area Combo Chart, and he has graciously accepted my offer to repost his article below.

Two Color XY-Area Combo Chart

Last Wednesday, Jon explained the technique for creating an XY-area combo chart. The original article shows how to color the area between two lines in a chart where the lines never cross. I’m going to assume you read the article and that we’re all starting at this point:

But what if you want to cross the lines? The good news is that this isn’t Ghostbusters and crossing the lines doesn’t break anything, the area between the lines is still filled in perfectly. If we flip the values in B5:C5, B7:C7, and B9:C9, we get the following:

The bad news is that the fill series is only one color and as soon as you show it to your boss she’ll ask why the positive and negative areas aren’t green and red, respectively (which you shouldn’t do anyway because a lot of people are red/green colorblind). Before today, you would have said that it would take too long to add a second color, that it was impossible, or you would have just drawn auto shapes on top of the chart. Today though there is a much easier solution:

  1. Download the workbook at the end of this post.
  2. Plug in your data values.
  3. Profit.

But if you’re like me, and I know I am, chances are you’re reading this blog because you want to learn how to do it yourself. At this time I should note that Excel 2003 and I have a monogamous relationship and that some of these steps may be different for other versions of Excel.

Okay, so the first thing that is obvious is that you’ll need to add another data series if you want an additional color. Creating separate positive and negative data series doesn’t work by itself though; you need your positive and negative shading to end where the lines intersect, but the shading will continue on to the next point on the x-axis with the current configuration.

What we need to do is find the x values at the points where the lines intersect and add them to our secondary x-axis (just a recap, the lines are on the primary x-axis, the shading on the secondary x-axis).

This isn’t as hard as it sounds, but when searching for a way to find the point where lines intersect in Excel I found a lot of people suggesting Goalseek. This deeply offends the math geek in me. Letting machines do math for you is only okay if you told it exactly what to do. If you didn’t tell it what to do, the computer is thinking for you, and that is a slippery slope to Skynet. But I digress…

The common formula for a straight line is y = mx + b, where m is the slope and b is the y-axis intercept. At each point of intersection in our chart, we have two lines:

y1 = m1x1 + b1
y2 = m2x2 + b2

Because we want the point where the lines intersect, y1 = y2 and x1 = x2. Also, because both of the formulas equal y, they are also equal to each other and we can simplify to:

m1x + b1 = m2x + b2

Solving for x gives us:

x = (b2-b1)/(m1-m2)

Slope m1 and intercept b1 correspond with the top line, the red one in the example, and m2 and b2 refer to the bottom/blue line. For now, let’s focus on the first point of intersection where the two bolded line segments cross:

Excel has built in functions for calculating the slope and the intercept for a range of x and y values which they’ve cleverly named SLOPE() and INTERCEPT(). We’ll only need two points to calculate each of our line segments, so our formulas to calculate the two slopes (m1 & m2) between the first and second points will look like this:

Our two y-axis intercepts are calculated in the exact same way, just replace SLOPE with INTERCEPT. We’ll need to calculate the slope and intercept for both line segments at every point where the lines cross. If the lines don’t cross, we can skip the calculation.

Because the lines can potentially cross between any two data points, we’re doubling the size of the original secondary x-axis. Every other point will be one of the original data points, and the in-between points, which are the potential points of intersection, will either be NA() if the lines don’t cross (Excel will skip over #N/A errors when plotting most types of charts) or something like this formula (for cell G5) if they do cross:

=(INTERCEPT(C4:C5,D4:D5)-INTERCEPT(B4:B5,D4:D5)) /
(SLOPE(B4:B5,D4:D5)-SLOPE(C4:C5,D4:D5))

Remember that we’re just replacing the m and b values in x = (b2-b1)/(m1-m2) with the SLOPE() and INTERCEPT() functions. It isn’t as bad as it looks. In this image I’ve added the new secondary x-axis points in column G; the red shaded cells are the original points, and the blue points are the potential points of intersection that we’ve calculated:

Of course the formulas in the workbook are much uglier because I’m counting ROWS() and using the OFFSET() formula so I don’t have to manually add each line segment from the original data set, but it’s a little hard(er) to follow. Another trick to avoid manually entering each formula would be to insert blank cells between your original points in D4:F11. That way the original secondary x-axis and the new one in column G will be the same size and your formulas should copy down correctly.

So to recap, our new secondary x-axis should look something like column G. You still run from 0 to 1000, you still repeat the first and last value, and every other point is one of the original points as calculated in Jon’s original post. The points in between are #N/A errors if the lines don’t cross and the x values we just calculated for the points of intersection if the lines do cross. An easy way to tell if the lines cross?  In column F we had previously calculated the difference between the two original data series. If the product of any two adjacent points in column F is negative, the lines intersect between those two points. You should be able to add an IF() checking this at each potential point of intersection without too much trouble.

The next step is to rebuild the three Area data series. Area1 is the transparent base series, Area2 is the positive series, and Area3 is the new negative series.

To create Area1, every other row should point to an entry in the original series in column E. The in-between points should once again be NA() errors if the lines don’t intersect (same for Area2 and Area3 as well), and when they do intersect we just need to calculate the y value at that point. We already have the x value, so we can plug that into our y = mx + b equation. The formula for H5 would be:

=SLOPE(B4:B5,D4:D5)*G5+INTERCEPT(B4:B5,D4:D5)

You could also use the points C4:C5 for your y values as both line formulas will give the same result with that x value (G5).

Area2 and Area3 are incredibly straight forward. We still use the values in column F for every other cell, but the trick is that we only use the positive points in Area2 and only negative points in Area3, with the rest of the points being zero. This is easy to do with something like MAX(F4,0) for Area1 and MIN(F4,0) for Area3. All of the in-between points are either NA() errors (if there is no point of intersection) or zero. If everything worked correctly, your workbook should look like this:

Now you just need to add these new data series to the chart. Series 1 and 2 are the lines which should already be done, Series 3 (the transparent series) should now point to h2:h20, Series 4 (the positive series) should point to the values in column I, and you’ll have to add a Series 5 (negative) to point to the values in column J. The secondary x-axis values also need to be changed to the values in column G.

If your Series 5 didn’t default over to the secondary axes, you’ll have to switch it. Double click on Series 5, click on the “Axis” tab, and select the “Secondary axis” option.

Finally, double click on each area series, go to the “Patterns” tab, choose the two colors you think are the prettiest, and you’re all set.

If everything went correctly, you should have something like this:

If not, the workbook will make more sense than I ever could:

TwoColorXYAreaChart.xls

And that’s all there is to it. Let me know what you think in the comments.

– David

 

Peltier Tech Charts for Excel

Mike Alexander’s Favorite Chart Labeling Tricks (Guest Post)

DataPig TechnologiesMy colleague and fellow Excel MVP Mike Alexander of DataPig Technologies has sent me another article, so instead of the blog being silent while I take today off, it will be graced with more of Mike’s wisdom. Last November, Mike shared with us his Ten Chart Design Principles.

Mike and I have taught advanced Excel classes together, and next month we will host the Second Annual Excel Dashboard Boot Camp. This will be a three-day extravaganza that will take you through the mechanisms of Excel data analysis and charting for purposes of creating effective dashboard-style visualizations.

Mike has written , which is among the most comprehensive Dummies books I’ve come across. The book is geared towards Excel 2007, but its principles and techniques are valid for any version of Excel. Mike takes his readers through dashboard principles, data preparation and chart creation, advanced data techniques including pivot tables and dynamic ranges, automation of dashboards using VBA, and designing interactive dashboards.

Without further ado, Here’s Mike:


A Few of My Favorite Chart Labeling Tricks by Mike Alexander

Often times, a few basic labeling tricks can enhance your chart labels and help bring out an extra layer of analysis. Here are a few of my favorite ways to enhance my chart labels.

Conditionally Color the Y axis

Did you know you can apply a conditional format to your chart labels based on their values? Using custom number-formatting, you can repurpose the Excel’s number formatting to apply your own conditions. For example, if you highlight a range of cells and go to the number format dialog box, you can click on Custom in the ‘Category’ list Then you can enter in something like [Red][<=200]#,##0;[Blue][>200]#,##0.

This will color any number <= 200 Red, and any number >200 Blue.

Custom number format to apply colors to cell values based on values

Charting this range will give you colored chart labels! You can’t do this with conditional formatting.

This will color any number <= 200 Red, and any number >200 Blue.

Custom number format to apply colors to axis tick labels based on values

Jon explains these number format techniques in detail in Number Formats in Microsoft Excel.

Add layers to your X-Axis Chart labels

Have you ever created a pivot chart and got results like this?

Layered Category Axis Labels

You’ll notice that the X-axis has layered/nested labels. That is, month labels and associated years Well, you can achieve this result without a pivot chart by simply including another column of labels in your source data as shown here:

Data for Layered Category Axis Labels

Embed Data Values into your X-Axis Labels

We all know that you can use Data Tables to show the actual data values for your charts. But there are situations where you may want to show the values for only one series in your chart. For example, take a look at this chart:

Embedded Data in Category Axis Labels

How did I get the Data Table to show only the values for % Labor cost and not People Count? The answer: I didn’t use the Data Table option. I used Excel’s Alt+Enter trick.

If you’re in a cell and you hold down Alt while you hit enter, Excel will place a carriage return character into the cell, forcing your values to show on two lines.

You can take advantage of this behavior in your data labels. As you can see, the source for the Primary X-Axis label is a combination of the Month name and % Labor Cost (separated by a carriage return).

Formula that Embeds Data in Category Axis Labels

Here is a sample of the formula used to get this setup. After you enter this formula, click in the space between the quotes and hit Alt+Enter. This will trigger the carriage return.

Note the use of the TEXT function. This is necessary so that the number formatting (in this case, percentages) is retained after you apply the carriage return. See this link for more info on that: http://www.techonthenet.com/excel/formulas/text.php.

 

Embedded Data and Symbol in Category Axis Labels

Again, this is just a formula trick that concatenates all the values I want to see in my X-Axis label.

Formula that Embeds Data and Symbol in Category Axis Labels

 

Peltier Tech Charts for Excel

Peltier Tech Charts for Excel 3.0