Another approach to plotting gaps in Excel charts

Constable Henry: Just a moment, sir. I know everyone in this neighborhood, but I’ve never seen your face before. Can you account for yourself?

Freddy: Yes, I am Dr. Frederick Fronkonsteen, newly arrived from America.

by The FrankensTeam

This guest post is from the E90E50 FrankensTeam, i.e., Roberto Mensa, Kris Szabó, and Gábor Madács. The team has done a great deal of work with Excel and its graphics, and they’ve shown how many types of “advanced” graphics can be accomplished right in Excel. Such FrankenCharts include heatmaps, treemaps, network diagrams, and a number of other creative displays. Oh, and they’re as crazy about the movie Young Frankenstein as they are about Excel.

While I was reading Jon’s work using VBA Approaches to Plotting Gaps in Excel Charts I started to think, because I felt there had to be another solution which was waiting to be discovered…

Jon provided two different VBA routines as a solution for the problem: one changes the chart series formatting, the other changes the underlying data. As he clarifies at the very beginning, the disadvantage of using VBA routine is that it must be re-run every time the data changes.

Some time ago I realized that it is possible to use union of ranges in chart series. You can do it by using the union reference operator in the SERIES formula of the chart or via named formula. (The union reference operator is the same as the formula separator character, which is a comma in US-English versions of Excel.) In the project where it came up it was not particularly interesting, so I just stored this fact in my mind.

Now, a few months later, thinking about the gap problem… Eureka!

The union of separate ranges could be a solution, especially if we can build it up using the UDF within a named formula!

I was very excited by this possibility, so I quickly put together some code and shared the idea with Jon, who suggested some modifications in the UDF. Then we did a lot of tests with Kris and Gábor and corrected the problems showed up while we tested the solution in different Excel versions.

Let us explain how it could work.

Below is a simple line chart plotting data from column B: the #N/A error value is not plotted (no marker) but the line is continuous, while the string is plotted as zero.

The series formula of the chart looks like:

=SERIES("ref to rng",,line_chart!$B$3:$B$13,1)

Where y values come from a continuous range: $B$3:$B$13

The idea

To plot gaps on the chart, you will need empty cells in the range used for y values. The union of ranges idea is very simple: we use the cells containing the number values from the original range (column B) but use empty cells from another column (column C) instead of the error or text values.

You can easily select the non-continuous range by mouse while holding the Ctrl key down.

On the below picture you can see how it looks from the Edit Series dialogue box.

And this is how the SERIES formula shows the union:

=SERIES("ref to rng",,
    (line_chart!$B$3:$B$5,line_chart!$C$6,line_chart!$B$7:$B$9,line_chart!$C$10,
    line_chart!$B$11:$B$13),1)

Note the y range is now surrounded by brackets, since it is composed of multiple areas.

You can easily define a named formula for the above range:

Name: y_rng_union
Refers to: =$B$3:$B$5,$C$6,$B$7:$B$9,$C$10,$B$11:$B$13

And use it as Series values:

Formula solution

To make a solution a bit more dynamic, instead of using static ranges, you can put together a dynamic solution with IF formulas. You have to check the cells one by one, and if the cell has a value to be plotted, then use the reference of the cell, else use an empty cell from another column. Then you separate the IF formulas by comma (or semicolon) to create the union of unique cells.

This way the chart will automatically be updated if the data changes, but if you have more data to be plotted, you will have to update the named formula manually.

For example you can put a similar formula to the name, supposing that column C is empty, so use the cells of this column:

Name: y_rng_union
Refers to: =IF(ISNUMBER($B$3),$B$3,$C$3), IF(ISNUMBER($B$4),$B$4,$C$4), IF(ISNUMBER($B$5),$B$5,$C$5), ..., IF(ISNUMBER($B$13),$B$13,$C$13)

Or you can use only one empty cell (for example C3) instead of the cells of column C:

Name: y_rng_union
Refers to: =IF(ISNUMBER($B$3),$B$3,$C$3), IF(ISNUMBER($B$4),$B$4,$C$3), IF(ISNUMBER($B$5),$B$5,$C$3), ..., IF(ISNUMBER($B$13),$B$13,$C$3)

More generally, you can use a name (for example x) referring to an empty cell:

Name: y_rng_union
Refers to: =IF(ISNUMBER($B$3),$B$3,x), IF(ISNUMBER($B$4),$B$4,x), IF(ISNUMBER($B$5),$B$5,x), ..., IF(ISNUMBER($B$13),$B$13,x)

If you are interested in the above formula solution, you may also find useful a help formula in an Excel range to build up the above (very long!) formula.

The logic is simple: A copy-down formula will concatenate together all the necessary IF(ISNUMBER(...) parts.

(Please note, if you use non-english versions of Excel, you will need to replace IF and ISNUMBER with your local language formula names. Also, you will have to replace the separator string (comma) if your version uses semicolons.)

For example, supposing that the header of your data is in B2, and data starts in B3, write this formula in cell M3 and copy down:

=IF(ROW(A1)=1,"=",M2&",")&"IF(ISNUMBER("&CELL("address",B3)&"),"
&CELL("address",B3)&",X)"

In our example we have to copy down this formula to cell M13 (since data ends in row 13). The string we get in M13 is exactly the formula we need to use in y_rng_union, so we copy M13 cell and paste as value. Now the formula could be copy-pasted as a reference into the name:

For better understanding, you can check these solutions in the without-udf example file.

Obviously the formula solution is not really effective and has limitations, but it could be an alternative for those who can not use VBA.

General solution with UDF-based named formula

For a more convenient resolution we will need some VBA code: a user defined function will build up the union-range and provide it to the chart, nested in a named formula. So if your data changes, the union range will change accordingly and if your data range grows, the chart will also capture new data.

Important to highlight that it will happen automatically, together with calculation, without any user interaction!

Here is the UDF code:

Function serie_with_empty(rDataIn As Range, _
        Optional cOffset As Long = 1)

    Dim rCellIn As Range
    Dim rCellOut As Range
    Dim rDataOut As Range

    Application.Volatile

    For Each rCellIn In rDataIn.Cells
        If Not IsNumeric(rCellIn.Value) Then
            Set rCellOut = rCellIn.Offset(, cOffset)
        Else
            Set rCellOut = rCellIn
        End If

        If rDataOut Is Nothing Then
            Set rDataOut = rCellOut
        Else
            Set rDataOut = Union(rDataOut, rCellOut)
        End If
    Next

    Set serie_with_empty = rDataOut

End Function

The procedure loops through the data range (first argument) and checks for non-numeric values. If the value is numeric (so to be plotted) then the cell itself is added to the union-range. If the value is not numeric, then another cell determined by offset will be added to the union. At the end, the function returns a range object.

This range object could not be used directly on the chart, it must be wrapped in a named formula.

To avoid errors when UDF could not be run (because macros are disabled) we suggest using an error check in the named formula:

Name: y_rng_union
Refers to: =IF(TYPE(serie_with_empty(rng,0))=16+NOW()*0 ,rng, serie_with_empty(rng,1) )

Where rng is the original data range, serie_with_empty(rng,1) is the UDF-calculated union-range.

The second argument of the UDF is the column offset parameter. Above we use 1 supposing that the column next to the original data is empty. You have to ensure a sufficient number of cells in this column. The offset parameter is optional with 1 as default value.

You can find more details about the logical test part of the IF formula under section Further details.

Using the above defined name on the chart, the result is correct with gaps, and refreshes as your data or underlying range changes.

The method could be used for xy scatter charts too. For this chart type only the y values need to have empty cells for gaps, so no need to change the original x values range. As an example here is a possible SERIES formula of an xy chart:

=SERIES("ref to y_rng_union" , line_chart!$D$3:$D$13 ,
   Serie_with_empty_udf_xy.xlsm!y_rng_union , 1)

with the arguments:

x values (normal range): line_chart!$D$3:$D$13

y values (UDF calculations): Serie_with_empty_udf_xy.xlsm!y_rng_union

where line_chart is the name of the worksheet containing the x values, and Serie_with_empty_udf_xy.xlsm is the name of the workbook.

You need to reference the name y_rng_union to the worksheet or workbook. I usually start my chart with a regular range in the formula, so I leave the worksheet name and exclamation point intact, and replace the address with the named formula, like line_chart!y_rng_union. If necessary, Excel will change the worksheet name to the workbook name.

Below you can see what our example xy chart looks like, illustrated with ranges as plotted.

We created two example files, one for line chart, another for xy chart. We used random data, so after pushing F9 you can see how data, ranges and chart changes.

Further details

We need to test if the UDF was run or not.

In practice, macros are usually disabled in the Trust center under Excel Options. When you open the file, Excel will try to refresh the chart, but the UDF code will be unable to run, so the range will not be correctly built. The chart will raise an “Invalid reference” error message. Unfortunately, after enabling the macros, the chart will not be refreshed, because it remembers that the reference is invalid.

If the UDF can not run it results a #NAME error, which is number 16 in the TYPE formula. In this case the IF formula will give the original range, so the chart will get a valid range – although the line will not have gaps. To see the correct gaps, you need to enable macros and trigger calculation (so push F9). We use a volatile formula in the name ( NOW()*0 ) this is necessary for the chart to be updated after calculation. (The chart ignores the Application.Volatile in the UDF code!)

Problems with Cell.Text property

The base concept of gaps on the chart is to plot only numbers. In the UDF,

If Not IsNumeric(rCellIn.Value)

tests the cell’s .Value property. We were intending to provide another possibility for the users to mark unplottable data using formatting. For example they could set a custom formatting to hide 0 values if these should not be plotted. In VBA Cell.Text property could be used to get cell’s text as displayed on the screen, but using .Text property caused Excel crashes in case of xy charts. In case of line charts we have not observed crashes, but the SERIES formula was not visible on the formula bar, so we stopped using the .Text property.

About the authors:

Roberto Mensa, Kris Szabó, and Gábor Madács are three Excel-enthusiastics from Europe, thrilled to push the boundaries of Excel and do what seems to be impossible. You can find more about them on their team-site.

Peltier Tech Chart Utility

Guest Post: The ‘Fiscal Cliff’ Explained in Charts: A Critical Review

This is a guest post from Jon Schwabish, an economist and data visualization creator. You can reach him at jschwabish@gmail.com or by following him on Twitter @jschwabish.

The ‘Fiscal Cliff’ Gallery

For a few weeks now, The Washington Post has posted a gallery of charts to help explain what is meant by the ‘Fiscal Cliff’, a series of tax increases and spending cuts that is scheduled to occur at the beginning of next year. The full slide show can be found at The ‘fiscal cliff’ explained in charts.

There are 7 charts in the collection, and although most of them are simple bar charts, from a data visualization perspective, they fail in various dimensions.

So I decided to issue myself a personal challenge: Redesign all 7 charts following 5 basic rules.

Fiscal Cliff Chart Redesign Rules

  1. Use the actual data. For all but one of the charts the data were available on the chart itself or from the listed source. For the sixth slide—“Sequester cuts to the FEMA budget”—the data were not readily available, so I eyeballed the values.
  2. Do not overly modify text or labels. Even though some of the labels were shorthand or incomplete, the objective of the challenge was to improve the visualizations, not correct syntax.
  3. Use a single color scheme and font. The gallery is a collection of charts from multiple sources—the Tax Policy Center, the Committee for a Responsible Federal Budget, and Post staff—so the color schemes and fonts vary. I chose a single, blue color scheme and used the ‘Corbel’ font throughout.
  4. Only use Microsoft Excel. Because many people use Excel extensively, I wanted to show that creating quality data visualization does not require complex software or knowledge of programming languages.
  5. Apply good data visualization principles. My redesigns incorporate simple strategies:
    1. keep data and labels close together;
    2. deemphasize tick marks, gridlines, axes (i.e., ‘chartjunk’); and where possible, include data directly on the chart.

The Charts: Before and After

Chart 1: Fiscal Cliff Components (Tax Policy Center or TPC).

Fiscal Cliff Components

Fiscal Cliff Components

This was probably the most difficult chart to redesign. On the one hand, you might want to focus on the increase in the average federal tax rate by income quintile (the height of the bars). On the other hand, you might want to focus on the components of the fiscal cliff for each group (the portions of each bar). Combining those, however, clutters the attempt to visualize both goals. For my redesign, I chose to focus on the distributional analysis and sacrifice the imagery of the rising average federal tax rate as incomes rise. The second chart in this set—and the first in the original TPC document—tells me that basic story more clearly anyways. An alternative approach would be to use several small charts (‘small multiples’) and stack them together in some way, but I chose to stick to the spirit of the challenge and only redesign the single chart. Plus, had TPC made a series of small charts, it’s not clear to me that The Washington Post would have picked it up.

As for creating this in Excel, it’s a simple stacked column chart where the intervening series are set to an empty fill. The values of the empty fill series are set so that the heights of each section are equal. The labels on the left side are inserted using separate series added as scatterplots.

Chart 2: Average Federal Tax Rate by Cash Income Percentile, 2013 (Tax Policy Center).

Average Federal Tax Rate by Cash Income Percentile

Average Federal Tax Rate by Cash Income Percentile

I didn’t like the yellow-blue combination here because it emphasized the effect of the Fiscal Cliff on the baseline, which may have been the intention, but I thought it was too much. I also moved the labels right onto the chart and lightened and reduced the number of gridlines.

Chart 3: Size Components of the “Fiscal Cliff” (The Washington Post, based on data from the Congressional Budget Office).

Size Components of the Fiscal Cliff

Size Components of the Fiscal Cliff

This simple redesign including lightening the gridlines, left-justifying the title, and adding the data to the chart. The original chart also appeared to have some sort of vertical gradient, which I found unnecessary.

Chart 4: Marginal Tax Rate Increase under Fiscal Cliff by Income (The Washington Post, based on data from the TPC).

Marginal Tax Rate Increase by Income

Marginal Tax Rate Increase by Income

Similar to chart #3, I lightened the gridlines, deleted the gradient, and left-justified the title. I also simplified the y-axis labels by getting rid of the dollar signs and the thousands units and deleted the hundreds decimal points in the x-axis labels.

Chart 5: Economic Impact of Fiscal Cliff, Zandi Multipliers (Committee for a Responsible Federal Budget).

Economic Impact of Fiscal Cliff, Zandi Multipliers

Economic Impact of Fiscal Cliff, Zandi Multipliers

I lightened the gridlines and the heavy axis lines, moved the legend onto the graph, and deleted the outline of each column. (The outlines add three redundant elements to show the height of each column.) I also used a single blue color scheme; in the context of the Fiscal Cliff, which is inherently a policy issue, using blue and red invokes images of the two political parties, which is not really an issue in this graph. Importantly, I also sorted the data by the Budget Impact, which gives the graph both a more appealing look and a quantitative basis for ordering the groups.

Chart 6: Sequester Cuts to the FEMA Budget (in millions) (The Washington Post).

Sequester Cuts to the FEMA Budget

Sequester Cuts to the FEMA Budget

Simple changes: deleted the dollar signs; left-aligned the title; reduced size of “(in millions)” in the title; added data labels; and reduced the size of the x-axis labels.

Chart 7: Effect on Fiscal Cliff Components on Deficit and GDP (The Washington Post).

Effect on Fiscal Cliff Components on Deficit and GDP

Effect on Fiscal Cliff Components on Deficit and GDP

This chart probably resulted in the largest number of changes, but all of the decisions were so easy to make: sorted the data from smallest to largest; used shades of a single color scheme instead of the menagerie of original colors; moved the legend to the area between the columns; deleted the gridlines; left-justified the title; and put the Totals to the outside of the two columns instead of at the top, which looked terribly cluttered.

Peltier Tech Chart Utility

Excel, Free Rice and MMA

Today I’m happy to present a guest post by Dr. AnnMaria DeMars. AnnMaria is the intelligent and entertaining author of AnnMaria’s Blog about statistics, technology, and work and also of AnnMaria’s Blog on Judo, Business, and Life, which is about “achieving success in business, sports and academics [and parenting] without ever actually having grown up.”

“If all you have is a hammer, everything looks like a nail.”

Hard to believe Bernard Baruch died before computer graphics, because he hit the – um, nail – right on the head. Some days, it seems the world is comprised of people who fit into one of two categories …

  1. “Find Excel not sufficiently sophisticated for real statistics, don’t you agree?” (Said in the same tone as Thurston Howell III – look it up, youngsters!)
  2. Believe every problem can be solved by Excel.

What’s really interesting is that I fall into both of those categories, not just some of the time but almost every day. Let’s start with last week.  My darling daughter fought for the 135 lb world title on Saturday, and a few weeks ago, she started a free rice group. You can read a bit of the story at RondaMMA Free Rice.  In a nutshell (or should I say grain of rice), this is a wonderful site where you can answer questions and for every one correctly answered, 10 grains of rice are devoted to the world food program. Ronda sent free t-shirts, autographed pictures and other swag to fans who were part of her free rice group. Below is one of three Excel charts I did to track the progress at different points in the competition.

Her fans have donated over 20,000,000 grains of rice so far, and God love them for that, but it is safe to conjecture that the average mixed martial arts fan is not a doctoral student in statistics so my challenge was to come up with easy to follow graphics for tracking the results. Even though I am usually using SAS or SPSS all day, I selected Excel for this chart for a couple of reasons. First, the free rice site allows me to download the group data in a .csv file each day, making it easy for me to open in Excel. Second, it is blissfully easy to insert a picture in an Excel chart.

Rice grains in a rice grains chart

Should you have your own Excel free rice group you want to chart (or anything similar) here are the steps.

1. Create your data. In my case this meant having one column with the dates and a second column with the number donated as of that day.

A.  Each day, I downloaded the csv file that had three columns, a userid (A), a username (B) and number of grains donated (C).  I computed the number for that day by entering into a cell =SUM(C2:CN) where N was however many group members had donated as of that day. This gives me the data for one day.

B. In my master file, type the date, copy the sum from the day’s file and use Paste Special to paste the value only.

2. Format the cells. I went to Format, then Cells, then Number and formatted the cells to have zero decimal places and a comma to separate 1,000.

3. Make the chart: Select the Date and Rice Grains columns. Click Charts and select the first option, which is an area chart.

4. Double-click on your chart and the FORMAT DATA SERIES window shows up. It has an option I have never used because I am a “Serious Academic” (Thurston Howell III accent again) and I don’t work for USA Today and insert pictures in my chart. Until now, when I click on the PICTURE tab and choose insert picture.  I also click on the option to TILE PICTURE AS TEXTURE.

Format Data Series Dialog

Because it looks awfully plain with white rice and a white grid, I double-click on it again and choose FILL this time, changing the background color to a pale orange.

Format Chart Area Dialog

I click on the legend and delete it, because it really is superfluous, add a title, and I have the chart to post to track the contest.

One person asked, “It takes 3,500 grains of rice to fill one bowl. If you raise 1,000,000 grains it’s only fed less than 300 people. How much difference does that make?”

Ronda’s answer was, “If you’re one of those people, it makes a lot of difference to you.”

The Champ

As of now, the group has raised enough rice to give a meal to over 5,500 people who otherwise would have gone hungry.

You can see the fight at Ronda Rousey vs. Miesha Tate – caution: graphic violence.

Peltier Tech Chart Utility

Programming Excel 2007 and Excel 2010 AutoShapes with VBA (Guest Post)

Today I am pleased to present a guest post written by my colleague Nicholas Hebb. Nick appears online frequently, commenting on Excel-related blogs and corresponding in the various forums.

Programming Excel 2007 and Excel 2010 AutoShapes with VBA

By Nicholas Hebb

With the release of Office 2007, Microsoft rewrote the drawing tools from the ground up. On the plus side, this meant new shapes, new styles, and the addition of SmartArt. On the downside, they didn’t have time to incorporate AutoShape operations in their macro recorder prior to release. Thankfully, Excel 2010 has added macro support back, but like all macros the code generated is often bloated and relies heavily on the Selection object, which tends to hide the core objects in use. For the most part, the online help is good, but there is some information that doesn’t get explained in detail. This article attempts to provide a basic overview of working with AutoShapes using VBA and touch on some areas that are not covered extensively in the help documentation.

[Read more...]

Peltier Tech Chart Utility

Peltier Tech Chart Utility

 

Create Excel dashboards quickly with Plug-N-Play reports.

 

Site by: Dawud Miracle, Business Coach & WordPress Websites