## Tax the Rich, or Deceptive Axis Scales

In “Daily Chart: Tax the Rich to Pay For Health Care?”, Conor Clarke responded to a proposal to pay for health care reform by taxing the rich. He plotted the variation in the effective Federal tax rate paid by the top 1% of households, to put into perspective the effect of adding a few percentage points to the taxes of these high earners. I have downloaded the data from the Congressional Budget Office web site and reproduced Clarke’s chart below.

In “Stupid Chart of the Day”, James Joyner pointed out that Clarke’s chart starts from 28%, not from 0%, and is therefore deceiving. (I don’t think it’s really that stupid a chart, but maybe Joyner is going for more pageviews.) In “Charts can be Deceiving”, E.D. Kain followed up on this observation and produced his own chart of the data; I have reproduced Kain’s chart below.

So which chart is right, and which is deceiving? Well, both plot the correct data. But both have built-in flaws that distort the reader’s interpretation.

Using an axis that starts from zero is important in bar (and column) charts, because the visible length of the bars is what the reader sees and relates to the values. If the bars are truncated, their true length cannot be known, and the user is misled. Since the region under the line of an area chart is shaded, it implies that the area is the important encoding feature in the chart. Thus, starting the axis above zero is misleading for area charts as well as for bar charts.

I don’t think Clarke was attempting to mislead. When you enter his data into Excel and draw an area chart, Excel decides to start the axis at 28% by default, and Clarke simply made no attempt to change the default axis scale. (He did adjust the default fill by introducing a gradient, but the judges are not concerned with artistic impression here.)

And what of Kain’s chart? How is that chart deceptive? Let’s ignore the irrelevant two decimal digits he has added to his Y axis labels. Kain has started his Y axis at 0%, according to best practices for bar (and area) charts. He has made the Y axis maximum 100%, however, which compresses the data into the bottom portion of the chart. The variability in the data is dwarfed by the magnitude of the Y axis.

In any case, a line (or XY/scatter) chart has no implicit requirement to start at zero, since the position of the data points is what encodes their value. The benefit of using a line chart is that you can set the Y axis scale so that it spans from a little below the lowest data point to a little above the highest.

The 28% default minimum assigned by Excel isn’t even far enough from zero. I’ve used a minimum of 30% and a maximum of 37% on the Y axis of my line chart. This shows the steady decline in tax rate since the mid-1990s, but in no way implies that the tax rate now is 1/6 of its peak value.
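For readers who would rather set the scale in code than in the Format Axis dialog, here is a minimal sketch. The procedure name and its arguments are my own invention for illustration, and it assumes the tax rates are stored as fractions (0.30 for 30%); if your data are stored as whole percentages, pass 30 and 37 instead.

```vba
Sub SetValueAxisScale(cht As Chart, ByVal dMin As Double, ByVal dMax As Double)
    ' Hypothetical helper: fixes the value (Y) axis of a chart to a chosen range
    ' Assigning MinimumScale or MaximumScale turns off the automatic setting
    If dMin >= dMax Then
        MsgBox "The minimum must be less than the maximum.", vbExclamation
        Exit Sub
    End If
    With cht.Axes(xlValue)
        .MinimumScale = dMin
        .MaximumScale = dMax
    End With
End Sub
```

With a chart selected, a call such as `SetValueAxisScale ActiveChart, 0.3, 0.37` from the Immediate window would apply the 30% to 37% range described above.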

## Integer Values on Line Chart Category Axis

There has been a lot of discussion here lately about XY and Line charts.

Many of these posts have described differences between XY and Line charts. One difference that’s been mentioned but not examined is that an XY axis recognizes the complete numerical value of its X data, while a line chart only recognizes whole number (integer) categories. Even if you use a Date-Scale X axis in a line chart, the dates are treated as categories, and you can only plot a point on a date, not on a time between midnight and midnight.

To investigate the behavior further, let’s look at fractional data, that is, dates with times. This data set contains a few points per day (midnight, noon, 6pm; not uniformly spaced) over several days.
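The arithmetic behind this can be sketched in a few lines of VBA. Excel stores a date-time as a serial number whose whole part is the day and whose fractional part is the time of day. The function below is a hypothetical illustration, not something Excel exposes, and it assumes the line chart simply drops the time portion, consistent with the behavior described above.

```vba
Function DateScaleCategory(ByVal dtValue As Date) As Long
    ' Whole-number part of the date serial: the calendar day, with the time dropped
    ' Assumption: this mimics how a date-scale line chart axis positions a point
    DateScaleCategory = Int(CDbl(dtValue))
End Function

Sub CompareXPositions()
    Dim dt As Date
    dt = DateSerial(2009, 7, 15) + TimeSerial(18, 0, 0)   ' 6 pm on an arbitrary day
    ' An XY chart uses the full value, fraction included
    Debug.Print "XY chart X value:    "; CDbl(dt)
    ' A line chart can only distinguish the day
    Debug.Print "Line chart category: "; DateScaleCategory(dt)
End Sub
```

Under this assumption, the midnight, noon, and 6pm points of a given day all land on the same category, which is why they stack vertically on a line chart but spread out along the X axis of an XY chart.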

## Order of Points in XY and Line Charts

There has been a lot of discussion here lately about XY and Line charts.

One interesting thing about a line chart with a date scale X axis is the order of the plotted points. Consider a data set like this, in which the points are out of chronological order.

The line chart internally sorts the data by date, and connects the points in this order, while the XY chart connects them in the order they appear in the sheet.

Here are the same charts as above, with a label on each point indicating its position in the series. Points 1 and 3 are the same in each chart, but the other points have different index values depending on the chart type, because the points were out of date order in the worksheet.
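To see why the index values differ, it may help to work out the two orders by hand. The function below is only a sketch of the idea: the name is hypothetical, and it assumes the line chart sorts the points in ascending date order and leaves ties in worksheet order. It returns the worksheet positions of the points in the order a date-scale line chart would connect them; an XY chart would simply connect positions 1, 2, 3, and so on.

```vba
Function DateOrder(vDates As Variant) As Variant
    ' Hypothetical helper: returns the 1-based worksheet positions of the
    ' points, arranged in ascending date order (insertion sort, stable)
    Dim n As Long, i As Long, j As Long, iHold As Long, iBase As Long
    Dim idx() As Long
    iBase = LBound(vDates)
    n = UBound(vDates) - iBase + 1
    ReDim idx(1 To n)
    For i = 1 To n
        idx(i) = i          ' start in worksheet order
    Next
    For i = 2 To n
        iHold = idx(i)
        j = i - 1
        Do While j >= 1
            ' stop when the earlier entry is not later than the one being placed
            If vDates(iBase + idx(j) - 1) <= vDates(iBase + iHold - 1) Then Exit Do
            idx(j + 1) = idx(j)
            j = j - 1
        Loop
        idx(j + 1) = iHold
    Next
    DateOrder = idx
End Function
```

For example, with made-up dates entered in the order January 1, January 4, January 2, January 3, the function returns positions 1, 3, 4, 2: the line chart would connect the first worksheet point, then the third, then the fourth, and finally the second.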

I used the following simple procedure to number the points in a series. Select a series, then press Alt+F8 to open the Macros dialog (or reach it through Tools menu > Macro > Macros). Select the macro name from the list, and click Run.

```vba
Sub NumberThePointsInASeries()
    ' Label each point of the selected series with its index in the series
    Dim iPt As Long
    If TypeName(Selection) = "Series" Then
        For iPt = 1 To Selection.Points.Count
            With Selection.Points(iPt)
                ' Ignore errors from any point that cannot be labeled
                On Error Resume Next
                .HasDataLabel = True
                .DataLabel.Characters.Text = CStr(iPt)
                On Error GoTo 0
            End With
        Next
    Else
        MsgBox "Select a series and try again.", vbExclamation
    End If
End Sub
```

## Line Charts vs. XY Charts

In Line-XY Combination Charts I showed how to make a combination Line-XY chart. It is probably important to discuss the differences between line charts and XY charts. The documentation is not clear, and the names of the chart types are not helpful; in fact, they lead to confusion.

The icons in the Chart Type dialog and in the Chart Wizard do not help to clarify the situation. These are the icons for line charts and XY charts in the Excel 2003 dialogs; the Excel 2007 icons are not substantially different.

It is important to note that:

• The formatting options of XY chart series and line chart series are identical.
• If you want markers connected by lines, you DO NOT have to use a line chart type.
• If you want markers without connecting lines, you DO NOT have to use an XY chart type.
• XY and line charts treat X data differently and thus have different X axis styles.
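The second and third points can be illustrated with a short sketch. These two procedures are hypothetical helpers of my own, not built-in commands; they change only the formatting of a series and leave its chart type alone, so they should work the same way on a line series or an XY series. They use the `Border` and `MarkerStyle` properties that the Excel 2003 macro recorder produces; I have not checked every later version.

```vba
Sub HideConnectingLines(srs As Series)
    ' Markers without connecting lines, whatever the chart type
    srs.Border.LineStyle = xlNone
    ' Make sure something remains visible once the line is gone
    If srs.MarkerStyle = xlMarkerStyleNone Then
        srs.MarkerStyle = xlMarkerStyleAutomatic
    End If
End Sub

Sub ShowConnectingLines(srs As Series)
    ' Markers connected by lines, whatever the chart type
    srs.Border.LineStyle = xlContinuous
End Sub
```

For instance, `HideConnectingLines ActiveChart.SeriesCollection(1)` would turn the first series of the selected chart into markers only without converting it to an XY series.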

After selection of a chart type, the user is presented with a set of choices, and the default sub-type for the selected chart type is highlighted. Not all available formatting for these chart types is available through the Chart Type dialog.