# Horrid Stuff

I've prepared this page as a response to the Horrid Stuff blog entry in Kaiser's Junk Charts blog.

Kaiser cites an Economist article, and shows the Economist graphic, which presents a clustered floating bar chart that compares concentrations of two pollutants in selected European cities. Kaiser splits the chart into two floating bar charts, using small multiples rather than clustering of bars to show the relative amounts of the two pollutants. Kaiser's claim is that there is some degree of negative correlation between the two measures of pollution.

I thought there was too much variability in the measured concentrations to make such a claim. I've combined Kaiser's two charts into a single chart, using the X and Y axes for the two pollutants, and drawing a box for each city bracketing the upper and lower stated limits of each. The chart is shown below:

To me, the broad rectangles obscure any possible trend. If we had more information comparing NO2 vs. particulates at the same time, we might see more of a correlation, because each rectangle would be reduced to a string of points running (presumably) in a diagonal direction through the rectangle.

Further analysis ought to investigate contributors to pollution, comparing perhaps the ratio of diesel to gasoline engines, or automotive traffic to industry, or even the types of industry in each region. Analysis of weather patterns and their effects on accumulation or dissipation of different pollutants would further "cloud" the issue.

### Follow-Up

Kaiser has followed up the above analysis with Horrid Stuff 2. He analyzed the midpoints of the pollution ranges (the centroids of the boxes I drew above), and decided that there was a tiny negative correlation (-0.03). I think a correlation that close to zero is the product of randomness, some low-level atomic vibration which prevents us from reaching absolute zero.
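For readers who want to try the midpoint calculation themselves, here is a minimal Python sketch. The city names and ranges are made-up placeholders, not the Economist's figures, and the helper names (`pearson`, `midpoint`, `midpoint_correlation`) are my own for illustration. It reduces each city's box to its centroid and computes the ordinary Pearson correlation of those centroids.

```python
from math import sqrt

# Hypothetical (low, high) concentration ranges per city.
# These are placeholder values, NOT the data from the Economist chart.
ranges = {
    "City A": {"no2": (20.0, 60.0), "pm": (15.0, 45.0)},
    "City B": {"no2": (30.0, 70.0), "pm": (10.0, 40.0)},
    "City C": {"no2": (25.0, 55.0), "pm": (20.0, 50.0)},
    "City D": {"no2": (35.0, 75.0), "pm": (12.0, 38.0)},
}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def midpoint(low_high):
    """Centre of a (low, high) range."""
    low, high = low_high
    return (low + high) / 2

def midpoint_correlation(city_ranges):
    """Correlate the box centroids: NO2 midpoint vs. particulate midpoint."""
    xs = [midpoint(c["no2"]) for c in city_ranges.values()]
    ys = [midpoint(c["pm"]) for c in city_ranges.values()]
    return pearson(xs, ys)

r = midpoint_correlation(ranges)
print(f"correlation of midpoints: {r:.2f}")
```

Note that this calculation throws away the width of each range, which is exactly the variability I was worried about.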

But I had a different idea. What if we assume a correlation, and plot the corresponding diagonals of the boxes I drew before? The two charts below show the positive and negative diagonals of the pollutant ranges. If I had seen either of these charts, I would have been sold on a definite correlation, and would have attributed the differences between cities to such noise factors as weather patterns, geographical influences, or variations in local pollution sources. I'm sure a t-test on the slopes in either case (H0: slope = 0, tested against slope > 0 or slope < 0) would show that the correlations were significant.
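The diagonal trick can be sketched in a few lines of Python. The boxes below are made-up placeholder ranges, not the Economist's data, and `diagonal_points` and `slope_and_t` are hypothetical helper names. The sketch pools the two endpoints of each box's diagonal, fits an ordinary least-squares line, and reports the slope along with its t statistic (slope divided by its standard error).

```python
from math import sqrt

# Hypothetical boxes: ((x_low, x_high), (y_low, y_high)) per city.
# Placeholder values for illustration only, NOT the published ranges.
boxes = [
    ((20.0, 60.0), (15.0, 45.0)),
    ((30.0, 70.0), (10.0, 40.0)),
    ((25.0, 55.0), (20.0, 50.0)),
    ((35.0, 75.0), (12.0, 38.0)),
]

def diagonal_points(city_boxes, positive=True):
    """Return the two endpoints of each box's chosen diagonal."""
    points = []
    for (x_lo, x_hi), (y_lo, y_hi) in city_boxes:
        if positive:
            # Lower-left to upper-right corner.
            points += [(x_lo, y_lo), (x_hi, y_hi)]
        else:
            # Upper-left to lower-right corner.
            points += [(x_lo, y_hi), (x_hi, y_lo)]
    return points

def slope_and_t(points):
    """Least-squares slope and its t statistic (slope / standard error)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in points)
    std_err = sqrt(sse / (n - 2) / sxx)
    return slope, slope / std_err

pos_slope, pos_t = slope_and_t(diagonal_points(boxes, positive=True))
neg_slope, neg_t = slope_and_t(diagonal_points(boxes, positive=False))
print(f"positive diagonals: slope={pos_slope:.2f}, t={pos_t:.2f}")
print(f"negative diagonals: slope={neg_slope:.2f}, t={neg_t:.2f}")
```

The same boxes produce a slope of either sign depending only on which corners are picked. The endpoints are constructed rather than measured, so the t statistic here says nothing about the real data; it only shows how persuasive an assumed trend can look.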

In fact, there is not likely to be a causal relationship between these two pollutants. Their levels at any given time must vary according to complicated nonlinear relationships with seasonal and regional variations in what's causing the pollution (industry and transportation) and what's intensifying or alleviating it (weather and geography).