In Can You Improve this Graph Showing Suicide Rates in Japan?, Nathan of FlowingData shows a chart of long-term unemployment rates and suicide rates in Japan. The chart comes from Suicide Epidemic in Japan.
What’s wrong with this chart (which I’ve reproduced below)?
- Axis label is faulty. “Japan and Suicide Rate” should be “Japan Long Term Unemployment and Suicide Rates”.
- Rates are not defined. Suicide Rate is number of suicides per 100,000 population. Long Term Unemployment Rate is percentage of total unemployed who have been unemployed for over twelve months.
- Frequency of data is different. Using markers as in my version, it is clear that suicide rates are reported with much less frequency than long term unemployment. At least they didn’t use smoothed lines*.
- Such unrelated rates should not be plotted on the same axis. This chart makes it look like the rates were about the same in 1980, then unemployment dropped below suicides, then it rose above suicides, then both remained the same from about 1992 through 2000.
*Formatting the chart with smoothed lines and no markers gives the reader no indication that the data is reported on a different time scale, and may make it look as though the correlation is even closer.
The first step is to address the overlapping scales. The chart above using one scale is misleading. I can add a secondary axis, and adjust the relative scales of the two series to mislead the reader in any way I want.
The only way to make the scales free from confusion is to plot the series on completely different, non-overlapping scales. I’ve split the chart into two panels to show the series separately on their own scales. It actually looks like the rise in suicides is leading the rise in long term unemployment. If I thought higher unemployment led to increased suicides, I’d expect unemployment to lead.
If we’re looking for a relationship between the two variables, we should plot them on an XY chart. I’ve labeled the points with year, so that one can trace the evolution of the relationship. If I do the math, I get a correlation (R²) of 0.60, not a very strong relationship. By eye, I can see two regions in the chart.
- Up to 1995, when both suicides and unemployment were low, there is a very good negative relationship between suicides and unemployment. Is the number of suicides great enough to reduce the number of long term unemployed?
- From 2000 onward, there seems to be a constant suicide rate despite the large increase in long-term unemployment.
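For readers who want to repeat the R² calculation on their own paired values, here is a minimal sketch in plain Python. The function name `r_squared` and the sample numbers are my own illustration; the sample numbers are made up and are not the Japanese data plotted in the chart.

```python
def r_squared(xs, ys):
    """Return the squared Pearson correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return sxy * sxy / (sxx * syy)


# Made-up illustrative values, NOT the Japanese data from the chart.
example_x = [1.0, 2.0, 3.0, 4.0]
example_y = [1.0, 3.0, 2.0, 4.0]
print(round(r_squared(example_x, example_y), 2))  # 0.64
```

With only six paired points, a single year can move this number a lot, which is one more reason to read the XY chart by eye as well.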
Six data points isn’t much to base a hypothesis on, so I made a (probably invalid) attempt to rectify this, by interpolating suicide rates for the in-between years with no data. The chart looks very much like the one above; I’ve used filled markers to denote actual data, and unfilled for interpolated points.
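The interpolation can be sketched as follows, assuming straight-line interpolation between consecutive reported years (the post does not say which method was used). The function name `interpolate_years` and the sample values are hypothetical. The third item in each tuple flags actual data, which corresponds to the filled versus unfilled markers.

```python
def interpolate_years(known):
    """Fill in every year between the first and last reported year.

    known: dict mapping year (int) -> reported value.
    Returns a list of (year, value, is_actual) tuples in year order.
    """
    if not known:
        raise ValueError("need at least one reported year")
    years = sorted(known)
    filled = []
    for y0, y1 in zip(years, years[1:]):
        v0, v1 = known[y0], known[y1]
        for year in range(y0, y1):
            # Fraction of the way from one reported year to the next
            frac = (year - y0) / (y1 - y0)
            filled.append((year, v0 + frac * (v1 - v0), year == y0))
    last = years[-1]
    filled.append((last, known[last], True))
    return filled


# Made-up values for illustration only, NOT the reported suicide rates.
for row in interpolate_years({1980: 10.0, 1982: 14.0}):
    print(row)
```

Interpolated points carry no new information, so any statistic computed from them will overstate how much data supports the relationship.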
The correlation plot looks pretty much the same, a bit more convoluted. I did not bother calculating correlation coefficients, as that would stretch the validity of this exercise beyond my comfort zone.
derek says
I can add a secondary axis, and adjust the relative scales of the two series to mislead the reader in any way I want.
That is the problem with secondary axes. If this data set had a better correlation, you could use the linear least-squares fit to assign a non-arbitrary set of values to the primary and secondary axes, such that the means coincided and the scales were in proportion to the slope of the trend line.
As well as being non-arbitrary (so you can’t be accused of cherry picking), this has the extra advantage of being the scaling that produces the strongest visual impression of correlation. It is, after all, the fit with the least square deviation. So it’s a deceiver’s dream! Fortunately, the scalings also look really suspicious with their odd decimal fractions.
Kaiser Fung of Junk Charts describes a similar technique in “The eyeball test”.
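One way to read the scaling derek describes is to fit secondary = intercept + slope × primary by least squares, then push the primary axis limits through that line to get the secondary axis limits. The sketch below follows that reading; the function names and sample values are my own and hypothetical, not derek's code.

```python
def least_squares_fit(xs, ys):
    """Return (intercept, slope) of the ordinary least-squares line y = a + b*x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values are all identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def secondary_axis_limits(primary_values, secondary_values, primary_limits):
    """Map the primary axis (min, max) through the fitted line."""
    intercept, slope = least_squares_fit(primary_values, secondary_values)
    lo, hi = primary_limits
    return intercept + slope * lo, intercept + slope * hi


# Made-up values for illustration only.
print(secondary_axis_limits([1, 2, 3], [3, 5, 7], (0, 4)))  # (1.0, 9.0)
```

Because the fitted line passes through the point of means, the two series' means land at the same height on the chart. If the slope is negative, the returned limits come out reversed, which means the secondary axis would have to be plotted in reverse order.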
Juan Orozco says
Good analysis on the Japan unemployment vs. suicide rate. One test I normally run is the F test. The R2 just says what % of the error (sum of squared errors) are explained by the regression. But we may be biased to reject lower R2’s (or the underlying regression) in large datasets where, in fact, they should be considered. Or accept R2’s in small datasets, when they should be rejected. A better test is the F-test. It answers “what are the chances of this relationship being a random behavior”. As we know, in statistics, we can reject or accept a relationship, depending on our level of confidence. So if the F test returns 99%, and your confidence level is 95%, you can accept there is a linear relationship.
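For a regression with one predictor, the F statistic follows directly from R² and the number of points: F = R² / ((1 − R²) / (n − 2)). The sketch below applies this to the article's figures, assuming the R² of 0.60 was computed from the six paired points. The function name is hypothetical, and the critical value is taken from a standard F table for (1, 4) degrees of freedom at the 5% level.

```python
def f_statistic(r2, n, predictors=1):
    """Overall F statistic of a linear regression from its R-squared.

    r2: coefficient of determination, 0 <= r2 < 1
    n: number of observations
    predictors: number of explanatory variables (1 for a simple XY fit)
    """
    if not 0 <= r2 < 1:
        raise ValueError("r2 must be in [0, 1)")
    df_resid = n - predictors - 1
    if df_resid <= 0:
        raise ValueError("not enough observations for this many predictors")
    return (r2 / predictors) / ((1 - r2) / df_resid)


# Standard F-table value for (1, 4) degrees of freedom at the 5% level.
F_CRIT_5PCT_1_4 = 7.71

# Assumption: R² = 0.60 came from the six paired points in the XY chart.
f_value = f_statistic(0.60, 6)
print(round(f_value, 2), f_value > F_CRIT_5PCT_1_4)  # 6.0 False
```

Under that assumption F is about 6.0, below the 7.71 threshold, so the relationship would not pass at the 95% confidence level.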
Andrew Grimes JFP, JSCCP says
I am a JSCCP clinical psychologist and JFP psychotherapist working in Japan for over 20 years. I would like to put forward a perspective on some of the main reasons behind the unacceptably high suicide numbers in Japan.
Mental health professionals in Japan have long known that the unnecessarily high suicide rate in Japan is due to unemployment, bankruptcies, and the increasing levels of stress on businessmen and other salaried workers, who have suffered enormous hardship since the bursting of the stock market bubble here that peaked around 1997. Until that year Japan had annual suicide figures of between 22,000 and 24,000. Following the bursting of the stock market bubble and the long term economic downturn that has followed, the number of suicides in 1998 increased by around 35%, and since 1998 the number of people killing themselves each year in Japan has consistently remained well over 30,000 to the present day.
The current worldwide recession is of course impacting Japan too, so unless very proactive and well-funded local and nationwide suicide prevention programs and initiatives are immediately implemented, it is very difficult to foresee the government’s previously stated intention to reduce the suicide rate to around 23,000 by the year 2016 being achievable. On the contrary, the numbers, and the human suffering, depression and misery that the people who become part of these numbers have to endure, may well stay at the current levels that have persisted here for the last ten years. It could even get worse unless even more is done to prevent this terrible loss of life.
The current numbers of licensed psychiatrists (around 13,000), Japan Society of Certified Clinical Psychologists clinical psychologists (16,732 as of 2007), and Psychiatric Social Workers (39,108 as of 2009) must indeed be increased. In order for professional mental health counseling and psychotherapy services for depression and other mental illnesses to be covered by public health insurance, it would seem advisable that positive action be taken to resume and complete the negotiations on how to achieve national licensing for clinical psychologists in Japan through the Ministry of Health, Labour and Welfare, and not just the Ministry of Education as is the current situation. These discussions were ongoing between all concerned mental health professional authorities at the select committee and ministerial levels during the Koizumi administration. With the current economic recession adding even more hardship and stress to the lives of its citizens, now would seem to be a prime opportunity for the responsible Japanese authorities to take a proactive approach to finally providing government approval for national licensing for clinical psychologists who provide mental health care counseling and psychotherapy services to the people of Japan.
During these last ten years of relentlessly high annual suicide numbers, the English media seems in the main to have done little more than have someone go through the files and do a story on the so-called suicide forest, or on internet suicide clubs and copycat suicides (whether by cheap heating fuel like charcoal briquettes or even cheaper household cleaning chemicals), without focusing on the bigger picture and the need for effective action and solutions. Economic hardship, bankruptcies and unemployment have been the main cause of suicide in Japan over the last 10 years, as the well detailed reports behind the suicide numbers, issued every year until now by the National Police Agency in Japan, show only too clearly, if any journalist is prepared to learn Japanese or get a bilingual researcher to do the research to get to the real heart of the tragic story of the long term and unnecessarily high suicide rate problem in Japan.
Useful telephone number for Japanese residents of Japan who speak Japanese and are feeling depressed or suicidal: Inochi no Denwa (Lifeline Telephone Service):
Japan: 0120-738-556 Tokyo: 3264 4343
Andrew Grimes
Tokyo Counseling Services
http://tokyocounseling.com/english/
http://tokyocounseling.com/jp/
http://www.counselingjapan.com
Jon Peltier says
Andrew –
Thank you for sharing your unique perspective. In addition to the economic factors, are there societal factors underlying these high rates? Things like, perhaps, shame for losing one’s job, shame for needing psychiatric help, and “saving face” through suicide? These are things westerners have heard of, but we don’t know the extent of their truthfulness and impact on the situation.
Jonathan says
Hi, I’m trying to produce a panel chart with market size in the lower panel and market growth (year on year) in the top panel. It’s all fine bar the custom number format for the left hand side y-axis. I have the scale set from 0 to 10,000 and want to display numbers from 0 to 4,500 in this format. Trouble is when I use this custom number format:
[4500]#,,;#,##0;
I get what I want except that instead of 0, it displays as: 0,000. If I use this formula instead:
[4500]#,,;
I get the right format for the numbers including 0, but all the 4 digit numbers collapse and display across 3 lines.
Any thoughts? (I’m actually using microsoft graph 2003 in powerpoint if that helps).
Thanks
Jonathan says
Actually I think I have it now using this formula:
[4500]#,,;0
more by luck than judgement and help from a colleague! Thanks anyway; keep on charting you chart god, you!
Jon Peltier says
Maybe [<=4500]#,##0;;;
Helen says
I’m very excited to create a panel chart with different scales using your tutorial. However, I’m having a problem and I wonder if you know the answer.
I’ve stepped through your example multiple times. It works great until I get to the step of moving the first new A axis series, specifically: “The new XY series is totally misaligned from the line chart series….. double click on the series, and on the Axis tab, choose primary.”
When I do this step, I do not get the result your tutorial shows. Instead, the chart is reformatted so that all of the A, B and C Plot series are oriented vertically along the y-axis (at x=0). When I look at the x-axis range, it has been reformatted to a huge min and max range (1900-1987). When I correct the min to 1/3/2007, the XY axis series data is correctly aligned, but all of the A, B, and C plot data disappears.
Do you know the solution to this problem?
Thanks for all of your help and tutorials!