Junk Charts #4 – Puns are dangerous

Design guru Edward Tufte famously lambasted pie charts in The Visual Display of Quantitative Information and went on to say

the only worse design than a pie chart is several of them

While pie charts do have their defenders, the basis for the contempt in which pie charts are held by Tufte and others is that the human eye is far better at differentiating position and length than angle and area.

Circular CDOs

So, I was a little disappointed when a correspondent drew my attention to this rather bubbly chart, which appeared in an article by the excellent team at Pro-Publica (click on the chart to see a larger version).

Pro-Publica is an independent, not-for-profit newsroom that specialises in investigative journalism. They have collaborated with the team at Planet Money (one of my favourite podcasts), and have perhaps delved deeper than any other journalists into the arcane world of CDOs, a topic I have touched on a few times here on the Stubborn Mule.

The chart, attributed to Thetica Systems, was used to accompany an article by Pro-Publica exposing the fact that, in their words,

Over the last two years of the housing bubble, Wall Street bankers perpetrated one of the greatest episodes of self-dealing in financial history.

It is a fascinating story, but it would seem that Thetica’s graphics department was carried away with a visual pun on the title of Pro-Publica’s post “Circular CDOs” when they chose to use circles to depict the growth in CDO recycling from 2005 to 2007. It might look pretty, but the circles make it much harder to discern the trend and to compare the four banks. Pro-Publica’s article deserves better.

In the tradition of my junk chart posts, I have produced an alternative visualization of the same data. I am sure that graphic designers could improve on the colour-scheme, but this simple lattice of line charts makes for a much clearer view of the data.

CDO Self-Dealing by investment banks (2005-2007)

If this post has given you a taste for de-junking charts, you should also visit the Junk Charts blog for much, much more.

More Informality

Yesterday’s post on informal votes generated a lot of questions, both on and off the blog. One commenter was interested in understanding why there was so much variability in informal votes in New South Wales. It is a good question, and one I do not have an answer to. Presumably demographic differences across electorates (such as varying facility with reading English among non-native speakers) would come into play. But this still leaves open the question as to why the swing in informal votes varies so much across New South Wales. I will have to leave it to you to explore: the table below has the informal vote in all 48 New South Wales seats for your perusal. Let me know if you have any theories!

Division ID   Division          Informal (%)   Informal Swing (%)
127           Kingsford Smith   8.23           2.92
137           North Sydney      4.62           0.9
135           New England       3.6            0.63

An email correspondent asked whether it was in fact the 2007 election that was anomalous rather than the 2010 election, so I have also compared the 2010 informal vote to the 2004 election. Interestingly, the uptick in informal votes from 2004 to 2010 is indeed smaller. In fact, Western Australia had a lower rate of informal votes in this election than in 2004. New South Wales still shows significant increases in informal votes in a number of electorates, which helps drive a national trend. Overall, compared to 2004 there does still seem to be something going on with informal votes, but the effect is certainly less marked.

Informal Votes: 2010 vs 2004

I also received various questions about whether correlations could be seen between informal votes and Green votes, whether the increase in informal votes was greater in more marginal seats and so on. Unfortunately, as yet my data mining has not revealed anything of substance. Here, for example, is the increase in the rate of informal votes versus the absolute two-party preferred margin. The regression lines show no simple relationship.

Informal Vote versus Two-Party Preferred Margin
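For anyone who wants to run the same kind of check on the AEC data, here is a minimal sketch of an ordinary least-squares fit together with a correlation coefficient. It is written in Python rather than the R used for the charts on this blog, the function name `least_squares` is made up for illustration, and the numbers are toy values, not AEC figures.

```python
from math import sqrt


def least_squares(xs, ys):
    """Fit y = intercept + slope * x and return (slope, intercept, r).

    r is the Pearson correlation coefficient between xs and ys.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Toy values only (NOT AEC data): x plays the role of the absolute
# two-party preferred margin, y the swing in the informal vote.
margins = [1.0, 2.0, 3.0, 4.0]
informal_swings = [1.0, 2.0, 2.0, 1.0]

slope, intercept, r = least_squares(margins, informal_swings)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, r={r:.2f}")
```

A slope and correlation near zero, as in this toy case, is the numerical counterpart of a regression line that shows no simple relationship.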

Comparing Green votes to informal votes is just as unenlightening. That, at least, seems to make sense. While it is reasonable to consider some of the Green vote as a protest vote and some of the informal votes likewise as a protest vote, it may be that in some electorates more voters were inclined to protest by voting Green than informal, or vice versa. This would mean that there would be negligible correlation between the Green and informal swings at the division level.

So, despite my efforts, I am yet to squeeze further insight from the data. Of course I remain open to further suggestions! If you would like to do your own analysis, the current 2010 data is available from the AEC as is past data.

UPDATE: If you sort the table at the top by informal vote, you’ll see that the two electorates with the lowest rates of informal voting were New England and Lyne, the seats of the independents Tony Windsor and Rob Oakeshott respectively!

Also, here is a national table of informal votes (just to avoid being too NSW-centric).

Dress: Informal

While Australia still waits to see which party will manage to scrape into power, the Australian Electoral Commission (AEC) has announced an investigation into the unusually high rate of informal votes. Veteran ABC analyst Antony Green observed that the rate of informal votes was the highest since 1984. Some are attributing the rise to the “Latham effect” following the exhortation by former Labor leader, now professional provocateur, Mark Latham that voters should spoil their ballots to thumb their noses at both major parties.

It will be interesting to see what conclusions the AEC draws, but there is no doubt that the informal votes in this election were significant. There are more votes to be counted and the trends in postal votes may differ somewhat from votes cast in person, but enough of the votes are in to get a reasonable picture of what has been going on. The figures here are based on the AEC data for the House of Representatives as at 23 August 2010. Informal votes rose in every state from the rate seen in the 2007 election, increasing by between 1.0 and 2.4 percentage points.

State 2007 2010 Change

Informal Votes by State (%)

One way to visualize the changes is to plot the informal vote rate in 2010 against that of 2007. The chart below does this at a state level and also adds in a 45 degree line. Points falling above this line (as they all do) show an increase from 2007 to 2010, while points below the line would indicate a decrease.

Informal Votes by State
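The logic of reading a chart against a 45 degree line can be spelled out in a few lines of Python. This is only a sketch: the function name `classify_against_diagonal` and the rates are invented for illustration and are not the AEC figures.

```python
def classify_against_diagonal(rate_2007, rate_2010):
    """Say where the point (rate_2007, rate_2010) sits relative to the line y = x."""
    if rate_2010 > rate_2007:
        return "above"   # informal rate increased
    if rate_2010 < rate_2007:
        return "below"   # informal rate decreased
    return "on"          # no change


# Hypothetical (2007 rate, 2010 rate) pairs in percent, NOT real data.
toy_rates = {
    "Region X": (3.0, 4.5),
    "Region Y": (5.0, 4.8),
    "Region Z": (2.5, 2.5),
}

positions = {
    name: classify_against_diagonal(r2007, r2010)
    for name, (r2007, r2010) in toy_rates.items()
}
for name, position in positions.items():
    print(f"{name}: {position} the 45 degree line")
```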

Aggregating to a state level hides a lot of the interesting detail and can be misleading. For example, the ACT shows the biggest increase in informal votes, but with only two electorates, these figures have less statistical value. A more interesting picture emerges when the changes are shown by division. The chart below groups the changes by state, but plots points for each division*. Once again, 45 degree lines provide a guide as to whether informal voting rates increased or decreased.

Informals by Division (State and National)

Leaping out from this picture is the extraordinarily high rate of informal votes in some divisions in New South Wales. It is also striking that the rate of informal votes has increased in almost every division. At this point, only four divisions in the whole country (one in Victoria and three in New South Wales) have seen the rate of informal votes drop.

It is hard to escape the conclusion that the increase in informal votes reflects a protest vote arising from deep voter dissatisfaction with both major parties. The Greens are pleased with the “Greenslide” they have experienced, but some of their success is likely to amount to the same voter protest, only expressed another way, rather than a permanent shift in commitment to the Greens.

* For the purists, there were changes to electorates between elections, and the chart only shows divisions which existed in both 2007 and 2010. Given changes to boundaries, some of these electorates are, strictly speaking, no longer perfectly comparable, but they have been plotted regardless.

Recognise this?

Last night I was watching the Chaser’s Yes We Canberra (only a day late), and jumped out of my chair when I saw Craig Reucassel corner Tony Abbott to challenge him about his obsession with reducing Government debt. Have a look at this to see why!

Here is the post referred to in the video.

UPDATE: here’s a tweet from Craig on the topic of attribution (or lack thereof):

Infrastructure Bonds

With Australia’s Federal election looming, the opposition has today proudly announced a new policy to fund infrastructure without actually increasing Government debt! What are we to make of this?

It’s hard to determine the details from a media announcement, but based on the text posted by Peter Martin on his blog, it would seem that the idea is to provide tax incentives for entities other than the Federal Government to borrow to fund infrastructure:

Private infrastructure operators and State and Local Governments will be eligible to apply for the concessional treatment.

The way the scheme would seem to work is that eligible projects could issue bonds and investors would receive a tax rebate amounting to 10% of the interest on the bond. So, if you received a $100 interest payment and your earnings put you in the top marginal tax bracket, you would pay $45 in tax. Under this scheme, you would only pay $35 in tax.
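The arithmetic is simple enough to sketch in Python. This assumes my reading of the announcement is right (the rebate is a reduction in tax payable equal to 10% of the interest received) and uses the 45% top marginal rate implied by the example. The function name is purely illustrative.

```python
def tax_on_interest(interest, marginal_rate=0.45, rebate_rate=0.10):
    """Return (tax without the scheme, tax with the scheme) on an interest payment.

    Assumes the rebate simply reduces tax payable by rebate_rate * interest.
    """
    tax_before = interest * marginal_rate
    rebate = interest * rebate_rate
    return tax_before, tax_before - rebate


before, after = tax_on_interest(100.0)
print(f"Tax without the rebate: ${before:.2f}")
print(f"Tax with the rebate:    ${after:.2f}")
```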

So, the cost to the Federal Government would simply be forgone tax revenue (and this would be capped at $150 million per annum) and the Opposition believes that the program could support up to $20 billion in infrastructure financing. Presumably, investors currently buying plasma TVs would rush to buy these bonds instead.
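As a rough sanity check on those two headline numbers, here is a back-of-the-envelope sketch in Python. It assumes the 10% rebate reading above, that the cost to the budget is exactly the rebate paid, and that the $150 million cap would be fully used when $20 billion of bonds are on issue. None of that is spelled out in the announcement, so treat the result as indicative only.

```python
ANNUAL_COST_CAP = 150e6        # cap on forgone tax revenue per annum (from the announcement)
REBATE_RATE = 0.10             # rebate as a share of bond interest (my reading of the scheme)
FINANCING_SUPPORTED = 20e9     # financing the Opposition says the program could support

# Interest payments that would exhaust the cap each year.
max_rebated_interest = ANNUAL_COST_CAP / REBATE_RATE

# Average interest rate on the bonds implied by the two headline figures.
implied_interest_rate = max_rebated_interest / FINANCING_SUPPORTED

print(f"Interest covered by the cap: ${max_rebated_interest / 1e9:.1f} billion per annum")
print(f"Implied average bond yield:  {implied_interest_rate:.1%}")
```

Under these assumptions the two figures are consistent with an average interest rate of about 7.5% on the bonds.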

Seems like a neat trick, but I have a number of reservations about the scheme.

First, I have argued in the past that the near-hysterical concern about Government debt is overdone. For a start, Government debt in Australia is far lower than in other developed countries around the world. More importantly, the facile analogy that compares Government finance to that of a household budget does not stand up for one very important reason: unlike you or me, the Government is the monopoly issuer of Australian dollars. This changes the game and breaks the analogy utterly.

Second, the opposition’s policy would still involve raising significant amounts of debt, just not issued by the Federal Government. If that debt is all incurred instead by State Governments, should that really be a cause for celebration? After all, unlike the Commonwealth, State Governments do not control issuance of currency, so they really could go bankrupt and indeed, recent history has shown that many of the State Governments are loath to increase their debt levels too significantly for fear of having their credit rating downgraded. What if the borrowers are in the private sector? Well, that would be worse still! Back in March I updated my chart showing private and government sector debt. The debt level we should all be worried about in Australia is private sector debt, which is far higher than government sector debt.

History of Government and Private Sector Debt levels

Third, infrastructure bonds have form. Back in the 90s, the then Labor government introduced an infrastructure bond scheme which also featured tax incentives. Of course, it did not take long for clever investment bankers to work out how to surgically isolate the tax benefit so that wealthy individuals could take advantage of the concession without actually taking on any investment risk. In the end, the whole scheme was shut down, although some of the transactions that were done still survive today. I would expect exactly the same thing to happen with this policy. Any special tax treatment is always a red rag to the tax expert bull.

So, it may sound clever, but to me it does not seem to be sound policy.

Broadband Poll

As a follow up to our guest post on the numbers behind Labor’s broadband policy, here is a quick poll to see whose policy you prefer. Let us know what you think!

The Mule goes SURFing

A month ago I posted about “SURF”, the newly-established Sydney R user forum (R being an excellent open-source statistics tool). Shortly after publishing that post, I attended the inaugural forum meeting.

While we waited for attendees to arrive, a few people introduced themselves, explaining why they were interested in R and how much experience they had with the system. I was surprised at the diversity of backgrounds represented: there was someone from the department of immigration, a few from various areas within the health-care industry, a group from the Australian Copyright Council (I think I’ve got that right—it was certainly something to do with copyright), a few from finance, some academics and even someone from the office of state revenue.

Of the 30 or so people who came to the meeting, many classed themselves as beginners when it came to R (although most had experience with other systems, such as SAS). So if there’s anyone out there who was toying with the idea of signing up but hesitated out of concern that they know nothing about R, do not fear. You will not be alone.

The forum organizer, Eugene Dubossarsky, proceeded to give an overview of the recent growth in R’s popularity and also gave a live demo of how quickly and easily you can get R installed and running. Since there were so many beginners, Eugene suggested that a few of the more experienced users could act as mentors to those interested in learning more about R. As someone who has used R for over 10 years, I volunteered my services. So feel free to ask me any and all of your R questions!

As well as being a volunteer mentor, I will have the pleasure of being the presenter at the next forum meeting on the 18th of August. Regular readers of the Stubborn Mule will not be surprised to learn that the topic I have chosen is The Power of Graphics in R. Here’s the overview of what I will be talking about:

In addition to its statistical computing prowess, R is one of the most sophisticated and flexible tools around for visualizing quantitative data. It can produce a wide variety of chart types, including scatter plots, box plots, dot plots, mosaic plots, 3D charts and more. Tweaking chart settings and adding customized annotations is a breeze and the charts can readily be output to a range of formats including images (jpeg or png), PDF and metafile formats.

Topics covered in this talk include:

  • Getting started with graphing in R
  • The basic charting types available
  • Customising charts (labels, axes, colour, annotations and more)
  • Managing different output formats
  • A look at the more advanced charting packages: lattice and ggplot2

Anyone who ever has a need to visualize their data, whether simply for exploration or for producing slick graphics for reports and presentations can benefit from learning to use R’s graphics features. The material presented here will get you well on your way. If you have ever been frustrated when trying to get charts in Excel to behave themselves, you will never look back once you switch to R.

For those of you in Sydney who are interested in a glimpse of how I use R to produce the charts you see here on the blog, feel free to come along. I hope to see you there!

The Art of Conversation

Have you ever heard the question “Would you like a tea or a coffee” answered with a simple “Yes”? If so, the respondent almost certainly considers their response to be extremely witty. The questioner is unlikely to agree. There is also a high probability that the joker is someone’s Dad…or perhaps a mathematician.

I have to admit to having indulged in this “joke” in my time (more than once), but until recently it had not occurred to me that it in fact reflects a violation of a general principle of conversation. Enlightenment came when I read the seminal 1975 paper “Logic and Conversation” [1] by the philosopher H. P. Grice.

The humour (or lack thereof) of the coffee/tea gag lies in the conflict between the logical truth of the statement and its inappropriateness in conversation. While the statement “A or B” is logically true as long as at least one of A and B is true, in the context of conversation, logical truth is not enough. If you knew A was true and B was false, you would not bother saying “A or B”, you would just say “A”. Moreover, that is what others would expect of you. If I ask you to pass me a hammer, I don’t expect you to pass me a hammer and a spanner. In the same way, if you know you are going to Spain for your holidays, I don’t expect you to say “I’m either going to Spain or Canada”, despite the fact that, strictly speaking, it is a true statement. It is this distinction between simple logical truth and appropriateness in conversation that is the subject of Grice’s paper.
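For the mathematically inclined, the joker’s reasoning can be sketched in a few lines of Python. The function names are invented for illustration. The point is only that an inclusive “or” collapses two pieces of information into a single truth value.

```python
from itertools import product


def logical_or(a: bool, b: bool) -> bool:
    """Inclusive or: true as long as at least one of a and b is true."""
    return a or b


# Truth table for "A or B": only the (False, False) row is false.
truth_table = {
    (a, b): logical_or(a, b) for a, b in product([True, False], repeat=2)
}
for (a, b), value in truth_table.items():
    print(f"A={a!s:5} B={b!s:5} -> A or B = {value}")


def literal_answer(wants_tea: bool, wants_coffee: bool) -> str:
    """The joker's reply to 'Would you like a tea or a coffee?'."""
    return "Yes" if logical_or(wants_tea, wants_coffee) else "No"


# Logically correct in all three cases, and in none of them does the
# questioner learn which drink to make.
print(literal_answer(True, False))
print(literal_answer(False, True))
print(literal_answer(True, True))
```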

Grice bases his ideas on the notion of the “Cooperative Principle”, which he summarises as the requirement to

Make your conversational contribution such as is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange in which you are engaged.

People have conversations of many types for many reasons: to do business, to gossip, to seduce, to educate, to inform or simply for the pleasure of conversation itself. In every case, conversation involves (at least) two participants and the conversations that work best are the ones that take the needs of all of the participants into account. So it makes sense that a bit of cooperation is the foundation of a good conversation.

Based on the cooperative principle, Grice goes on to postulate a number of “maxims of conversation”. Here are the maxims as he describes them:


Quantity

  1. Make your contribution as informative as is required (for the current purposes of the exchange).
  2. Do not make your contribution more informative than is required.

Quality

  1. Do not say what you believe to be false.
  2. Do not say that for which you lack adequate evidence.

Relation

  1. Be relevant.

Manner

  1. Avoid obscurity of expression.
  2. Avoid ambiguity.
  3. Be brief (avoid unnecessary prolixity).
  4. Be orderly.

The term “maxim” is carefully chosen as Grice notes that one need not follow all of the maxims at all times, while still being cooperative. The main reason that a maxim could be violated is if it is in conflict with another maxim. An example would be providing less information than required (violating Quantity 1) because you are not confident you have the facts right (and you don’t want to violate Quality 2).

Viewed in terms of Grice’s maxims, the coffee/tea joke is a clear violation of the first maxim of quantity.

As I have already admitted to this particular breach, the obvious question is: have I violated any other maxims? Some who know me well would take the view that, while I may take pains to avoid a violation of either of the maxims of quality, I regularly and flagrantly violate Quantity 2 and Manner 3 and probably Relation 1. I need to learn to stick to the point or risk being branded an uncooperative conversationalist! Or perhaps it’s too late.

[1] Available in the collection “Studies in the Way of Words” by H. P. Grice.

Emissions League Tables

Yesterday’s Sydney Morning Herald featured an opinion piece by Rodney Tiffen on Australia’s sluggish response to climate change. Deliberately provocative, the discussion was framed from the outset in the language of competition:

An international competition in self-righteousness would be closely fought. But Australia must be a strong contender.

Tiffen went on to draw on data from the International Energy Agency (IEA), but got his statistics slightly wrong in the process:

If we restrict the analysis to the most populous 130 countries, those with a population of 3.5 million or more, Australia is the world leader. Only a handful of small countries, especially oil producers such as Bahrain, Qatar and Kuwait, have higher per person emissions.

Australians may be disappointed to learn that we do not, in fact, take home the trophy in this competition. Both the United Arab Emirates and the United States have populations over 3.5 million and have higher per capita emissions than Australia at last count (2007). Nevertheless, coming in third place in this competition, Australia certainly punches above its weight, with per capita emissions running at 4.3 times the world average. Furthermore, as the chart below shows, we have been steadily catching up to the United States over the last 40 years. In fact, to give Tiffen the benefit of the doubt, the most recent IEA data is for 2007, so we may well be ahead of the USA by now.

CO2 emissions 1971-2007 (Source: IEA)

The reason Tiffen looks at per capita emissions is to ward off one common argument for inaction on climate change, namely that China and the United States are the only countries that can make a difference. There is no doubt that these two countries dominate the overall production of emissions. Throwing Canada and Mexico in with the United States brings North American emissions to almost one quarter of the world’s total. Add China and almost half the world’s emissions are accounted for.

Total CO2 emissions for 2007 (Source: IEA)

Nevertheless, if the aim is to attempt reductions in world emissions, Tiffen’s focus on per capita emissions is entirely appropriate. No-one would be convinced if the United States viewed its emissions along State lines, thereby arguing that their emissions were not so big by global standards after all (although, this defence would probably not be much use to California). While countries may be actors on the world stage through their political proxies at climate conferences, emissions are ultimately the product of people (both at home and at work) and not countries. Ranking countries by per capita emissions is thus useful as it gives some indication of where emission reductions may be more readily achieved. The chart below shows the top 25 (big and small) countries in terms of per capita emissions.

Top 25 per capita emitters for 2007 (Source: IEA)

Qatar ranks so high on this scale that it compresses the figures for all of the emitters below it, so here is the chart again with a somewhat truncated scale.

Top 25 per capita emitters for 2007 (Source: IEA)

There are certainly some small countries with high rates of carbon emissions per capita, but looking at a larger scale reveals that developed countries are the worst in per capita terms. It is worth noting, though, that Europe is doing better than the rest of the OECD and is also ahead of former members of the Soviet Union.

Per capita emissions by region for 2007 (Source: IEA)

Another useful approach is to consider emissions per dollar of economic output. This serves to highlight “inefficient” emitters, not to shame them but to identify where spending money on the problem is most likely to deliver significant results. It should come as no surprise that a league table of the highest emitters per dollar of gross domestic product (GDP) is a catalogue of troubled and/or small nations. Note that these figures are calculated based on conversion to US dollars using market exchange rates. Using purchasing power parity instead does reorder the list somewhat, but the names are largely the same.

Top 25 emitters by emissions/GDP for 2007 (Source: IEA)

This perspective suggests that when developed countries consider programs to assist developing countries to reduce their emissions, they could reasonably focus on significant but inefficient emitters. The chart below provides a possible target list, showing the 10 worst-performing countries in terms of emissions per dollar of economic output after restricting to countries with emissions of at least 150 million tons of CO2 per annum.
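The filter-then-rank step behind a list like this is straightforward. Here is a minimal Python sketch, in which the record type, the function name and every number are hypothetical stand-ins rather than IEA data.

```python
from typing import NamedTuple


class CountryRecord(NamedTuple):
    name: str
    emissions_mt: float   # annual CO2 emissions, million tons
    gdp_busd: float       # GDP, billions of US dollars at market exchange rates


def worst_large_emitters(records, min_emissions_mt=150.0, top_n=10):
    """Keep countries at or above the emissions threshold, then rank by emissions/GDP."""
    large = [r for r in records if r.emissions_mt >= min_emissions_mt]
    ranked = sorted(large, key=lambda r: r.emissions_mt / r.gdp_busd, reverse=True)
    return ranked[:top_n]


# Toy records only (NOT IEA figures).
toy = [
    CountryRecord("Country A", 400.0, 200.0),    # intensity 2.0
    CountryRecord("Country B", 100.0, 10.0),     # intensity 10.0, but below the threshold
    CountryRecord("Country C", 1000.0, 2000.0),  # intensity 0.5
    CountryRecord("Country D", 150.0, 50.0),     # intensity 3.0
]

names = [r.name for r in worst_large_emitters(toy)]
print(names)
```

Note that the small but very inefficient “Country B” drops out entirely, which is exactly the effect of restricting the list to significant emitters.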

Top 10 large emitters by emissions/GDP for 2007 (Source: IEA)

When will Julia go to the polls?

After taking Kevin Rudd’s scalp and now having done a deal with the miners, Australia’s new prime minister, Julia Gillard, is widely expected to call an early poll. The question is, when will the election be held?

As usual, my first inclination is to dig into the historical data. Looking at all of the Federal elections since Federation, December is far and away the most popular month for a poll. Although the election does not even have to be held this year, December is sufficiently far into the future that it fails to qualify as an early election. Unless the bounce Gillard has experienced in opinion polls proves to be extraordinarily short-lived, we should be looking at a somewhat earlier date. Interestingly, both July and August have only seen one election. On the admittedly spurious grounds of historical precedent, September would be a better bet.

Australian Federal Elections by month

But what of other sources of information? At the time of writing, the shortest odds from SportingBet were on August 7. In my own rather modest poll, August is also proving the most tipped month (it’s not too late to vote in the poll…just make your selection in the form below). No-one has voted for a date in July and I am inclined to agree that that is really a bit soon. Nevertheless SportingBet is still showing odds (admittedly long ones) for 31st July.

In a bid for contrarian status, I will diverge from both the bookies and voters in my poll and will tip a September election. But which date? History is not much help there. Of the four September elections in the past, there has been one on the 1st, 3rd, 4th and 5th Saturday of the month (1914, 1934, 1940 and 1946 respectively). So, I will veer as close as possible to the people’s choice of August, while still tipping September and predict that the election will be on Saturday 4th September. In choosing that date, I have not been swayed by the fact that the fourth Saturday of the month has been the most popular historically, other than to nominate 28th August as my fall-back selection.
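For those who would rather not count Saturdays on a calendar, here is a small Python sketch that does it using only the standard library. The helper name `nth_saturday` is my own invention.

```python
import calendar
from datetime import date, timedelta


def nth_saturday(year: int, month: int, n: int) -> date:
    """Return the date of the n-th Saturday (n = 1, 2, ...) of the given month."""
    first = date(year, month, 1)
    # Days from the 1st of the month to the first Saturday (0 if the 1st is one).
    offset = (calendar.SATURDAY - first.weekday()) % 7
    candidate = first + timedelta(days=offset + 7 * (n - 1))
    if candidate.month != month:
        raise ValueError(f"{year}-{month:02d} has no Saturday number {n}")
    return candidate


tip = nth_saturday(2010, 9, 1)        # my September tip
fall_back = nth_saturday(2010, 8, 4)  # my fall-back selection
print(tip.isoformat(), fall_back.isoformat())
```

So the 4th of September is the first Saturday of its month, while the 28th of August is the fourth Saturday of its month.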

Since I will most likely be wrong and you probably disagree with me, make sure to vote!