Author Archives: Stubborn Mule

Standard variable rate mortgages

The last post looked at the increasing margins on Australian mortgages and small business loans. On the way is another post that tries to estimate how much the banks’ own margins have been increasing. Interesting though that may be, the real problem with Australian mortgages has nothing to do with whether bank margins are or are not going up. The problem is the product itself. This post explains why.

There was an article in the Sydney Morning Herald today which explored exactly this issue, pointing out that Australia’s “standard variable rate” mortgage, which is the most common type of mortgage in Australia, is quite an unusual type of mortgage by international standards.

Banks tend to talk about “standard variable rate” mortgages, but a better term used in the industry is “discretionary variable rate”. The problem with Australian mortgages is encapsulated in that word “discretionary”. I can clearly remember almost 15 years ago trying to explain to European and US investors who were considering buying Australian mortgage-backed securities how a discretionary variable rate mortgage worked. The conversations went something like this:

INVESTOR: So, the bank can change the interest rate whenever they like to whatever they like?

ME: Yes.

INVESTOR: Why would anyone ever accept a mortgage under those terms?

ME: Well, it’s the standard product, so people are used to it and in practice the banks tend to just change the interest rates in line with the Reserve Bank cash rate.

INVESTOR: But they don’t have to do that?

ME: No.

Why were these investors so surprised by these sorts of mortgages? It’s certainly true that in many other countries, such as the US and France, the most common mortgages have fixed rates, but variable rate mortgages are found all over the world too. The difference is that most of these variable rate mortgages are pegged to some kind of indicator rate that the lender cannot control. Sometimes referred to as “tracker” mortgages, these loans specify a fixed margin (say 2%) over a benchmark rate. This benchmark may be a central bank cash rate or some other kind of short-term market rate, but the important point is that, once the loan is written, the margin can never change. In contrast, with Australian mortgages, variable rates move up and down with market interest rates, but banks can also tweak the margin over market rates whenever they see fit.
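To make the contrast concrete, here is a minimal sketch in Python. The 2% margin is the example from the paragraph above; the benchmark levels and the 0.20% margin increase are hypothetical numbers chosen purely for illustration, and the function name is my own.

```python
# Rates are held in basis points (1% = 100 bp) so the arithmetic stays exact.

def borrower_rate_bp(benchmark_bp: int, margin_bp: int) -> int:
    """Rate paid by the borrower: the benchmark rate plus the lender's margin."""
    return benchmark_bp + margin_bp

MARGIN_AT_ORIGINATION_BP = 200  # the "say 2%" margin from the text
BENCHMARK_BEFORE_BP = 450       # hypothetical benchmark of 4.50%
BENCHMARK_AFTER_BP = 475        # hypothetical 0.25% rise in the benchmark

# Tracker mortgage: the margin is locked in when the loan is written.
tracker_before = borrower_rate_bp(BENCHMARK_BEFORE_BP, MARGIN_AT_ORIGINATION_BP)
tracker_after = borrower_rate_bp(BENCHMARK_AFTER_BP, MARGIN_AT_ORIGINATION_BP)

# Discretionary mortgage: the lender can also lift the margin on existing
# loans, here by a hypothetical 20 bp.
discretionary_after = borrower_rate_bp(
    BENCHMARK_AFTER_BP, MARGIN_AT_ORIGINATION_BP + 20
)

print(tracker_after - tracker_before)        # rate change for the tracker borrower
print(discretionary_after - tracker_before)  # rate change for the discretionary borrower
```

Under these assumptions the tracker borrower's rate moves by exactly the benchmark change, while the discretionary borrower wears the benchmark change plus whatever the lender adds to the margin.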

Last year Westpac was pilloried when it tried to use the analogy of a banana smoothie to explain why mortgage rates were rising. It may not have worked for Westpac, but the analogy can help to highlight how strange discretionary variable rate mortgages are. Imagine that the cost of bananas goes up due to a cyclone-induced banana shortage. It may well be that the price of banana smoothies goes up (although it may also be that café owners take a portfolio view of their business, value their customers and absorb a bit of margin compression on their smoothies, but that’s another story). What certainly does not happen is that café owners go around to everyone who has bought a smoothie in the last year, explain to them that bananas are now more expensive and demand that their customer pays a bit more now for last year’s smoothie.

That is essentially what happens with discretionary rate mortgages. You might have taken out a mortgage a few months ago after doing extensive research comparing interest rates and deciding that the best value mortgage you could find was from the Commonwealth Bank as it was 0.1% cheaper than the next best offer (this may or may not have actually been the case). So far Commonwealth Bank is the only bank to have hiked their mortgage rate by 0.2% more than the Reserve Bank and now your “cheap” mortgage is 0.1% more expensive than the bank you turned down. So much for shopping around! Banks may argue that you are free to change to another bank if you are unhappy (although you can expect exit fees, particularly if you received any kind of rate or fee reduction when you first took on the loan). This does not change the fact that it is a rather unusual product that allows the seller to increase their margins after they have done the deal.

This hypothetical example highlights one of the real problems with the discretionary variable rate mortgage. It is inherently anti-competitive. There is little point shopping around for the cheapest mortgage when after next week it may not be the cheapest any more and you are locked in for 25 years. Is it any wonder that most people shrug their shoulders, say that the banks are all as bad as each other, hold their noses and just pick one almost at random?

There is another problem with discretionary variable rate mortgages, as one Mule reader pointed out in an email. They have the surprising effect of creating some credit risk for the borrower. Normally, depositors are exposed to the risk that the bank will fail, while banks are exposed to the risk that the borrower will fail. But, if you take out a discretionary variable rate mortgage, you may end up paying more if the credit quality of the lender deteriorates. The Herald article gave this hypothetical scenario:

Suppose one of our banks got downgraded from a AA to B. What would happen at the moment is they would just increase the margin on their mortgage rates to cover the extra costs they would face, whereas that risk should fall on the management and the shareholders.

But this sort of thing actually has happened! Many of the non-bank lenders like RAMS got into trouble during the global financial crisis and found funding through securitisation difficult, if not downright impossible. Some collapsed or turned to banks for support, but all of them suffered fast rising costs. Many borrowers who took out mortgages with these lenders saw their interest rates go up as a result. Some were able to refinance their mortgages with another lender, but those struggling the most to pay the higher interest rates would also be the ones least able to get refinancing approved.

In my view, abolishing discretionary variable rate mortgages, though unlikely to happen, would be a good thing for the Australian market. There’s certainly no guarantee that margins would drop. But it would change the stakes for banks considering raising rates to preserve their margins. Rather than being able simply to recoup that margin from their existing mortgage book, they would have to seriously consider the impact the move would have on new business, because it would only be new loans that would be paying the higher margins.

Banks, banks, banks

There has been a frenzy of bank bashing in Australia over the last few weeks. The attacks intensified on Tuesday when the Commonwealth Bank decided to raise their standard mortgage rate by 0.45%. As the national broadcaster did not want us to miss, this was almost double the Reserve Bank’s interest rate increase of 0.25%. Politicians have been particularly keen to get into the action, with some peculiar results. One minute shadow treasurer Joe Hockey was pilloried for advocating tighter regulation of banks when supposedly representing the party of free markets, while days later the Commonwealth Bank’s move made him look penetratingly prescient.

Home ownership is a topic close to the hearts of many Australians and it should come as no surprise that, as mortgage rates rise and some borrowers start to experience real financial distress, the actions of banks should come under the spotlight. Unfortunately, very few commentators seem to have a good understanding of how banks operate which means that while there are some good questions being asked (such as why are banks so quick to put the squeeze on the customers who can least afford it while they are turning record profits and paying themselves such generous bonuses), there are also plenty of red herrings cropping up (like the idea that banks are getting a free kick from their offshore borrowing since interest rates are lower overseas).

For a few weeks now I have been contemplating a blog post that attempts to make the mechanics of banking a little clearer. There is too much to fit comfortably in one post, so here are some of the subjects I’ll aim to cover over the next week or so (in no particular order):

  • Are bank funding costs really still going up?
  • If bank lending creates deposits, why do they need to borrow in offshore markets at all?
  • How does offshore funding work and how much does it cost for the banks?
  • Is there a problem with competition in banking in Australia and (if so) what can be done about it?

While I will not get to any of these questions in this post (other than touching on the first), I will give some historical perspective on mortgage rates and other lending rates.

The chart below shows the history of some key interest rates over the last 20 years. The lowest of these is the Reserve Bank cash rate, and coming in at the top is the average rate banks charged small businesses for unsecured loans. Interest rates for small business loans secured by property are somewhat lower. The mortgage rates are based on a simple average of the rates offered by the four major banks on loans for owner-occupiers.

Australian Interest Rates 1990-2010

Since everyone’s eyes have been on changes in mortgage rates compared to the Reserve Bank’s overnight cash rate, here is a chart showing the difference between these two rates. It is not clear yet which (if any) of the other banks will follow the Commonwealth Bank’s lead in raising mortgage rates by 0.2% over the Reserve Bank move, but for the purposes of this chart I have assumed half the banks lift their rates 0.25% and half 0.45%, thereby pushing the average spread up 0.1% to 3%.
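The arithmetic behind that assumption can be sketched in a few lines of Python. The split of two banks at each rate follows from "half the banks" among the four majors; the earlier spread of 2.9% is not stated directly and is simply what the "up 0.1% to 3%" figure implies.

```python
# All figures in basis points (1% = 100 bp).
RBA_MOVE_BP = 25

# Assumed in the text: half of the four major banks lift rates by 0.25%
# and the other half by 0.45%.
bank_moves_bp = [25, 25, 45, 45]

average_move_bp = sum(bank_moves_bp) / len(bank_moves_bp)
spread_change_bp = average_move_bp - RBA_MOVE_BP

NEW_SPREAD_BP = 300  # the 3% average spread quoted in the text
implied_previous_spread_bp = NEW_SPREAD_BP - spread_change_bp

print(average_move_bp, spread_change_bp, implied_previous_spread_bp)
```

The average mortgage rate rises by 0.35%, which is 0.10% more than the cash rate, so the average spread moves from an implied 2.9% to 3%.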

Australian Mortgage Spread to the Cash Rate 1990-2010

This chart provides an interesting historical perspective. As interest rates began to fall in the early 1990s, banks were slow to push through the reductions to borrowers, thereby building up healthy margins. This helped them recover from a rather painful period for Australian banks. Westpac in particular had come close to collapsing in 1992. Then in the mid-90s, aided by securitisation, non-bank lenders like Aussie Home Loans and RAMS introduced new competition to the market, pushing the margins down. Margins were then stable for a number of years. During this period, then treasurer Peter Costello established the political sabre-rattling to keep banks in line, which cemented the idea that mortgage rates should move in lock-step with Reserve Bank cash rate moves. Prior to this, the relationship had not been so stable.

Now, in the wake of the global financial crisis, driven by a combination of increased bank funding costs and the fading of non-bank competitors, the spread to the cash rate has been on the rise once more, although it is yet to reach the levels of the early 1990s. However, as the chart below indicates, small businesses have seen their margins rise even more rapidly. A few commentators have noticed this fact, but most of the indignation of pundits and politicians has been focused on mortgages.

Australian Interest Rate Spread to the Cash Rate 1990-2010

Despite the fact that the link to the cash rate is so well established, the cash rate is not the primary driver of banks’ funding costs. Changes in the rates on bank bills with maturities in the range of 30 to 90 days give a better indication of day-to-day changes in bank funding costs. On top of that, the funding banks source from domestic and international bond markets adds a margin on top of these bill rates. Although there is a high correlation between changes in the Reserve Bank cash rate and bank bill rates, the relationship is not perfect. This means that the spread between lending rates and the 90 day bank bill rate (labelled BB90 in the chart below) provides a better indication of changes in bank margins, although it does not capture increases in bond market margins in the wake of the global financial crisis.

Australian Interest Rate Spread to 90-day Bank Bills 1990-2010

One thing that this chart highlights is that the strong link to the cash rate in fact introduces quite a bit of volatility in bank margins. Over time this volatility averages out, and banks can also use derivatives (primarily “overnight indexed swaps”) to smooth it.

Without taking into account the margins banks face in the bond market, these charts are not enough by themselves to determine whether banks are reasonably passing on rising margins or are simply lining their pockets. That is a question I will return to in a later post.

Data Source: Reserve Bank of Australia.

Thanks to @Magpie for the link to this piece by Christopher Joye which has a detailed discussion of the issue of interest rates for businesses, a topic which generated a lot of discussion in the comments here on this post.

The dangers of prediction

The recent post about Australia’s coal supplies took issue with the convention of quoting coal and other commodity reserves in terms of years remaining at current production levels. The problem is that it is too easy to assume that these figures give a good indication of how long the reserves will actually last, when in fact the chances are they will do nothing of the sort.

In the case of coal, production in Australia has been growing exponentially for some time, while estimated reserves have not changed very much. If this trend continues, the standard “years remaining” figure will overestimate the life of Australian coal reserves. Estimates of other mineral resources, however, have been growing more rapidly than consumption, which means that they may last longer than the standard figures suggest.

Geoscience Australia regularly reports on Australia’s mineral resources. In the 2009 report, there is a table showing economic demonstrated resources (EDR) expressed in the standard “years remaining” format at various points back to 1997. This data highlights the shortcomings of this convention. The chart below illustrates how the figures for a few of the minerals have evolved over time. In each case the dashed line shows the trajectory the “years remaining” figure would take from 1997 if each passing year simply reduced the remaining years by one, so that it falls by 11 years by 2008. This is the path that would be expected if “years remaining” was in fact a reasonable forecast of how long the mineral reserves might last.

EDR Life

Over the space of a mere 10 years, we have gone from having 190 years’ worth of black coal left to only 90 years. This is simply due to the fact that production grew steadily over that time, while reserves did not change very much. In this way the chart gives an alternative perspective on the argument of the earlier post, namely that the 90 year estimate for the life of Australia’s black coal looks optimistic unless production drops or new reserves are discovered at a comparable rate to production growth. Both of these are, of course, possible but the trend does not look encouraging.

The picture is very different for a mineral like nickel, which has managed to extend its remaining life from 55 years to 130 years over the same period. In this case, reserves grew faster than production.

In every case, the minerals quite clearly fail to track a simple year by year remaining life trajectory. Once again the lesson is that it can be misleading to quote mineral reserves in terms of remaining years at current production, without any qualification as to how production or reserve estimates may change over time.
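As a rough check on how far the reported figures stray from the dashed-line path, here is a short Python sketch. It assumes the black coal and nickel figures quoted above are the 1997 and 2008 values, which is my reading of the chart description rather than something the text states exactly; the function name is my own.

```python
def naive_countdown(years_remaining_at_start: int, start_year: int, end_year: int) -> int:
    """Years remaining at end_year if the figure simply fell by one each year."""
    return years_remaining_at_start - (end_year - start_year)

START_YEAR, END_YEAR = 1997, 2008

# (years remaining at the start, years remaining at the end), as quoted in the text
reported = {
    "black coal": (190, 90),
    "nickel": (55, 130),
}

gaps = {}
for mineral, (at_start, at_end) in reported.items():
    expected = naive_countdown(at_start, START_YEAR, END_YEAR)
    gaps[mineral] = at_end - expected
    print(f"{mineral}: countdown path gives {expected}, reported {at_end}, gap {gaps[mineral]:+d}")
```

On these assumptions black coal ends up 89 years below the countdown path and nickel 86 years above it, so neither looks anything like a countdown.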

When will Australia’s coal run out?

Coal exports are a growth industry for Australia. A lot is being invested in infrastructure for coal production and transport to keep this growth going. But how long will this bonanza last? After all, there is only a finite amount of the stuff in the ground.

Earlier this year, the Australian Bureau of Agricultural and Resource Economics (ABARE) released an extensive report on Australia’s energy resources. The chapter on coal included the following observation about black coal:

At the 2008 rate of production of around 490 Mt [mega-tonnes] per year the EDR are adequate to support about 90 years of production.

For those unfamiliar with the jargon of the industry, “EDR” stands for “Economic Demonstrated Resources” which means an estimate of the total amount of coal in the ground that we could feasibly dig up.

Now some of you may already be thinking that 90 years does not sound all that long, but there’s a problem. The authors of the report do not understand exponential growth! The catch is hidden in the apparently innocuous phrase “at the 2008 rate of production”. In other words, to come up with the 90 year figure they are assuming that production levels do not grow at all for the next 90 years. Is that reasonable?

A quick look at coal production over almost 50 years would indicate that it is far from a reasonable assumption.

Australian Coal Production 1961-2008

Even to the untrained eye, a growth trend is evident in this chart, a fact which is confirmed by looking at year-on-year growth, which has averaged around 5% and has only been negative three times over the whole period.

Annual Growth in Australian Coal Production 1961-2008

So, where does the 90 year figure come from? According to the ABARE report, Economic Demonstrated Resources are 39.2 giga-tonnes (Gt). Adding another 8.3 Gt of “Sub-economic Demonstrated Resources”, or SDR (i.e. reserves that are really hard to get), gives an estimated total of 47.5 Gt for Australia’s coal reserves. Now 90 × 490 Mt (the 2008 production rate) gives 44.1 Gt, which is somewhere between EDR and the combined total of EDR and SDR. Presumably the ABARE authors are allowing for the possibility that over time it will become economically feasible to mine some of the coal that is currently classified as sub-economic.
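The figures in this paragraph are easy to check. The sketch below uses only the numbers quoted from the ABARE report; the variable names are my own.

```python
EDR_GT = 39.2                    # Economic Demonstrated Resources
SDR_GT = 8.3                     # Sub-economic Demonstrated Resources
PRODUCTION_2008_GT = 490 / 1000  # 490 Mt per year expressed in Gt

total_gt = EDR_GT + SDR_GT
implied_reserves_gt = 90 * PRODUCTION_2008_GT  # reserves implied by "90 years"

# Years of production at the flat 2008 rate for each reserve measure
years_edr_only = EDR_GT / PRODUCTION_2008_GT
years_edr_plus_sdr = total_gt / PRODUCTION_2008_GT

print(f"EDR + SDR: {total_gt:.1f} Gt")
print(f"Implied by 90 years: {implied_reserves_gt:.1f} Gt")
print(f"EDR alone: {years_edr_only:.0f} years; EDR + SDR: {years_edr_plus_sdr:.0f} years")
```

At the flat 2008 rate, EDR alone would last about 80 years and EDR plus SDR about 97 years, so the 90 year figure sits between the two.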

But there is no way that 2008 production rates will be kept steady for the next 90 years. Apart from anything else, there are plenty of stakeholders in the coal industry doing their best right now to see their export business grow.

To come up with a better estimate of how long the coal might last, rather than assuming zero production growth, I will assume a constant growth rate. While the annual growth rate from 1961 to 2008 averaged 5% per annum, growth has been a little slower more recently. The last 5 years have seen growth average only 3.1% (presumably the global financial crisis did not help). Working with the ABARE estimate that viable coal reserves are 90 times 2008 production levels and assuming 3.1% annual growth in production, the reserves will in fact only last for 43 years! That is less than half the 90 year figure in the ABARE report and it starts to seem like an awfully short period of time. Since the working life of coal-fired power stations is typically around 40 years, this means any new power stations built today could still work out their useful life, but they may be the last ones we build that get to deliver the full value of their potential productivity.

Of course, if the growth rate is higher, the time to deplete the reserves will be lower, as is illustrated in the table below. In fact, if production growth returns to a long-run average of 5%, then reserves would only last 34 years.

Growth Rate    Years Left
5%             34
4%             38
3%             44
2%             51
1%             64
0%             90

Reserves 90 times 2008 production
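For anyone who wants to reproduce these numbers, here is a minimal Python sketch. It assumes production grows continuously at a constant rate r, so that cumulative production after T years is (e^(rT) − 1)/r times the 2008 level. Setting that equal to the reserve multiple R and solving gives T = ln(1 + rR)/r. The continuous-growth form is my assumption, chosen because it matches the figures in the table; the function name is my own.

```python
import math

def years_until_depleted(reserve_multiple: float, growth_rate: float) -> float:
    """Years until reserves run out.

    reserve_multiple: reserves as a multiple of current annual production.
    growth_rate: constant annual production growth (0.05 means 5%),
                 applied continuously (an assumption of this sketch).
    """
    if growth_rate == 0:
        # No growth: reserves last exactly reserve_multiple years.
        return reserve_multiple
    return math.log(1 + growth_rate * reserve_multiple) / growth_rate

for rate in (0.05, 0.04, 0.03, 0.02, 0.01, 0.0):
    print(f"{rate:.0%}  {years_until_depleted(90, rate):.0f}")

# The 3.1% case discussed in the text
print(f"{years_until_depleted(90, 0.031):.0f}")
```

The same function can be called with a different reserve multiple to see how sensitive the answer is to the size of the reserves.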

Optimists may counter that the ABARE estimates of the available reserves might be far too conservative. Perhaps there are coal fields out there just waiting to be discovered. Surely that would give us room to have coal export growth go gangbusters, wouldn’t it? Let’s see. I’ll be generous and assume that coal reserves are in fact twice as big (EDR has not changed much over the last 30 years). Running the figures again assuming reserves total 180 times 2008 production levels still means that with 3.1% annual production growth, the coal will all be gone in 60 years and if growth is 5%, it will only last 46 years.

Growth Rate    Years Left
5%             46
4%             53
3%             62
2%             76
1%             103
0%             180

Reserves 180 times 2008 production

Now it may be the case that climate change will trigger disasters on such a scale that in 40 years’ time we are not too worried about coal production. Nevertheless, these basic calculations mean that some or all of the following must be true.

  • Australian coal is going to run out in around 40 years
  • The coal industry cannot continue to grow at the rate it has done over the last 50 years
  • Australian energy will be turning to coal alternatives sooner than we may expect (with or without a carbon price)
  • There will be a significant expansion in EDR in the future (much greater than we’ve seen over the last 30 years)

If we are going to stretch coal supplies beyond 40 years, what can slow down the need for production? With a price on carbon not looking likely to slow Australian energy consumption in the near future, one possibility would be to reduce the share of coal production that is exported and keep more of it for our own energy needs. After all, the export share has been growing quite rapidly.

Share of Australian coal production exported (1961-2008)

With around 66% going offshore, there is quite a bit that could be clawed back there. But who would dare suggest slowing export growth? Maybe we will just wake up one morning and discover, with a shock, that the coal is all gone and, since it is estimated that Australia has about 6% of the world’s coal reserves, the rest of the world may face the same realisation even sooner.

Data source: ABARE (note that the 2007-08 production figures in this data set look a little lower than the 490 Mt figure quoted in the report because the chart shows saleable coal, which is lower than total coal extracted).

UPDATE: there was initially an error on the export share chart. Thanks to @paulwallbank for pointing it out!

Generate your own Risk Characterization Theatre

In the recent posts Visualizing Smoking Risk and Shades of grey I wrote about the use of “Risk Characterization Theatres” (RCTs) to communicate probabilities. I found the idea in the book The Illusion of Certainty, by Erik Rifkin and Edward Bouwer. Here is how they explain the RCTs:

Most of us are familiar with the crowd in a typical theater as a graphic illustration of a population grouping. It occurred to us that a theater seating chart would be useful for illustrating health benefit and risk information. With a seating capacity of 1,000, our Risk Characterization Theater (RCT) makes it easy to illustrate a number of important values: the number of individuals who would benefit from screening tests, the number of individuals contracting a disease due to a specific cause (e.g., HIV and AIDS), and the merits of published risk factors (e.g., elevated cholesterol, exposure to low levels of environmental contaminants).

As regular readers would know, most of the charts here on the blog are produced using the statistics and graphics tool called R. The RCT graphics were no exception. Writing the code involved painstakingly reproducing Rifkin and Bouwer’s theatre floor plan (as well as a few of my own design, including the stadium). For the benefit of anyone who would like to try generating their own RCTs, I have published the code on github.

RCT (Shaded theatres)

Using the code is straightforward (once you have installed R). Copy the two files plans.Rdata and RCT.R onto your computer. Fire up R and switch to the directory containing the downloaded files. Load the code using the following command:

source("RCT.R")

You will then have a function available called rct which will generate the RCTs. Try the following examples:

rct(18)
rct(18, type="theatre")
rct(18, type="stadium")
rct(c(10, 8, 5))

The rct function has quite a few optional parameters to tweak the appearance of the theatre:

rct(cases, type="square", border="grey", fill=NULL, xlab=NULL, ylab="", lab.cex=1, seed=NULL, label=FALSE, lab.col="grey", draw.plot=TRUE)

  • cases: single number or vector giving the number of seats to shade. If a vector is supplied, the values indicate how many seats of each colour to shade. The sum of this vector gives the total number of seats shaded
  • type: the floor plan to be used. Current options are “square”, “theatre” (the original Rifkin and Bouwer floor plan), “stadium” and “bigsquare”
  • border: the color for the outlines of the floor plan
  • fill: vector of colours for shading seats. If no value is supplied, the default is a sequence of shades of grey
  • xlab: text label to appear below floor plan. Default is “x cases in n”
  • lab.cex: character expansion factor (see ‘par’) to specify size of text labels (if any) on the floor plan
  • seed: specify the starting seed value for the random number generator. Setting this makes it possible to reproduce exactly the same shaded seats on successive calls of rct
  • label: if TRUE, any text labels for the specified floor plan will be displayed
  • lab.col: colour used for any text labels
  • draw.plot: if this is FALSE, the RCT is not drawn and instead a data frame is returned showing the seats that would have been shaded and the colours that would have been used
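To illustrate what the cases and seed parameters are doing conceptually, here is a small Python sketch of random seat selection. It is not a port of RCT.R and the real code on github may well work differently; the function shade_seats and its 1,000-seat default are hypothetical and exist only for this illustration.

```python
import random

def shade_seats(cases, total_seats=1000, seed=None):
    """Choose which seats to shade.

    cases: an int, or a list of ints giving the number of seats per colour.
    Returns a dict mapping seat number -> colour index.
    """
    counts = [cases] if isinstance(cases, int) else list(cases)
    if sum(counts) > total_seats:
        raise ValueError("more cases than seats")

    # A dedicated generator means the same seed always shades the same seats.
    rng = random.Random(seed)
    chosen = rng.sample(range(total_seats), sum(counts))

    shaded = {}
    position = 0
    for colour_index, count in enumerate(counts):
        for seat in chosen[position:position + count]:
            shaded[seat] = colour_index
        position += count
    return shaded

single = shade_seats(18, seed=1)
multi = shade_seats([10, 8, 5], seed=1)
print(len(single), len(multi))
```

Sampling without replacement guarantees that no seat is shaded twice, and splitting a single sample across the colours keeps the total equal to the sum of the vector, as described for the cases parameter.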

Risk Characterization Stadium

Shades of grey

The recent post on the risks of smoking looked at Rifkin and Bouwer’s “Risk Characterization Theatre” (RCT), a graphical device for communicating risks. The graphic in that post, which compared mortality rates of smokers and non-smokers taken from the pioneering British doctors smoking study, highlighted both the strengths and weaknesses of RCTs.

The charts certainly illustrate the risks of smoking in a striking way and seem to elicit a far stronger reaction than drier statistical tables or charts. I also suspect that, for many people, the charts succeed in conveying the relative risks more effectively than more traditional approaches. On the other hand, there is no doubt that RCTs are extremely inefficient. The smoking graphic required an awful lot of ink to represent a mere eight data points.

In the comments on the original post, it was suggested that a colour-coding scheme could be used to combine the charts for the different age ranges, reducing the inefficiency while still preserving the immediacy of the theatre graphic. I took that as a challenge, and here is the result. Returning to the Rifkin and Bouwer theatre floor plan, rather than the more prosaic squares, I have coded deaths in different age ranges with shades of grey: the earlier the death, the darker the grey.

Mortality of doctors born between 1900 and 1930

The risks of smoking still come through clearly in this version of the chart, but the increased efficiency may come at the expense of a potential for confusion.

What do you think?

Natural frequencies

In my last post, I made a passing reference to Gerd Gigerenzer’s idea of using “natural frequencies” instead of probabilities to make assessing risks a little easier. My brief description of the idea did not really do justice to it, so here I will briefly outline an example from Gigerenzer’s book Reckoning With Risk.

The scenario posed is that you are conducting breast cancer screens using mammograms and you are presented with the following information and question about asymptomatic women between 40 and 50 who participate in the screening:

The probability that one of these women has breast cancer is 0.8%. If a woman has breast cancer, the probability is 90% that she will have a positive mammogram. If a woman does not have breast cancer, the probability is 7% that she will still have a positive mammogram. Imagine a woman who has a positive mammogram. What is the probability that she actually has breast cancer?

For those familiar with probability, this is a classic example of a problem that calls for the application of Bayes’ Theorem. However, for many people—not least doctors—it is not an easy question.

Gigerenzer posed exactly this problem to 24 German physicians with an average of 14 years’ professional experience, including radiologists, gynaecologists and dermatologists. By far the most common answer was that there was a 90% chance she had breast cancer and the majority put the odds at 50% or more.

In fact, the correct answer is only 9% (rounding to the nearest %). Only two of the doctors came up with the correct answer, although two others were very close. Overall, a “success” rate of less than 20% is quite striking, particularly given that one would expect doctors to be dealing with these sorts of risk assessments on a regular basis.

Gigerenzer’s hypothesis was that an alternative formulation would make the problem more accessible. So, he posed essentially the same question to a different set of 24 physicians (from a similar range of specialties with similar experience) in the following way:

Eight out of every 1,000 women have breast cancer. Of these 8 women with breast cancer, 7 will have a positive mammogram. Of the remaining 992 women who don’t have breast cancer, some 70 will still have a positive mammogram. Imagine a sample of women who have positive mammograms in screening. How many of these women actually have breast cancer?

Gigerenzer refers to this type of formulation as using “natural frequencies” rather than probabilities. Astute observers will note that there are some rounding differences between this question and the original one (e.g. 70 out of 992 false positives is actually a rate of 7.06% not 7%), but the differences are small.

Now a bit of work has already been done here to help you on the way to the right answer. It’s not too hard to see that there will be 77 positive mammograms (7 true positives plus 70 false positives) and of these only 7 actually have breast cancer. So, the chances of someone in this sample of positive screens actually having cancer is 7/77 = 9% (rounding to the nearest %).
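Both formulations can be checked with a few lines of Python. The first function applies Bayes’ Theorem to the probabilities in the original question; the second just counts cases as in the natural frequency version. The function names are my own.

```python
def posterior_from_probabilities(prevalence, sensitivity, false_positive_rate):
    """P(cancer | positive mammogram) via Bayes' Theorem."""
    true_positive = prevalence * sensitivity
    false_positive = (1 - prevalence) * false_positive_rate
    return true_positive / (true_positive + false_positive)

def posterior_from_frequencies(true_positives, false_positives):
    """Share of positive mammograms that belong to women with cancer."""
    return true_positives / (true_positives + false_positives)

# Probability formulation: 0.8% prevalence, 90% detection, 7% false positives
p_bayes = posterior_from_probabilities(0.008, 0.90, 0.07)

# Natural frequency formulation: 7 true positives and 70 false positives
p_counts = posterior_from_frequencies(7, 70)

print(f"{p_bayes:.1%}  {p_counts:.1%}")
```

The two answers differ slightly in the first decimal place because of the rounding in the natural frequency version, but both round to 9%.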

Needless to say, far more of the doctors who were given this formulation got the right answer. There were still some errors, but this time only 5 of the 24 picked a number over 50% (what were they thinking?).

The lesson is that probability is a powerful but confusing tool and it pays to think carefully about how to frame statements about risk if you want people to draw accurate conclusions.

Visualizing smoking risk

Risk is something many people have a hard time thinking about clearly. Why is that? In his book Risk: The Science and Politics of Fear, subtitled “why we fear the things we shouldn’t–and put ourselves in greater danger”, Dan Gardner surveyed many of the theories that have been used to explain this phenomenon. They range from simple innumeracy, to the influence of the media, or even the psychology of the short-cut “heuristics” (rules of thumb) we all use to make decisions quickly but that can also lead us astray.

In Reckoning With Risk, Gerd Gigerenzer argues that the traditional formulation of probability is particularly unhelpful, making calculations even harder than they should be. Studies have shown that even doctors struggle to handle probabilities correctly when explaining risks associated with illnesses and treatments. Gigerenzer instead proposed expressing risk in terms of “natural frequencies” (e.g. thinking in terms of 8 patients out of 1,000 rather than a 0.8% probability) and tests with general practitioners suggest that this kind of re-framing can be very effective.

The latest book on the subject that I have been reading is The Illusion of Certainty: Health Benefits and Risks by Erik Rifkin and Edward Bouwer. Rifkin and Bouwer are particularly critical of the common practice of reporting medical risks in terms of relative rather than absolute frequencies. When news breaks that a new treatment reduces the risk of dying from condition X by 33%, should you be excited? That depends. This could mean that (absolute) risk of dying from X is currently 15% and the treatment brings this down to 10%. That would be big news. However, if the death rate from X is currently 3 in 10,000 and the treatment brings this down to 2 in 10,000 then the reduction in (relative) risk is still 33% but the news is far less exciting because the absolute risk of 3 in 10,000 is so much lower.
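The two scenarios are easy to compare side by side. This Python sketch uses exact fractions so that the “33%” comes out as exactly one third in both cases; the function name is my own.

```python
from fractions import Fraction

def risk_reduction(risk_before, risk_after):
    """Return (absolute reduction, relative reduction) for a treatment."""
    absolute = risk_before - risk_after
    relative = absolute / risk_before
    return absolute, relative

# Scenario 1: risk falls from 15% to 10%
common_abs, common_rel = risk_reduction(Fraction(15, 100), Fraction(10, 100))

# Scenario 2: risk falls from 3 in 10,000 to 2 in 10,000
rare_abs, rare_rel = risk_reduction(Fraction(3, 10_000), Fraction(2, 10_000))

print(common_rel, rare_rel)   # relative reductions
print(common_abs, rare_abs)   # absolute reductions
print(common_abs / rare_abs)  # how much larger the first absolute reduction is
```

The relative reduction is one third in both scenarios, but the absolute reduction in the first (5 in 100) is 500 times larger than in the second (1 in 10,000).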

In an effort to facilitate the perception of risk, Rifkin and Bouwer devised an interesting graphical device. They note that it is particularly difficult to conceive and compare small risks, say a few cases in 1,000. In thinking about this problem, they came up with the idea of picturing a theatre with 1,000 seats and representing the cases as occupied seats in that theatre. They call the result a “Risk Characterization Theatre” (RCT). Here is an example to illustrate a 2% risk, or 20 cases in 1,000.

Risk Characterization Theatre

Now data visualization purists would be horrified by this picture. In The Visual Display of Quantitative Information, Edward Tufte argues that the “data-ink ratio” should be kept as high as possible, but the RCT uses a lot of ink just to display a single number! Still, I do think that the RCT can be an effective tool and perhaps this can be justified by thinking of it as a way of visualizing numbers rather than data (but maybe that’s a long bow).

Attractive though the theatre layout may be, there is probably no real need for the detail of the aisles, seating sections and labels, so here is a simpler version (again illustrating 20 in 1,000).

Simple Risk Characterization Theatre
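For readers who would like to experiment, here is a rough text-only sketch of the simplified theatre in Python. It is not the code used to draw the charts in this post: the function name `risk_theatre`, the 25 by 40 layout and the random scattering of occupied seats are all assumptions made for illustration.

```python
import random


def risk_theatre(cases, rows=25, seats_per_row=40, seed=0):
    """Return text rows with `cases` occupied seats out of rows * seats_per_row."""
    total = rows * seats_per_row
    # Assumption: occupied seats are scattered at random; a fixed seed keeps
    # the picture reproducible from run to run.
    occupied = set(random.Random(seed).sample(range(total), cases))
    lines = []
    for r in range(rows):
        start = r * seats_per_row
        lines.append("".join(
            "#" if seat in occupied else "."
            for seat in range(start, start + seats_per_row)
        ))
    return lines


# 20 cases in 1,000 seats, i.e. a 2% risk.
print("\n".join(risk_theatre(20)))
```

A larger grid, for example `rows=100` and `seats_per_row=100`, gives 10,000 seats for picturing smaller risks.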

To illustrate the use of RCTs, I’ll use one of the case studies from Rifkin and Bouwer’s book: smoking. One of the most significant studies of the health effects of smoking tracked the mortality of almost 35,000 British doctors (a mix of smokers and non-smokers). The study commenced in 1951 and the first results, published in 1954, indicated a significantly higher incidence of lung cancer among smokers. The study ultimately continued until 2001 and the final results were published in the 2004 paper Mortality in relation to smoking: 50 years’ observations on male British doctors.

The data clearly showed that, on average, smokers died earlier than non-smokers. The chart below would be the traditional way of visualizing this effect*.

Smoking Survival Rates: Survival of doctors born between 1900 and 1930

While it may be clear from this chart that being a smoker is riskier than being a non-smoker, thinking in terms of percentage survival rates may not be intuitive for everyone. Here is how the same data would be illustrated using RCTs. Appropriately, the black squares indicate a death (and for those who prefer the original layout, there is also a theatre version).

Smoking RCTs: Mortality of doctors born between 1900 and 1930

This is a rather striking chart. Particularly looking at the theatres for doctors up to 70 and 80 years old, the higher death rate of smokers is stark. However, the charts also highlight the inefficiency of the RCT. This graphic in fact only shows 8 of the 12 data points on the original charts.

So, the Risk Characterization Theatre is an interesting idea that may be a useful tool for helping to make numbers more concrete, but it is unlikely to be added to the arsenal of the serious data analyst.

As a final twist of the RCT, I have also designed a “Risk Characterization Stadium” which could be used to visualize even lower risks. Here is an illustration of 20 cases in 10,000 (0.2%).

Risk Characterization Stadium

* Note that the figures here differ slightly from those in Rifkin and Bouwer’s book. I have used data for doctors born between 1900 and 1930, whereas they refer to the 1900-1909 data but would in fact appear to have used the 1910-1919 data.

Bubbles to Brains

A couple of weeks ago I ranted about a bubble chart which attempted to illustrate trends in CDO issuance by large investment banks. If circles are a bad choice for depicting data, pictures of brains are even worse, but brains are what the BBC News designers settled on when it came to looking at the countries which have been most successful at winning Nobel prizes.

Nobel Brains - bad chart

There is no doubt that the idea to link Nobel prizes to brains is an appealing one, but comparing the relative sizes of these blobs of grey matter is not easy. In fact, it’s hard to avoid simply reading the numbers rather than looking at the graphics, which rather defeats the purpose of charting the data. A simple league table would have done the same job.

This would come as no surprise to William Cleveland, a statistician who took an experimental approach to understanding the effectiveness of different graphing techniques. In Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods, published jointly with Robert McGill in 1984, Cleveland ranked our ability to judge variation in charts in the following order:

  1. Position along a common scale
  2. Positions along nonaligned scales
  3. Length, direction, angle
  4. Area
  5. Volume, curvature
  6. Shading, color saturation

Furthermore, Cleveland’s experiments used circles rather than brains when area perception was tested, and I suspect brains would fall somewhere between four and five. This perception ranking also points to a better choice of graphic: a simple bar chart, which relies on judgement of length rather than area. Better still, since the bars have a common baseline, comparing them in fact requires judgement of position along a common scale, the easiest of the perception tasks.

Nobel Prize Bar Chart: Top 5 Nobel Prize winning countries from 1901
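One possible illustration of the gap between length and area encodings is the small Python sketch below. It compares how much a bar and a circle have to grow when the underlying value doubles. The function names and the two values are hypothetical and are not real prize counts.

```python
import math


def bar_length(value, scale=1.0):
    """Length encoding: the bar grows in direct proportion to the value."""
    return value * scale


def circle_diameter(value, scale=1.0):
    """Area encoding: the area is proportional to the value, so the
    diameter grows only with the square root of the value."""
    area = value * scale
    return 2 * math.sqrt(area / math.pi)


# Hypothetical values, one twice the other (not real prize counts).
small, large = 50, 100
print(bar_length(large) / bar_length(small))            # 2.0
print(circle_diameter(large) / circle_diameter(small))  # roughly 1.41
```

With an area encoding, a value twice as large produces a shape that is only about 1.4 times as wide, so the difference a reader has to judge is visually compressed compared with a bar.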

The bar chart is much easier to read, but it may seem a little pedestrian to graphic designers excited by the idea of weaving in a brain image. While I am happy with the simple bar chart, sprucing it up with a background image does not interfere very much with the ease of reading the data. Here is an example, although I am sure those more adept at the use of Photoshop (or Gimp in this case) could come up with something better still.

Nobel Prize Bar Chart with Brain

The BBC post includes two more charts, which also have their shortcomings. The pie chart showing just how few women have won Nobel prizes is a particular waste of space. Certainly it is evident from the chart that women have not been awarded very many prizes, but simply stating in words that “the 41 of 806 prizes that went to women represent a mere 5.1%” does an even better job. Of course, the percentage could be added to the chart, but the necessity of adding a lot of numbers to a chart is a sure sign that the chart is not doing its job very well.