January 08, 2005

Medal Predictions Part I - Debunking Bernard & Busse

The last couple of weeks I have been thinking about the question of China. As they emerge as a sporting power, are they going to overwhelm the rest of the world? In other words, how good can China be?

I am going to get to that question, eventually, but not today. My investigation has led me down some other paths, though, one of which I thought was worth sharing. Earlier this summer you may have seen some news stories about a pair of American economists who were making predictions of Olympic medal totals. An example from the popular financial press can be found at CNN Money.

If you just read the news articles, it sounds pretty impressive; a couple of eggheads can accurately predict national medal totals using just a few simple statistics, and without knowing anything about sports. Wow. I figured that this was going to make it easy to answer my original question; all I would have to do was collect some economic forecasts for China's future, and presto, out would pop a predicted medal total. If only it were so easy.

The two economists in question, Andrew Bernard and Meghan Busse, have published this peer-reviewed article, including a description of their methods and their 2000 predictions. Bernard also has this "executive summary" of their 2004 predictions on his web site.

I'm not an economist, but I have trudged through the article, and I think I get most of it. The authors have attempted, using the last 30 years of Olympic medal totals, to create a model that will predict Olympic medal totals based on:

  • Population
  • Per-capita gross domestic product (GDP)
  • Host effect (whether a country is a host country or not)
  • "Sport economy" type (e.g. Soviet-bloc management)
  • History of Olympic performance

The predictive power of the resulting model, as noted in the press, is actually fairly impressive. It appears that the dead-on predictions mentioned in the CNN Money article were actually wrong, due to an error in the implementation of the model, but even when the model is corrected the results are still good enough to be interesting.

Unfortunately, the predictions of the model are almost completely dominated by the last factor in my list, the history of Olympic performance. As Bernard and Busse note in their article, without the medal history, "the overall predictive power of the current model is lacking."

Let me illustrate how much history dominates the predictions. I decided to do some predictions of my own, to compare with the Bernard and Busse model. My model is called the "status quo" model, and it has one assumption: that every country will win the same fraction of the total medals that it won last time. Since basically the same number of medals were awarded in 2004 as in 2000, this makes the prediction especially easy. Just look up the number from the 2000 medal table, and you've got your 2004 prediction. How did I do?

Figure 1: Total medal predictions by Bernard & Busse and by the status quo model, with actual totals, for the 2004 Olympics.

Figure 2: Total medal predictions by Bernard & Busse and by the status quo model, with actual totals, for the 2000 Olympics.

Figure 1 shows my "status quo" prediction, Bernard & Busse's "economic factors" prediction, and the actual totals for 2004. As you can see, the two predictions are about the same. I actually do better at predicting Greece, which apparently didn't get much "host effect"; we both underestimated Japan and Spain. Statistically speaking, my prediction was better overall, with a root-mean-square (r.m.s.) error of 5.0 medals versus 5.3 for Bernard & Busse. Figure 2 shows the exact same thing for 2000, using 1996 as the status quo; in this case, I scaled the 1996 totals by the fractional change in the total number of medals awarded (927/842). For 2000, Bernard & Busse did slightly better than my dumb model (6.4 r.m.s. error versus 6.5).
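
For anyone who wants to try this at home, here is a rough Python sketch of the two calculations above: the status quo prediction (hold each country's share of the medals fixed and rescale to the new total, which is what the 927/842 factor does for 2000) and the r.m.s. error. The function names and the three-country medal table are made up for illustration and are not real Olympic results. I am also assuming the error is averaged over every country in the comparison, which may not match the exact set of countries behind the numbers quoted above.

```python
import math


def status_quo_prediction(previous_medals, next_total):
    """Give each country the same share of the medals it won last time,
    rescaled to the number of medals awarded at the next Games."""
    previous_total = sum(previous_medals.values())
    return {
        country: count / previous_total * next_total
        for country, count in previous_medals.items()
    }


def rms_error(predicted, actual):
    """Root-mean-square difference between predicted and actual medal
    counts, taken over the countries present in both tables."""
    countries = sorted(predicted.keys() & actual.keys())
    if not countries:
        raise ValueError("no countries in common")
    squared = [(predicted[c] - actual[c]) ** 2 for c in countries]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy medal tables, NOT real Olympic results.
previous_games = {"Country A": 50, "Country B": 30, "Country C": 20}
actual_next_games = {"Country A": 58, "Country B": 29, "Country C": 23}

# Rescale last time's shares to the number of medals awarded this time.
predicted = status_quo_prediction(
    previous_games, next_total=sum(actual_next_games.values())
)

print(predicted)
print(round(rms_error(predicted, actual_next_games), 2))
```

When the medal total barely changes between Games, as between 2000 and 2004, the rescaling factor is close to 1 and the prediction is just last time's medal table.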

It's not that the Bernard & Busse model isn't a good predictor, because it actually does fine. It's just that the predictions aren't very illuminating. Essentially, we can conclude that a country wins lots of Olympic medals because it has won lots of Olympic medals before.

It would be fascinating to find a model that can give you a rough prediction of how many medals a country will win based on a small number of truly independent indicators. This model isn't it. To be fair, Bernard & Busse have identified a positive correlation between medal performance and some economic factors, which is interesting. But by their own admission, these correlations aren't strong enough, by themselves, to lead to useful predictions.


