Monday, 2 June 2014

Why the World Cup will/won't be predicted by computer modelling.


You may have recently heard about attempts to predict the results of the upcoming World Cup in a scientific way. Although it's not mentioned explicitly, the calculations by Stephen Hawking and Goldman Sachs (GS) are the results of statistical modelling. Fortunately for me, this ties in well with a blog post I already wanted to write about just this subject.

The online betting company Paddy Power has employed Stephen Hawking for a month to calculate the probability that England will win the World Cup. GS has taken on the more general aim of evaluating the overall outcome of the World Cup. I heard Peter Oppenheimer of GS interviewed last Thursday morning by John Humphrys, and the interview was a really good example of the problems with the public perception of probability and statistics. The interview should be available for the next few days at least here, and the bit I'm referring to was slightly before 7 am if you're trying to pinpoint it.

The interview consisted largely of Peter Oppenheimer explaining the details of their model, which include the past goal-scoring history of each team (as might be expected). Some more subtle analysis included the under- or over-performance of those teams at previous World Cup tournaments, as well as home vs away games. Hawking's analysis allowed for more intricate inputs, such as the height above sea level at which a match is played. However, Hawking was approaching the problem from a different perspective, analysing the prospects of a single team, while GS were modelling the entire competition and the relative placings of every team.
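To make the idea of turning goal-scoring history into a match prediction a little more concrete, here is a minimal sketch. It is not the GS model or Hawking's, neither of which was described in that much detail. It assumes that each team's goals in a match follow an independent Poisson distribution around its average goals per game, which is a common textbook simplification. The function names and the two scoring rates are made up for illustration.

```python
import math

def poisson_pmf(k, rate):
    """Probability of exactly k goals when goals arrive at the given average rate."""
    return math.exp(-rate) * rate**k / math.factorial(k)

def match_probabilities(rate_a, rate_b, max_goals=10):
    """Return (P(A wins), P(draw), P(B wins)) for independent Poisson goal counts.

    Scorelines above max_goals per team are ignored, so the three numbers
    sum to very slightly less than 1.
    """
    win_a = draw = win_b = 0.0
    for goals_a in range(max_goals + 1):
        for goals_b in range(max_goals + 1):
            p = poisson_pmf(goals_a, rate_a) * poisson_pmf(goals_b, rate_b)
            if goals_a > goals_b:
                win_a += p
            elif goals_a == goals_b:
                draw += p
            else:
                win_b += p
    return win_a, draw, win_b

# Hypothetical average goals per game, invented for illustration only.
win_a, draw, win_b = match_probabilities(rate_a=2.0, rate_b=1.0)
print(f"A wins: {win_a:.2f}, draw: {draw:.2f}, B wins: {win_b:.2f}")
```

Even this toy version gives the weaker team a real chance of winning, which is the sort of answer a probabilistic model is supposed to give.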

  What I found particularly blog-worthy about the Radio 4 interview was the attitude of John Humphrys. Humphrys and his Radio 4 colleague Melvyn Bragg are exceptionally intelligent men, and yet they are often dismissive of vital aspects of the scientific method. I am dragging Bragg into this because I have heard several science-themed episodes of In Our Time in which he happily confesses his ignorance of maths and science to his guests. Such a confession would be very poorly received if it related to a knowledge of British history, for example. However, numerical and scientific theorems do not appear to warrant the same level of esteem.

  I digress. In the interview Humphrys was incredulous about Oppenheimer's World Cup predictions. Notably, he said something like 'I don't even follow football, yet I can probably tell you the four teams that will end up in the semi-finals' (apologies if this is horribly paraphrased; I am unable to listen to the interview again at the moment), implying perhaps that the work by GS was worthless as it only told us something that a layman could guess anyway. There are two important points here. Firstly, I think that this misses the point entirely. The phrase that springs to mind is 'when you do things right, people won't be sure you've done anything at all' (if anyone can trace this quote back further than the Futurama link I've pasted, please let me know!). Humphrys's statement actually shows us that probability has an intuitive element, one that isn't always evident, especially when it comes to some of the more esoteric results of probability theory, such as the still-argued-over Monty Hall problem. If a layman can predict the four teams to reach the semi-finals of the 2014 World Cup, why scoff at attempts to do the same thing in a numerical, analytical manner? Why does John think he can predict the semi-finalists? Because he is aware that Brazil, Germany, Argentina and Spain are probably the best teams in the world, even if he gained this awareness through osmosis, something very easy to do, at least in the UK. Why are these teams the best (or perceived to be so)? Because they win a lot, meaning that they have good, measurable goal tallies and goal differences: exactly the kind of variable that is input into the GS model. Even I, as a complete football luddite, know that Brazil are extremely likely to beat the U.S. at 'soccer' (no offence, U.S.). This is because I have been brought up with images of Pelé as the messiah of football, while there's only a grudging willingness to acknowledge the participation of the U.S. in the same sport because, you know, at least they give it a go.

  The second point I want to make is that I think Humphrys misunderstands the language of probability that Oppenheimer is using. I think that this represents a fundamental lack of public understanding of probability. Scientists understand that very few hypotheses can ever truly be ruled out completely, and so they may appear vague or uncertain about their findings. This allows scientific theories to be cast as 'doubtful' when they are, in fact, remarkably certain.

  We (or at least John Humphrys) seem to have some inbuilt desire for our models to be 'deterministic', meaning that there will be an exact, predictable outcome. The alternative being presented by GS is a 'probabilistic' result. The difference between these two interpretations is rather philosophical and so somewhat loosely defined, which is probably (heh) why it's not easily digested by the public at large. A glib explanation of this difference comes from the Terry Pratchett book 'The Colour of Magic', in which a character flips several coins. The deterministic philosophy would lead us to expect that half of the flips land on heads while half land on tails. In the book, what happens is that four of the coins land on their edges while another turns into a caterpillar. While it's unlikely that a probabilistic analysis of coin flipping could predict these outcomes, it may be able to account for a very slightly weighted coin, or perhaps a coin flipper who consistently puts the coin a particular way up before flipping. Either of these elements (and more) could contribute to the outcome of a coin flip being other than 50:50. This might actually be important to you if you really care about the outcome of 1,000 coin flips. Worse, you might care about the outcome of the World Cup, particularly where England are involved. That might mean you care about El Niño and its intensity this year, not because there's some spooky coincidence between that intensity and how well England perform, but because a strong El Niño might make it hot and dry in Brazil during the competition, which is probably not a good thing for footballers who are used to playing in the cold and wet.
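The weighted-coin point is easy to try out. The sketch below flips a fair coin and a very slightly weighted one 1,000 times each, using Python's standard random module. The 1% bias, the seed and the function name are my own assumptions, chosen only to illustrate the idea.

```python
import random

def count_heads(n_flips, p_heads, rng):
    """Flip a coin n_flips times and return how many flips landed heads."""
    return sum(rng.random() < p_heads for _ in range(n_flips))

rng = random.Random(2014)  # fixed seed so the run is repeatable
n_flips = 1000

fair = count_heads(n_flips, 0.50, rng)
weighted = count_heads(n_flips, 0.51, rng)  # hypothetical 1% bias toward heads

print(f"fair coin:     {fair} heads out of {n_flips}")
print(f"weighted coin: {weighted} heads out of {n_flips}")
```

For a fair coin the number of heads in 1,000 flips has a standard deviation of about 16, while a 1% bias shifts the expected count by only 10. A single run of 1,000 flips therefore cannot reliably tell the two coins apart, although the bias does show up over many more flips.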

Our ability to model things like El Niño or, heaven forfend, the entire Earth climate, depends on things that have far more effect on outcomes than a slightly weighted coin or a sneaky flipper. It depends on things like knowing the sea temperature in the middle of the Pacific and exactly how that temperature varies with the depth of the ocean. Even our best measurements of such quantities are subject to errors as banal as being mistyped by a sleepy meteorologist, or as subtle as floating-point rounding or a mix-up between big-endian and little-endian machines. These errors have the potential to grow and lead to larger and larger effects. When climate modelling reveals emergent properties that have large effects, such as hurricanes, it becomes vitally important that those properties are not being unduly amplified by errors in your inputs. For an excellent overview of how climate modelling works in just this context, see Gavin Schmidt's TED talk, 'The Emergent Patterns of Climate Change'. Of course, when running models as complex as climate models, which essentially try to recreate the entire Earth system with computer simulations, we have to bear in mind that 'garbage in = garbage out', and that if our measurements are not at least correct on average then our model is likely to be meaningless (though possibly still informative). Knowing that it is perfectly possible for errors to creep in, though, enables us to be probabilistic about our analysis. For example, we want to be absolutely sure that, when we are looking at our climate model, we are not looking at that tiny fraction of coin flips that land on the edge (or turn into a caterpillar). We can do this by running our climate models again and again and then analysing the results of all of the different simulations in a statistical way. It is for this reason that scientists cite a percentage likelihood that an event will occur. Rather than hedging their bets, they are simply telling you how often a certain event occurs across those runs, given certain conditions. This may be seen as dodging the question, but it is actually an attempt to be utterly transparent and honest about results.
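Here is a toy version of the 'run it again and again' idea. It is nothing like a real climate model: the logistic map below is just a simple chaotic equation standing in for one, and the measured value, noise level and threshold are all invented for illustration. Each run starts from the same measurement plus a small random error, and the answer is reported as the fraction of runs in which the 'event' happened.

```python
import random

def toy_model(x0, steps=50, r=3.9):
    """Iterate the logistic map, a simple chaotic system standing in for a real model."""
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x

def ensemble_probability(measured_x0, noise, threshold, n_runs, rng):
    """Fraction of runs ending above threshold when the input is perturbed by noise."""
    hits = 0
    for _ in range(n_runs):
        # Each run starts from the measurement plus a small random error.
        x0 = measured_x0 + rng.uniform(-noise, noise)
        if toy_model(x0) > threshold:
            hits += 1
    return hits / n_runs

rng = random.Random(42)  # fixed seed so the run is repeatable
p = ensemble_probability(measured_x0=0.4, noise=1e-4, threshold=0.5,
                         n_runs=10_000, rng=rng)
print(f"{100 * p:.0f}% of runs ended above the threshold")
```

A single run of toy_model gives one confident-looking number. The ensemble instead gives a percentage, which is a more honest description of what we know when the input can't be measured exactly.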

The statistician George E.P. Box wrote that 'essentially, all models are wrong, but some are useful'. This beautifully summarises our inability to fully recreate complex systems in simulations; we can only extrapolate and interpret.
