SkyGrazer: Computer Modelling


Friday, 25 July 2014

Who’s to Blame for our Changing Climate?

The term 'smoking gun' is often brought up in reference to climate change; a quick Google search reveals that the phrase has been thrown around in climate circles for at least the last 20 years. Often, the 'smoking gun' is a reference to some single, irrefutable piece of evidence that might finally silence climate change deniers, such as the rising levels of CO₂ (e.g. by Julia Slingo, Chief Scientist of the Met Office). However, for most people carbon dioxide levels in the atmosphere are not particularly tangible, while the floods afflicting the south-west of England last winter or the record summer seen in Austria and Slovenia are much more visible and closer to our everyday experiences. Attributing events like these to climate change is not always simple though. After extreme weather events there may be debate about whether the event (or its scale) can be attributed to the effects of climate change; perhaps it might just be part of natural climate variability? Such discussions rarely result in any kind of satisfactory answer for the media and, I suspect, the general public. The reason for this is not, as commonly claimed, that a single event cannot possibly be attributed to any root cause (although this is largely true) but rather that natural climate variability and climate change are not separate. Any trend in overall climate variables (e.g. temperature) will underlie the natural variability and it is this that makes global warming so dangerous. It has been repeatedly said (largely as a joke) that an extra degree or two might make the weather in [insert country/state/county here] more bearable. However, this simplification of the global warming trend discounts the variation which has existed and would exist without any warming (or cooling) trend.

An increasing temperature moves climate variability with it.

In this image (taken from climatecommunication.org) you can see how temperatures vary around a central, average temperature^*. A shift in average temperature (which is what climate change/global warming implies) shifts the entire distribution to the right, i.e. towards hotter temperatures. That means that weather events that might exist in this portion of the plot...

... which were once the extreme end of the distribution, now become far more common. So attributing a weather event to climate change means that we are saying it falls in the red part of this inset plot, rather than the orange. We are not able to do that definitively. What we can do is count the number of times that extreme events occur and see how that compares with our plot of variability. A new record temperature is bound to occur at some point. When a record temperature is reported as being a 1-in-1000 year event, that means we expect a temperature that high to occur, on average, only once every thousand years. If a temperature that high were to happen tomorrow, it is possible (likely even) that it was just random chance that it occurred when it did. However, if it happened again the month after, that looks a little suspicious. If we were to reach that temperature again in two years, then again in another 10, then we begin to cast real doubt on our definition of a 1-in-1000 year event. Either our statistics and/or model were wrong in the first place, or the system has changed.
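To put rough numbers on this, here is a minimal Python sketch. It assumes, as the simplified picture above does, that temperatures follow a Gaussian distribution. The shift of one standard deviation and the 12-year window are illustrative assumptions of mine, not values taken from any dataset.

```python
from statistics import NormalDist

# Assumed, illustrative climates: temperatures in units of standard deviations.
baseline = NormalDist(mu=0.0, sigma=1.0)
shifted = NormalDist(mu=1.0, sigma=1.0)  # assumed warming of one standard deviation

# Temperature exceeded in only 1 year in 1000 under the baseline climate.
threshold = baseline.inv_cdf(1.0 - 1.0 / 1000.0)

p_before = 1.0 - baseline.cdf(threshold)  # yearly chance before the shift
p_after = 1.0 - shifted.cdf(threshold)    # yearly chance after the shift


def prob_at_least_two(p, years):
    """Chance of two or more exceedances in `years` independent years."""
    none = (1.0 - p) ** years
    exactly_one = years * p * (1.0 - p) ** (years - 1)
    return 1.0 - none - exactly_one


WINDOW = 12  # assumed window, echoing "in two years, then again in another 10"
repeat_before = prob_at_least_two(p_before, WINDOW)
repeat_after = prob_at_least_two(p_after, WINDOW)

print(f"threshold: {threshold:.2f} standard deviations above the old mean")
print(f"yearly chance before: {p_before:.4f}, after: {p_after:.4f}")
print(f"repeat within {WINDOW} years before: {repeat_before:.6f}, after: {repeat_after:.6f}")
```

Under these assumptions the same threshold is crossed far more often after the shift, and a repeat within a dozen years goes from wildly improbable to merely unusual. That is the sense in which repeated 'records' cast doubt on the original 1-in-1000 label.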

The evaluation of how often certain weather events should occur is a type of risk analysis. By analysing the number of times that events occur, we can say how likely they are to happen in the future. Given enough data, we can even say what the contributing factors to those events are. For example, the NHS and other medical institutions can evaluate the risk of developing lung cancer. Given data about the lifestyles of the people who do develop it, it is possible to draw correlations between factors such as smoking and the incidence of the disease. After further investigation it is possible to establish these links more firmly, and therefore we can say that there are different risks of lung cancer for smokers vs non-smokers and what these risks are. The important thing to remember here is that these are probabilistic risks. We have all had a great aunt or other relative who smoked 80-a-day and lived to a ripe old age. At the same time, there are many unfortunate people who live exemplary, healthy lives and develop lung cancer nonetheless. These people represent the natural variability of this system, while the people who smoke have shifted the distribution of probability towards developing lung cancer.
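The kind of comparison described above boils down to a ratio of two risks. The sketch below uses counts that I have invented purely for illustration (they are not real medical statistics) to show how a 'relative risk' would be computed, and why it says nothing certain about any one person.

```python
# Invented counts for illustration only; NOT real medical data.
smokers = {"cases": 150, "people": 1000}
non_smokers = {"cases": 10, "people": 1000}


def risk(group):
    """Fraction of the group that develops the disease."""
    return group["cases"] / group["people"]


relative_risk = risk(smokers) / risk(non_smokers)
unaffected_smokers = 1.0 - risk(smokers)

print(f"risk for smokers: {risk(smokers):.1%}")
print(f"risk for non-smokers: {risk(non_smokers):.1%}")
print(f"relative risk: {relative_risk:.1f}")
print(f"smokers who never develop it: {unaffected_smokers:.1%}")
```

Even with a large relative risk, most of the hypothetical smokers here stay healthy and some non-smokers do not. Both are the great-aunt effect in numerical form.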

Having described this kind of analysis in perhaps too much detail, I can get to the point of this post - the study by Sophie Lewis and David Karoly, researchers at the University of Melbourne in the overwhelmingly appellated 'School of Earth Sciences and Australian Research Council Centre of Excellence for Climate System Science'. They have performed an analysis like the one I've described for the extreme summer of 2013 in Australia. I'll link to the paper itself here, published in the journal Geophysical Research Letters, although I'm not sure about paywalls, etc. - apologies if it's not readily available to you.

Lewis and Karoly performed an extensive analysis using suites of models to determine exactly how likely the extreme heat seen in the summer of 2013 in Australia would be in the natural (no human contribution) course of events and then again with human contributions included. They extended this further to include the RCP8.5 emission scenario (covered in a previous blog here) running forward to 2020.

The Australian 'Angry Summer' of 2013 saw record-breaking temperatures on a daily as well as a seasonal basis, with the all-time records for hottest day and hottest month both being set. By running large numbers ('ensembles') of climate models, some of which included human contributions to emissions and some of which didn't, Lewis and Karoly were able to evaluate the probability that these contributions would result in such an extreme summer. In addition to their paper, the authors have published two blogs which sum up their findings very well here and here. There they publish plots illustrating their finding that human contributions have increased the likelihood that the 'Angry Summer' would occur by a factor of five. The plots below show how models incorporating natural as well as anthropogenic contributions reveal dramatically increasing probabilities of raised temperatures when evaluated from 2006 onwards.

Probability distribution of average temperature variations across Australia in summer from observations (dashed line) and climate model simulations (solid line) for 1910-2005. The vertical lines mark the temperature departures for the 1998 summer (the second hottest) and the 2013 summer (the hottest) across Australia. Lewis & Karoly

As above, but showing the shift in the probability distribution for 2006-2020 from climate model simulations including increasing greenhouse gases and other human influences on climate. Lewis & Karoly
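The logic of comparing two ensembles can be sketched in a few lines of Python. This is a toy, not the authors' method or data: the Gaussian 'ensembles', the 0.4 degree shift, the 0.5 degree spread and the 1.0 degree threshold are all numbers I have assumed for illustration. The 'fraction of attributable risk' computed at the end is a quantity commonly used in attribution studies.

```python
import random


def simulate_ensemble(mean_shift, spread, members, seed):
    """Toy stand-in for a model ensemble: summer temperature anomalies in degrees."""
    rng = random.Random(seed)
    return [rng.gauss(mean_shift, spread) for _ in range(members)]


def exceedance_probability(anomalies, threshold):
    """Fraction of ensemble members hotter than the threshold."""
    hits = sum(1 for a in anomalies if a > threshold)
    return hits / len(anomalies)


MEMBERS = 100_000   # assumed ensemble size, far larger than real ensembles
THRESHOLD = 1.0     # assumed 'extreme summer' anomaly in degrees

natural_only = simulate_ensemble(0.0, 0.5, MEMBERS, seed=1)
with_human = simulate_ensemble(0.4, 0.5, MEMBERS, seed=2)

p_natural = exceedance_probability(natural_only, THRESHOLD)
p_human = exceedance_probability(with_human, THRESHOLD)

risk_ratio = p_human / p_natural
attributable_fraction = 1.0 - p_natural / p_human

print(f"chance of an extreme summer, natural only: {p_natural:.3f}")
print(f"chance of an extreme summer, with human influence: {p_human:.3f}")
print(f"risk ratio: {risk_ratio:.1f}")
print(f"fraction of attributable risk: {attributable_fraction:.2f}")
```

A modest shift in the mean produces a several-fold change in the chance of crossing the threshold, which is the same mechanism as in the shifted-distribution picture earlier in this post.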

It is worth digging into these results a bit. They are explained thoroughly by the authors in the paper and summarised well in their blog postings, so I'm not going to repeat what they say. What is worth showing here is the spread of their model results. I think the plot below shows something that is often missing from statistical reports, climate or otherwise.

Australian annual temperature changes (relative to 1911-1940 average) for observations (dashed black) and model simulations with natural influences only (green) and with both human and natural influences (red). The grey plumes indicate the range of values simulated across nine global climate models used. Average Australian temperature anomalies are indicated for 2013 and the previous hottest year on record in 2005. David Karoly & Sophie Lewis

What this plot shows is not only the results from the various models (green showing climate variability arising from natural contributions only, red including human emission contributions) but also what the spread in those models looks like (in grey). This is very important as it is easy to see from the variation in observed temperatures that, for any given year, the red line and green line aren't really separated by more than we might expect from natural variations anyway. The grey spread of model results shows us that the green line, representing the 'natural' state, is now right on the edge of the feasible range predicted by our models. This means that we are now entering a period in which it is impossible (statistically) to account for current weather trends without incorporating the influence of human emissions. Australian Prime Minister Tony Abbott is fond of quoting the poet Dorothea Mackellar in her description of Australia as 'a land of droughts and flooding rains' in dismissing possible climate change. However, it has become completely untenable to ignore the changing climate in that country. Climate change deniers, who once might have charitably been called skeptics, have descended into the realm of conspiracy theory. I won't link to any sites because I'd rather not give them any traffic, but it is all too simple to search online (or simply look in the comments of legitimate blog posts) for climate change in Australia and find sites which, no longer able to refute scientific findings, now simply accuse scientists of falsifying data, proactively as well as retroactively.

One of the more plausible explanations for high temperatures in Australia is the El Niño Southern Oscillation (ENSO), which has been regularly linked to higher than average temperatures in the Pacific. It is true that the second hottest summer in Australia to date (1998) may well owe some of its heat to ENSO. However, 2013 was essentially an 'ENSO-neutral' year and so the record temperatures were almost certainly unaffected by it.

One last thing to mention about Australia's extreme climate (changing or not) is the absolutely phenomenal amount of rainfall experienced there in the last few years. In the two years preceding the 'Angry Summer' Australia was subject to exceptionally heavy rainfall, this time perhaps linked to an El Niño/La Niña event. While attributing this heavy rainfall to human influences is more muddled than with the record temperatures, I reiterate my earlier point that we can no longer take 'natural variability' in isolation from anthropogenic global warming. My main reason for bringing up the rainfall is that I was struck by the fact that so much water fell on Australia in those two years that sea levels ceased to rise. Andrew Freedman blogs here in detail about this topic, the main gist being that the 3.2 mm/year sea level rise that has been observed for decades plateaued for an 18-month period correlating with the rains falling in Australia. The explanation posited in this study is that the particular geography of Australia prevented much of this water returning to the oceans on short timescales - therefore taking water from the oceans without returning it.
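For a sense of scale, here is a back-of-envelope calculation of how much water an 18-month pause in a 3.2 mm/year rise represents. The ocean surface area (about 361 million square kilometres) is a standard figure that I am supplying. It does not come from the study, and the calculation ignores everything else that affects sea level.

```python
RISE_MM_PER_YEAR = 3.2     # long-term trend quoted above
PAUSE_YEARS = 1.5          # the 18-month plateau
OCEAN_AREA_KM2 = 3.61e8    # assumed global ocean surface area

# Rise that 'went missing' during the pause, converted from mm to km.
missing_rise_mm = RISE_MM_PER_YEAR * PAUSE_YEARS
missing_rise_km = missing_rise_mm * 1e-6

# Volume of water needed to account for that missing rise.
missing_volume_km3 = missing_rise_km * OCEAN_AREA_KM2

print(f"missing rise: {missing_rise_mm:.1f} mm")
print(f"equivalent volume: {missing_volume_km3:.0f} cubic kilometres")
```

A few millimetres sounds tiny, but spread over the whole ocean it is well over a thousand cubic kilometres of water.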

^*This is a bit simplified; the plot shows an essentially Gaussian temperature distribution. There are good reasons why real temperature distributions might not be Gaussian but that's another story for another time... The general principle here will still stand.

Monday, 2 June 2014

Why the world cup will/won't be predicted by computer modelling.

You may have recently heard about attempts to predict the results of the upcoming World Cup in a scientific way. Although it's not mentioned explicitly, the calculations by Stephen Hawking and Goldman Sachs (GS) are the results of statistical modelling. Fortunately for me, this ties in well with a blog post I already wanted to write about just this subject.

The online betting company Paddy Power has employed Stephen Hawking for a month in order to calculate the probability that England will win the World Cup. A more general aim of evaluating the overall outcome of the World Cup has been undertaken by GS. I heard Peter Oppenheimer of GS interviewed last Thursday morning by John Humphrys and the interview in general was a really good example of the problems with the public perception of probability and statistics. The interview should be available for the next few days at least here and the bit I'm referring to was slightly before 7 am if you're trying to pinpoint it.

The interview consisted largely of Peter Oppenheimer explaining the details of their model, which include the past goal-scoring history of each team (as might be expected). Some more subtle analysis included the under- or over-performance of those teams at previous World Cup tournaments as well as home vs away games. Hawking's analysis allowed for more intricate inputs, such as the height above sea level of the match venue. However, Hawking was approaching the problem from a different perspective, analysing the prospects of a single team, while GS were modelling the entire competition and the relative placings of every team.

What I found particularly blog-worthy about the Radio 4 interview was the attitude of John Humphrys. Humphrys and his Radio 4 compatriot Melvyn Bragg are exceptionally intelligent men and yet they are often dismissive of vital aspects of the scientific method. I am dragging Bragg into this because I have heard several episodes of In Our Time with a scientific theme in which he happily confesses his ignorance of maths and science to his guests. This disregard would be very poorly received if it related to a knowledge of British history, for example. However, numerical and scientific theorems do not appear to warrant the same level of esteem.

I digress. In the interview Humphrys was incredulous of Oppenheimer's World Cup predictions. Notably he said something like 'I don't even follow football, yet I can probably tell you the four teams that will end up in the semi-finals' (apologies if this is horribly paraphrased, I am unable to listen to the interview again at the moment), implying perhaps that the work by GS was worthless as it only told us something that could be guessed at by a layman anyway. There are two important points here. Firstly, I think that this misses the point entirely. The phrase that springs to mind is 'when you do something right, people won't be sure you've done anything at all' (if anyone can trace this quote back further than the Futurama link I've pasted please let me know!). Humphrys's statement actually shows us that there is really an intuitive element to probability which isn't always evident, especially when it comes to some of the more esoteric results of probability theory, such as the still-argued-over Monty Hall problem. If a layman can predict the four teams to reach the semi-finals of the World Cup 2014, why scoff at attempts to do the same thing in a numerical, analytical manner? Why does John think he can predict the semi-finalists? Because he is aware that Brazil, Germany, Argentina and Spain are probably the best teams in the world, even if he gained this awareness through osmosis, something very easy to do, at least in the UK. Why are these teams the best (or perceived to be so)? Because they win a lot, meaning that they have good, measurable goal scores and differences - exactly the kind of variable that is input into the GS model. Even I, as a complete football luddite, know that Brazil are extremely likely to beat the U.S. at 'soccer' (no offence U.S.). This is because I have been brought up with images of Pelé as the messiah of football while there's only a grudging willingness to acknowledge the participation of the U.S. in the same sport because, you know, at least they give it a go.

The second point I want to make is that I think Humphrys misunderstands the language of probability that Oppenheimer is using. I think that this represents a fundamental lack of public understanding of probability. Scientists understand that very few hypotheses can ever truly be ruled out completely and so may appear vague or uncertain about their findings. This allows scientific theories to be cast as 'doubtful' when they are, in fact, remarkably certain.

We (or at least John Humphrys) seem to have some inbuilt desire for our models to be 'deterministic', meaning that there will be an exact, predictable outcome. The alternative being presented by GS is a 'probabilistic' result. The difference between these two interpretations is rather philosophical and so somewhat loosely defined; this is probably (heh) why it's not easily digested by the public at large. A glib explanation of this difference comes from the Terry Pratchett book 'The Colour of Magic' in which a character flips several coins. The deterministic philosophy would lead us to expect that half of the flips land on heads while half land on tails. In the book, what happens is that four of the coins land on their edges while another turns into a caterpillar. While it's unlikely that a probabilistic analysis of coin flipping could predict these outcomes, it may be able to account for a very slightly weighted coin, or perhaps a coin flipper who consistently puts the coin a particular way up before flipping. Either of these elements (and more) could contribute to the outcome of a coin flip being other than 50:50. This might actually be important to you if you really care about the outcome of 1,000 coin flips. Worse, you might care about the outcome of the World Cup, particularly where England are involved. That might mean you care about El Niño and its intensity this year, not because there's some spooky coincidence between that intensity and how well England perform but because a strong El Niño might make it hot and dry in Brazil during the competition, probably not a good thing for footballers who are used to playing in the cold and wet.
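How much could a slightly weighted coin matter over 1,000 flips? The sketch below works it out exactly using the binomial distribution. The 51% weighting is a number I have assumed for illustration.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k heads in n flips, computed via logs to avoid overflow."""
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1.0 - p)
    )
    return math.exp(log_pmf)


def prob_more_heads_than(k, n, p):
    """Probability of strictly more than k heads in n flips."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1, n + 1))


FLIPS = 1000
fair = prob_more_heads_than(500, FLIPS, 0.50)
weighted = prob_more_heads_than(500, FLIPS, 0.51)  # assumed 51% chance of heads

print(f"fair coin, chance of a heads majority: {fair:.3f}")
print(f"weighted coin, chance of a heads majority: {weighted:.3f}")
```

A one percentage point weighting is invisible on any single flip, yet under these assumptions it turns a slightly-worse-than-even bet on a heads majority into a strong favourite. Neither case is certain, which is exactly the point of quoting a probability.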

Our ability to model things like El Niño or, heaven forfend, the entire Earth climate, depends on things that have far more effect on outcomes than a slightly weighted coin or a sneaky flipper. It depends on things like knowing the sea temperature in the middle of the Pacific and how exactly that temperature varies with the depth of the ocean. Even our best measurements of such quantities are subject to errors as banal as being mistyped by a sleepy meteorologist or as sophisticated as rounding errors or byte-order (big-endian vs little-endian) mix-ups between machines. These errors have the potential to grow and lead to larger and larger effects. When climate modelling reveals emergent properties that have large effects, such as hurricanes, it becomes vitally important that those properties are not being unduly amplified by errors in your inputs. For an excellent overview of how climate modelling works in just this context see Gavin Schmidt's TED talk, 'The Emergent Patterns of Climate Change'. Of course, when running models as complex as climate models, essentially trying to recreate the entire Earth system with computer simulations, we have to bear in mind that 'garbage in = garbage out' and that if our measurements are not at least correct on average then our model is likely to be meaningless (though possibly still informative). Knowing that it is perfectly possible for errors to creep in, though, enables us to be probabilistic about our analysis. For example, we want to be absolutely sure that, when we are looking at our climate model, we are not looking at that tiny fraction of coin flips that land on the edge (or turn into a caterpillar). We can do this by running our climate models again and again and then analysing the results of all of the different simulations in a statistical way. It is for this reason that scientists cite a percentage likelihood that an event will occur. Rather than hedging their bets, they are simply telling you how often a certain event will occur, given certain conditions. This may be seen as dodging the question but it is actually an attempt to be utterly transparent and honest about results.
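The 'run it again and again' idea can be illustrated with a toy system. The sketch below uses the logistic map, a textbook example of chaotic behaviour, as a stand-in for a model; it is emphatically not a climate model. The starting value, the size of the measurement error and the threshold are all assumptions chosen for illustration.

```python
import random

R = 3.9  # growth parameter in the chaotic regime of the logistic map


def step(x):
    """One time step of the toy 'model'."""
    return R * x * (1.0 - x)


def run(x0, steps):
    """Run the toy model forward from starting value x0."""
    x = x0
    for _ in range(steps):
        x = step(x)
    return x


def ensemble(x0, noise, members, steps, seed=0):
    """Run many copies, each starting from a slightly perturbed 'measurement'."""
    rng = random.Random(seed)
    return [run(x0 + rng.uniform(-noise, noise), steps) for _ in range(members)]


MEASURED = 0.4   # assumed measured starting value
NOISE = 1e-6     # assumed measurement error
MEMBERS = 500    # assumed ensemble size
THRESHOLD = 0.8  # assumed definition of the 'event' we care about

short_range = ensemble(MEASURED, NOISE, MEMBERS, steps=5)
long_range = ensemble(MEASURED, NOISE, MEMBERS, steps=100)

short_spread = max(short_range) - min(short_range)
long_spread = max(long_range) - min(long_range)
likelihood = sum(1 for x in long_range if x > THRESHOLD) / MEMBERS

print(f"spread after 5 steps: {short_spread:.6f}")
print(f"spread after 100 steps: {long_spread:.3f}")
print(f"fraction of runs ending above {THRESHOLD}: {likelihood:.0%}")
```

After a few steps every run agrees, so a single answer is fine. After many steps the tiny input errors have grown until the runs disagree completely, and the honest summary is the fraction of runs in which the event happens.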

The statistician George E.P. Box wrote that 'essentially, all models are wrong, but some are useful'. This summarises beautifully our inability to fully recreate complex systems in simulations; we can only extrapolate and interpret.

Friday, 16 May 2014

Something in the Air

A lot happens at the Met Office that goes largely unreported. For example, planting transmitters on seals to measure sea temperature might not be the first thing to cross your mind if you were asked what the Met Office actually does. As I listened to a talk this week about tracking the spread of atmospheric particles I realised that this was something else that would fall under this umbrella. Time for a blog post!

The talk was by the Atmospheric Dispersion and Quality (ADAQ) group, who are responsible for some very interesting aspects of the MO services, such as supporting the emergency services in the event of civil contingencies like chemical fires, radioactive accidents, volcanic ash and threats to animal and plant health. This is achieved through the use of NAME - the Nuclear Accident ModEl, one of the more outrageous examples of acronym abuse I've come across.

NAME was developed by the MO following the Chernobyl disaster in 1986 when weather conditions conspired to spread the released radioactive particles across Europe, including the Welsh hills. How exactly this happened can be seen in the model image below.

Since then, NAME has been through multiple iterations and is now capable of predicting the transport, dispersion and chemistry of atmospheric particles. If you're interested in the gritty (haha) details then I can tell you that it does this through the modelling of core atmospheric processes such as turbulence, deep convection, deposition & sedimentation^* and chemistry. If you want to know exactly how it does that then here would be a good place to start.

^*Material is removed from the atmosphere by transport to, and uptake by, the ground; by gravitational settling; by rain 'washout' (material is brought down to the ground by rain); and by rain absorption (precipitation forms around particles directly).

The latest generation of NAME is NAME III and this has been used extensively in recent times to track the effects of the Fukushima Daiichi nuclear disaster, the second event ever to reach the highest rating of 7 on the International Nuclear and Radiological Event Scale. Research into the health effects of the Fukushima disaster is ongoing, incorporating the results of NAME's model analysis.

NAME is supported by many tools which work over different scales and are interesting in various ways. In order of increasing scale over which they function:

PACRAM (Procedures And Communications in the event of a release of Radioactive Material) gives little detailed information; its main priority is to be fast, so as to advise emergency services, etc. on possible hazardous directions or areas to avoid in the event of an incident at a UK nuclear power plant.

RIMNET (not sure if this is a really convoluted acronym or just a name...) is a Met Office-managed project in partnership with DECC and DEFRA. It is a country-wide network of gamma radiation detectors (isn't this a plot device from the Avengers?!) which allows the UK to monitor background radiation levels. All measurement and reference data is stored in the UK National Nuclear Database.

Regional Specialized Meteorological Centers and the Comprehensive Nuclear-Test-Ban Treaty Organization (CTBTO) provide the international radiological response. The CTBTO (actually a preparatory commission, as the treaty has not yet entered into force) is tasked with establishing and developing a worldwide network which monitors the planet for nuclear explosions. This network is reportedly 85 percent complete at the time of writing.

One of the more useful aspects of atmospheric modelling is that it can be run backwards to establish the source of an atmospheric feature. For example, if an unreported nuclear event were to occur, it could be traced back to its source by feeding current observations into the NAME model.
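The idea of running a model backwards can be shown with a deliberately tiny example. The sketch below is not NAME and ignores turbulence, deposition and chemistry entirely: it follows a single particle carried by a made-up wind that slowly changes direction, then retraces its path from the detector to the source. The wind function, time step and positions are all hypothetical.

```python
import math


def wind(t_hours):
    """Hypothetical wind (east, north components in km/h) that slowly veers with time."""
    angle = 0.1 * t_hours
    speed = 20.0
    return speed * math.cos(angle), speed * math.sin(angle)


def trajectory(start, hours, dt=0.5, backwards=False):
    """Carry a particle along the wind, or retrace its path if backwards is True."""
    x, y = start
    steps = int(round(hours / dt))
    for i in range(steps):
        if backwards:
            # Undo the forward steps in reverse order, latest wind first.
            u, v = wind((steps - 1 - i) * dt)
            x -= u * dt
            y -= v * dt
        else:
            u, v = wind(i * dt)
            x += u * dt
            y += v * dt
    return x, y


SOURCE = (0.0, 0.0)  # hypothetical release point, in km
detector = trajectory(SOURCE, hours=24.0)
recovered = trajectory(detector, hours=24.0, backwards=True)

print(f"particle detected at: ({detector[0]:.1f}, {detector[1]:.1f}) km")
print(f"traced back to: ({recovered[0]:.6f}, {recovered[1]:.6f}) km")
```

With a perfectly known wind the retraced path lands back on the source. In a real dispersion model, turbulence and imperfect wind data mean the backwards run gives a region of likely origin rather than a single point.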

This feature has proved particularly useful in disease control, such as in the outbreak of Legionnaires' disease in Edinburgh in 2012. Not only can the model predict the spread of airborne bacteria and so inform the public and authorities if certain areas are at particularly high risk but, once an infection has been found, the model can be run backwards to see where the bacteria might have originated in the first place. It is useful again in the case of animal and plant health. The Met Office has been researching the spread of Foot and Mouth Disease since the 1960s, again through the dispersion in the atmosphere of airborne particles originating from infected pigs.

There is more use to this than might be immediately obvious. Vaccines are often limited in amount, especially in the case of a sudden outbreak. By identifying the likely spread of diseases, the vaccines can be distributed in a targeted way.

There are yet more applications of this technology and, to be honest, I wasn't particularly familiar with them before the talk. I'd heard of 'Ash dieback', apparently spread on the small scale (up to 10s of miles) by windborne spores, but Ug99, which has apparently been called the 'polio of wheat', is also the subject of Met Office research.