Possible futures

PLEASE NOTE: this is the last post to be mirrored from PLOS, so please subscribe to my PLOS RSS feed to avoid missing future posts. Thanks!

This is the final part of a series of introductory posts about the principles of climate modelling. Others in the series: 1 | 2 | 3 | 4

The final big question in this series is:

How do we predict our future?

Everything I’ve discussed so far has been about how to describe the earth system with a computer model. How about us? We affect the earth. How do we predict what our political decisions will be? How much energy we’ll use, from which sources? How we’ll use the land? What new technologies will emerge?

We can’t, of course.

Instead we make climate predictions for a set of different possible futures. Different storylines we might face.

Here is a graph of past world population, and the United Nations predictions for the future using four possible fertility rates.

Data source: Population Division of the Department of Economic and Social Affairs of the United Nations Secretariat, World Population Prospects: The 2010 Revision

The United Nations aren’t trying to answer the question, “What will the world population be in the future?”. That depends on future fertility rates, which are impossible to predict. Instead they ask “What would the population be if fertility rates were to stay the same? Or decrease a little, or a lot?”

We do the same for climate change: not “How will the climate change?” but “How would the climate change if greenhouse gas emissions were to keep increasing in the same way? Or decrease a little, or a lot?” We make predictions for different scenarios.

Here is a set of scenarios for carbon dioxide (CO2) emissions. It also shows past emissions on the left-hand side.

These scenarios are named “SRES” after the Special Report on Emissions Scenarios in 2000 that defined the stories behind them. For example, the A1 storyline is

“rapid and successful economic development, in which regional average income per capita converge – current distinctions between “poor” and “rich” countries eventually dissolve”,

and within it the A1FI scenario is the most fossil-fuel intensive. The B1 storyline has

“a high level of environmental and social consciousness combined with a globally coherent approach to a more sustainable development”,

though it doesn’t include specific political action to reduce human-caused climate change. The scenarios describe CO2 and other greenhouse gases, and other industrial emissions (such as sulphur dioxide) that affect climate.

We make climate change predictions for each scenario; endings to each story. Here are the predictions of temperature.

Each broad line is an estimate of a conditional probability: the probability of a temperature increase, given a particular scenario of emissions.
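In the usual notation of probability (my shorthand, not the report’s), each line sketches P(ΔT | scenario): the probability of a warming ΔT on condition that a particular emissions scenario, such as A1FI or B1, actually comes to pass.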

Often this kind of prediction is called a projection to mean it is a “what would happen if” not a “what will happen”. But people do use the two interchangeably, and trying to explain the difference is what got me called disingenuous and Clintonesque.

We make projections to help distinguish the effects of our possible choices. There is uncertainty about these effects, shown by the width of each line, so the projections can overlap. For example, the highest temperatures for the B1 storyline (environmental, sustainable) are not too different from the lowest temperatures for A1FI (rapid development, fossil intensive). Our twin aims are to try to account for all the uncertainties, and to try to reduce them, to make these projections the most reliable and useful they can be.

A brief aside: the approach to ‘possible futures’ is now changing. It’s a long and slightly tortuous chain to go from storyline to scenario, then to industrial emissions (the rate we put gases into the atmosphere), and on to atmospheric concentrations (the amount that stays in the atmosphere without being, for example, absorbed by the oceans or used by plants). So there has been a move to skip straight to the last step. The SRES scenarios are being replaced by Representative Concentration Pathways (“RCPs”).

The physicist Niels Bohr helpfully pointed out that

“Prediction is very difficult, especially if it’s about the future.”

And the ever-wise Douglas Adams added

“Trying to predict the future is a mug’s game.”

They’re right. Making predictions about what will happen to our planet is impossible. Making projections about what might happen, if we take different actions, is difficult, for all the reasons I’ve discussed in this series of posts. But I hope, as I said in the first, I’ve convinced you it is not an entirely crazy idea.

 

Junk (filter) science

This is a mirror of a PLOS blogpost. Formatting is usually nicer there.

This is part 4 of a series of introductory posts about the principles of climate modelling. Others in the series: 1 | 2 | 3

In the previous post I said there will always be limits to our scientific understanding and computing power, which means that “all models are wrong.” But it’s not as pessimistic as this quote from George Box seems, because there’s a second half: “… but some are useful.” A model doesn’t have to be perfect to be useful. The hard part is assessing whether a model is a good tool for the job. So the question for this post is:

How do we assess the usefulness of a climate model?

I’ll begin with another question: what does a spam (junk email) filter have in common with state-of-the-art predictions of climate change?

Wall of SPAM
Modified from a photo by freezelight

The answer is they both improve with “Bayesian learning”. Here is a photo of the grave of the Reverend Thomas Bayes, which I took after a meeting at the Royal Statistical Society (gratuitous plug of our related new book, “Risk and Uncertainty Assessment for Natural Hazards”):

Bayes' grave

Bayesian learning starts with a first guess of a probability. A junk email filter has a first guess of the probability of whether an email is spam or not, based on keywords I won’t repeat here. Then you make some observations, by clicking “Junk” or “Not Junk” for different emails. The filter combines the observations with the first guess to make a better prediction. Over time, a spam filter gets better at predicting the probability that an email is spam: it learns.

The filter combines the first guess and observations using a simple mathematical equation called Bayes’ theorem. This describes how you calculate a “conditional probability”, a probability of one thing given something else. Here this is the probability that a new email is spam, given your observations of previous emails. The initial guess is called the “prior” (first) probability, and the new guess after comparing with observations is called the “posterior” (afterwards) probability.
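Written out in symbols (my shorthand for the spam example, not any particular filter’s), Bayes’ theorem says the posterior is just the prior reweighted by how well each possibility explains the evidence:

P(spam | words) = P(words | spam) × P(spam) / P(words)

or, more loosely: posterior ∝ likelihood × prior. Everything that follows is this one equation applied to bigger and messier problems.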

The same equation is used in many state-of-the-art climate predictions. We use a climate model to make a first guess at the probability of future temperature changes. One of the most common approaches for this is to make predictions using many different plausible values of the model parameters (control dials): each “version” of the model gives a slightly different prediction, which we count up to make a probability distribution. Ideally we would compare this initial guess with observations, but unfortunately these aren’t available without (a) waiting a long time, or (b) inventing a time machine. Instead, we also use the climate model to “predict” something we already know, to make a first guess at the probability of something in the past, such as temperature changes from the year 1850 to the present. All the predictions of the future have a twin “prediction of the past”.

We take observations of past temperature changes – weather records – and combine them with the first guess from the climate model using Bayes’ theorem. The way this works is that we test which versions of the model from the first guess (prior probability) of the past are most like the observations: which are the most useful. We then apply those “lessons” by giving these the most prominence, the greatest weight, in our new prediction (posterior probability) of the future. This doesn’t guarantee our prediction will be correct, but it does mean it will be better because it uses evidence we have about the past.
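Here is a minimal sketch of that reweighting idea in Python. Everything in it is made up for illustration: the “climate model” is a toy formula, the parameter and observation values are invented, and the Gaussian likelihood is just one simple choice for comparing a simulation with an observation. Real studies, such as the UK Climate Projections, use far more careful statistical machinery, but the Bayesian bones are the same.

import numpy as np

rng = np.random.default_rng(42)

# A toy "ensemble": each member is one setting of an uncertain model parameter.
n_members = 1000
parameter = rng.uniform(0.5, 2.0, n_members)        # the "control dial"

# Each member makes twin predictions (purely illustrative formulas):
past_warming = 0.6 * parameter + rng.normal(0.0, 0.05, n_members)    # 1850 to present, degC
future_warming = 3.0 * parameter + rng.normal(0.0, 0.2, n_members)   # end of century, degC

# Prior prediction of the future: every member counts equally.
prior_mean = future_warming.mean()

# An observation of the past, with an assumed measurement uncertainty.
observed_past = 0.8   # degC
obs_sigma = 0.1       # degC

# Bayes' theorem in action: weight each member by how well its "past" matches
# the observation (a Gaussian likelihood), then renormalise the weights.
weights = np.exp(-0.5 * ((past_warming - observed_past) / obs_sigma) ** 2)
weights /= weights.sum()

# Posterior prediction of the future: the same members, reweighted.
posterior_mean = np.sum(weights * future_warming)

print(f"prior mean warming:     {prior_mean:.2f} degC")
print(f"posterior mean warming: {posterior_mean:.2f} degC")

The posterior is concentrated on the parameter values whose simulated past looks most like the weather records, which is exactly the weighting described above.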

Here’s a graph of two predictions of the probability of a future temperature change (for our purposes it doesn’t matter what) from the UK Climate Projections:

The red curve (prior) is the first guess, made by trying different parameter values in a climate model. The predicted most probable value is a warming of about three degrees Celsius. After including evidence from observations with Bayes’ theorem, the prediction is updated to give the dark blue curve (posterior). In this example the most probable temperature change is the same, but the narrower shape reflects a higher predicted probability for that value.

Probability in this Bayesian approach means “belief” about the most probable thing to happen. That sounds strange, because we think of science as objective. One way to think about it is the probability of something happening in the future versus the probability of something that happened in the past. In the coin flipping test, three heads came up out of four. That’s the past probability, the frequency of how often it happened. What about the next coin toss? Based on the available evidence – if you don’t think the coin is biased, and you don’t think I’m trying to bias the toss – you might predict that the probability of another head is 50%. That’s your belief about what is most probable, given the available evidence.

My use of the word belief might trigger accusations that climate predictions are a matter of faith. But Bayes’ theorem and the interpretation of “probability” as “belief” are not only used in many other areas of science, they are thought by some to describe the entire scientific method. Scientists make a first guess about an uncertain world, collect evidence, and combine these together to update their understanding and predictions. There’s even evidence to suggest that human brains are Bayesian: that we use Bayesian learning when we process information and respond to it.

The next post will be the last in the introductory series on big questions in climate modelling: how can we predict our future?

 

Tuning to the climate signal

This is a mirror of a PLOS blogpost.

 

This is part 3 of a series of introductory posts about the principles of climate modelling. Others in the series: 1 | 2

My sincere apologies for the delays in posting and moderating. Moving house took much more time and energy than I expected. Normal service resumes.

I’d also like to mark the very recent passing of George Box, who was the eminent and important statistician to whom I owe the name of my blog, which forms a core part of my scientific values. The ripples of his work and philosophy travelled very far. My condolences and very best wishes to his family.

The question I asked at the end of the last post was:

“Can we ever have a perfect reality simulator?”

I showed a model simulation with pixels (called “grid boxes” or “grid cells”) a few kilometres across: in other words, big. Pixel size, also known as resolution, is limited by available computing power. If we had infinite computing power how well could we do? Imagine we could build a climate model representing the entire “earth system” – atmosphere, oceans, ice sheets and glaciers, vegetation and so on – with pixels a metre across, or a centimetre. Pixels the size of an atom. If we could do all those calculations, crunch all those numbers, could we have a perfect simulator of reality?

I’m so certain of the answer to this question, I named my blog after it.

A major difficulty with trying to simulate the earth system is that we can’t take it to pieces to see how it works. Climate modellers are usually physicists by training, and our instinct when trying to understand a thing is to isolate sections of it, or to simplify and abstract it. But we have limited success when we try to look at isolated parts of the planet, because everything interacts with everything else, and limited success with simplifications, because important things happen at every scale in time and space. We need to know a bit about everything. This is one of my favourite things about the job, and one of the most difficult.

For a perfect simulation of reality, we would need perfect understanding of every physical, chemical and biological process – every interaction and feedback, every cause and effect. We are indeed improving climate models as time goes on. In the 1960s, the first weather and climate models simulated atmospheric circulation, but other important parts of the earth system such as the oceans, clouds, and carbon cycle were either included in very simple ways (for example, staying fixed rather than changing through time) or left out completely. Through the decades we have developed the models, “adding processes”, aiming to make them better simulators of reality.

But there will always be processes we think are important but don’t understand well, and processes that happen on scales smaller than the pixel size, or faster than the model “timestep” (how often calculations are done, like the frame rate of a film). We include these, wherever possible, in simplified form. This is known as parameterisation.

Parameterisation is a key part of climate modelling uncertainty, and the reason for much of the disagreement between predictions. It is the lesser of two evils when it comes to simulating important processes: the other being to ignore them. Parameterisations are designed using observations, theoretical knowledge, and studies using very high resolution models.

For example, clouds are much smaller than the pixels of most climate models. Here is the land map from HadCM3, a climate model with lower resolution than the one in the last post.

Land-sea map for UK Met Office Unified Model HadCM3

If each model pixel could only show “cloud” or “not cloud”, then a simulation of cloud cover would be very unrealistic: a low resolution, blocky map where each block of cloud is tens or even hundreds of kilometres across. We would prefer each model pixel to be covered by a percentage of cloud, rather than only 0% or 100%. The simplest way to do this is to relate percentage cloud cover to percentage relative humidity: at 100% relative humidity, the pixel is 100% covered in cloud; as relative humidity decreases, so does cloud cover.

Parameterisations are not Laws of Nature. In a sense they are Laws of Models, designed by us wherever we do not know, or cannot use, laws of nature. Instead of “physical constants” that we measure in the real world, like the speed of light, they have “parameters” that we control. In the cloud example, there is a control dial for the lowest relative humidity at which cloud can form. This critical threshold doesn’t exist in real life, because the world is not made of giant boxes. Some parameters are equivalent to things that exist, but for the most part they are “unphysical constants”.
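As an illustration only, here is roughly what the simplest version of that cloud scheme might look like in code. The linear ramp and the 0.7 threshold are my own toy choices, not any particular model’s parameterisation; the point is that rh_critical is a “control dial” with no direct counterpart in the real world.

def cloud_fraction(relative_humidity, rh_critical=0.7):
    """Toy cloud cover parameterisation for a single grid box.

    relative_humidity: grid box average, from 0 to 1.
    rh_critical: the tunable parameter, the lowest relative humidity
                 at which any cloud can form in the box.
    Returns the fraction of the box covered by cloud, from 0 to 1.
    """
    if relative_humidity <= rh_critical:
        return 0.0
    # Linear ramp: no cloud at the critical value, full cover at saturation.
    return min(1.0, (relative_humidity - rh_critical) / (1.0 - rh_critical))

# A grid box at 85% relative humidity, with the dial set to 0.7:
print(cloud_fraction(0.85))   # 0.5, i.e. half the box is cloudy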

The developers of a model play god, or at least play a car radio, by twiddling these control dials until they pick up the climate signal: in other words, they test different values of the parameters to find the best possible simulation of the real world. For climate models, the test is usually to reproduce the changes of the last hundred and fifty years or so, but sometimes to reproduce older climates such as the last ice age. For models of Greenland and Antarctica, we only have detailed observations from the last twenty years.

As our understanding improves and our computing power increases, we replace the parameterisations with physical processes. But we will never have perfect understanding of everything, nor infinite computing power to calculate it all. Parameterisation is a necessary evil. We can never have a perfect reality simulator, and all models are… imperfect.

In case you do lie awake worrying that the entire universe is a simulation: it’s fine, we can probably check that.

 

 

Virtually Reality

This is part 2 of a series of introductory posts about the principles of climate modelling. Others in the series: 1.

The second question I want to discuss is this:

How can we do scientific experiments on our planet?

In other words, how do we even do climate science? Here is the great, charismatic physicist Richard Feynman, describing the scientific method in one minute:

If you can’t watch this charming video, here’s my transcript:

“Now I’m going to discuss how we would look for a new law. In general, we look for a new law by the following process:

First, we guess it.

Then we — no, don’t laugh, that’s the real truth — then we compute the consequences of the guess to see what, if this is right, if this law that we guessed is right, we see what it would imply.

And then we compare the computation result to nature, or we say compare to experiment, or experience, compare it directly with observations to see if it works.

If it disagrees with experiment, it’s wrong. In that simple statement is the key to science. It doesn’t make a difference how beautiful your guess is, it doesn’t make a difference how smart you are, who made the guess, or what his name is — if it disagrees with experiment, it’s wrong. That’s all there is to it.”

What is the “experiment” in climate science? We don’t have a mini-Earth in a laboratory to play with. We are changing things on the Earth, by farming, building, and putting industrial emissions into the atmosphere, but it’s not done in a systematic and rigorous way. It’s not a controlled experiment. So we might justifiably wonder how we even do climate science.

Climate science is not the only science that can’t do controlled experiments of the whole system being studied. Astrophysics is another: we do not explode stars on a lab bench. Feynman said that we can compare with experience and observations. We would prefer to experience and observe things we can control, because it is much easier to draw conclusions from the results. Instead we can only watch as nature acts.

What is the “guess” in climate science? These are the climate models. A model is just a representation of a thing (I wrote more about this here). A climate model is a computer program that represents the whole planet, or part of it.* It’s not very different to a computer game like Civilisation or SimCity, in which you have a world to play with, in which you can tear up forests and build cities. In a climate model we can do much the same: replace forests with cities, alter the greenhouse gas concentrations, let off volcanoes, change the energy reaching us from the sun, move the continents. The model produces a simulation of how the world responds to those changes: how they affect temperature, rainfall, ocean circulation, the ice in Antarctica, and so on.

How do they work? The general idea is to stuff as much science as possible into them without making them too slow to use. At the heart of them are basic laws of physics, like Newton’s laws of motion and the laws of thermodynamics. Over the past decades we’ve added more to them: not just physics but also chemistry, such as the reactions between gases in the atmosphere; biological processes, like photosynthesis; and geology, like volcanoes. The most complicated climate models are extremely slow. Even on supercomputers it can take many weeks or months to get the results.

Here is a state-of-the-art simulation of the Earth by NASA.

The video shows the simulated patterns of air circulation, such as the northern hemisphere polar jet stream, then patterns of ocean circulation, such as the Gulf Stream. The atmosphere and ocean models used to make this simulation are high resolution: they have a lot of pixels so, just like in a digital camera, they show a lot of detail.

A horizontal slice through this atmosphere model has 360 x 540 pixels, or 0.2 megapixels. That’s about two thirds as many as a VGA display (introduced by IBM in 1987) or the earliest consumer digital camera (the Apple QuickTake from 1994). It’s also about the same resolution as my blog banner. The ocean model is a lot higher resolution: 1080 x 2160 pixels, or 2.3 megapixels, which is about the same as high definition TV. The video above has had some extra processing to smooth the pixels out and draw the arrows.
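For anyone who wants to check my arithmetic, it really is just pixel counting:

atmosphere = 360 * 540    # 194,400 pixels, about 0.2 megapixels
ocean = 1080 * 2160       # 2,332,800 pixels, about 2.3 megapixels
vga = 640 * 480           # 307,200 pixels
print(atmosphere / vga)   # roughly 0.63, i.e. about two thirds of a VGA display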

I think it’s quite beautiful. It also seems to be very realistic, a convincing argument that we can simulate the Earth successfully. But the important question is: how successfully? This is the subject of my next post:

Can we ever have a perfect “reality simulator”?

The clue’s in the name of the blog…

See you next time.

 

* I use the term climate model broadly here, covering any models that describe part of the planet. Many have more specific names, such as “ice sheet model” for Antarctica.

 

We have nothing to fear

[This is a mirror of a post published at PLOS. Formatting may be better over there.]

 

I’m scared.

I must be, because I’ve been avoiding writing this post for some time, when previously I’ve been so excited to blog I’ve written until the early hours of the morning.

I’m a climate scientist in the UK. I’m quite early in my career: I’ve worked in climate science for six and a half years since finishing my PhD in physics. I’m not a lecturer or a professor; I’m a researcher with time-limited funding. And in the past year or so I’ve spent a lot of time talking about climate science on Twitter, on my blog, and in the comments sections of a climate sceptic blog.

So far I’ve been called a moron, a grant-grubber, disingenuous, and Clintonesque (they weren’t a fan: they meant hair-splitting), and I’ve had my honesty and scientific objectivity questioned. I’ve been told I’m making a serious error, a “big, big mistake”, that my words will be misunderstood and misused, and that I have been irritating in imposing my views on others. You might think these insults and criticisms were all from climate sceptics disparaging my work, but those in the second sentence are from a professor in climate change impacts and a climate activist. While dipping my toes in the waters of online climate science discussion, I seem to have been bitten by fish with, er, many different views.

I’m very grateful to PLOS for inviting me to blog about climate science, but it exposes me to a much bigger audience. Will I be attacked by big climate sceptic bloggers? Will I be deluged by insults in the comments, or unpleasant emails, from those who want me to tell a different story about climate change? More worryingly for my career, will I be seen by other climate scientists as an uppity young (ahem, youngish) thing, disrespectful or plain wrong about other people’s research? (Most worrying: will anyone return here to read my posts?)

I’m being a little melodramatic. But in the past year I’ve thought a lot about Fear. Like many, I sometimes find myself with imposter syndrome, the fear of being found out as incompetent, which is “commonly associated with academics”. But I’ve also been heartened by recent blog posts encouraging us to face fears of creating, and of being criticised, such as this by Gia Milinovich (a bit sweary):

“You have to face your fears and insecurity and doubt. [...] That’s scary. That’s terrifying. But doing it will make you feel alive.”

Fear is a common reaction to climate change itself. A couple of days ago I had a message from an old friend that asked “How long until we’re all doomed then?” It was tongue-in-cheek, but there are many that are genuinely fearful. Some parts of the media emphasise worst case scenarios and catastrophic implications, whether from a desire to sell papers or out of genuine concern about the impacts of climate change. Some others emphasise the best case scenarios, reassuring us that everything will be fine, whether from a desire to sell papers or out of genuine concern and frustration about the difficulties of tackling climate change.

Never mind fear: it can all be overwhelming, confusing, repetitive. You might want to turn the page, to change the channel. Sometimes I’m the same.

I started blogging to try and find a new way of talking about climate science. The title of my blog is taken from a quote by a statistician:

“essentially, all models are wrong, but some are useful” - George E. P. Box (b 1919)

By “model” I mean any computer software that aims to simulate the Earth’s climate, or parts of the planet (such as forests and crops, or the Antarctic ice sheet), which we use to try to understand and predict climate changes and their impacts in the past and future. These models can never be perfect; we must always keep this in mind. On the other hand, these imperfections do not mean they are useless. The important thing is to understand their strengths and limitations.

I want to focus on the process, the way we make climate predictions, which can seem mysterious to many (including me, until about a month before starting my first job). I don’t want to try and convince you that all the predictions are doom and gloom, or conversely that everything is fine. Instead I want to tackle some of the tricky scientific questions head-on. How can we even try to predict the future of our planet? How confident are we about these predictions, and why? What could we do differently?

When people hear what I do, one of the first questions they ask is often this:

“How can we predict climate change in a hundred years, when we can’t even predict the weather in two weeks?”

To answer this question we need to define the difference between climate and weather. Here’s a good analogy I heard recently, from J. Marshall Shepherd:

“Weather is like your mood. Climate is like your personality.”

And another from John Kennedy:

“Practically speaking: weather’s how you choose an outfit, climate’s how you choose your wardrobe.”

Climate, then, is long-term weather. More precisely, climate is the probability of different types of weather.

Why is it so different to predict those two things? I’m going to toss a coin four times in a row. Before I start, I want you to predict what the four coin tosses are going to be: something like “heads, tails, heads, tails”. If you get it right, you win the coin*. Ready?

[ four virtual coin tosses...]

50p coin on cafe table

[ ...result is tails, tails, tails, heads ]

Did you get it right? I’m a nice person, so I’m going to give you another chance. I’m going to ask: how many heads in the next four?

[ four more virtual coin tosses... ]

 

__________

 

[ ...result is two heads out of four ]

The first of these is like predicting weather, and the second like climate. Weather is a sequence of day-by-day events, like the sequence of heads and tails. (In fact, predicting a short sequence of weather is a little easier than predicting coin tosses, because the weather tomorrow is often similar to today). Climate is the probability of different types of weather, like the probability of getting heads.

If everything stays the same, then the further you go into the future, the harder it is to predict an exact sequence and the easier it is to predict a probability. As I’ll talk about in later posts, everything is not staying the same… But hopefully this shows that trying to predict climate is not an entirely crazy idea in the way that the original question suggests.
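If you would like to play with this at home, here is a short simulation of the two kinds of prediction (a toy with a fair coin, nothing to do with a climate model). “Getting the climate right” here just means guessing that the proportion of heads ends up within ten percentage points of a half; that tolerance is my arbitrary choice.

import numpy as np

rng = np.random.default_rng(0)

def chance_of_success(n_tosses, n_trials=100_000):
    """Compare two kinds of prediction for n_tosses of a fair coin."""
    tosses = rng.integers(0, 2, size=(n_trials, n_tosses))   # 1 means heads

    # "Weather"-style prediction: guess the exact sequence (say, all heads).
    sequence_right = np.mean(np.all(tosses == 1, axis=1))

    # "Climate"-style prediction: guess that the proportion of heads
    # is within 10 percentage points of 50%.
    fraction_heads = tosses.mean(axis=1)
    proportion_right = np.mean(np.abs(fraction_heads - 0.5) <= 0.1)

    return sequence_right, proportion_right

for n in (4, 10, 100):
    seq, prop = chance_of_success(n)
    print(f"{n:3d} tosses: exact sequence {seq:.4f}, proportion near 50% {prop:.3f}")

The exact sequence becomes hopeless very quickly, while the proportion becomes easier and easier to predict: the same reason a fortnight of weather is harder to forecast than a century of climate, so long as nothing else is changing.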

My blog posts here at PLOS will be about common questions and misunderstandings in climate science, topical climate science news, and my own research. They won’t be about policy or what actions we should take. I will maintain my old blog allmodelsarewrong.com: all posts at PLOS will also be mirrored there, and some additional posts that are particularly technical or personal might only be posted there.

At my old blog we’ve had interesting discussions between people from across the spectrum of views, and I hope to continue that here. To aid this I have a firm commenting policy:

  • be civil; do not accuse; do not describe anyone as a denier (alternatives: sceptic, dissenter, contrarian), liar, fraud, or alarmist; do not generalise or make assumptions about others;
  • interpret comments in good faith; give others the benefit of the doubt; liberally sprinkle your comments with good humour, honesty, and, if you like them, cheerful emoticons, to keep the tone friendly and respectful;
  • stay on-topic.

I’m extremely happy to support PLOS in their commitments to make science accessible to all and to strengthen the scientific process by publishing repeat studies and negative results. I’m also very grateful to everyone that has supported and encouraged me over the past year: climate scientists and sceptics, bloggers and Tweeters. Thank you all.

And thank you for reading. My next post will be about another big question in climate science:

How can we do scientific experiments on our planet?

See you next time.

* You don’t, but if you were a volunteer at one of my talks you would.

Many dimensions to life and science

This post is timed to coincide with a meeting tomorrow, the Royal Meteorological Society’s “Communicating Climate Science”. If you are going, do come and say hello. If you aren’t, look out for me tweeting about it from 2-5.30pm BST.

On not blogging

I haven’t forgotten about you. I’ve still been churning over ideas and wanting to share them with you. I’ve thought of all of you that comment here, and those that silently lurk, whether friends, family, scientists, sceptics, passers-by, or a combination of these. But two big things this year have had to take priority over blogging (and the even more time-consuming process of moderating and replying to comments).

The first was a deadline. As some of you know well, the Intergovernmental Panel on Climate Change (IPCC) produces a report summarising the state-of-the-art in climate science research, and related topics, about every six years. They do this so policymakers have a handy (in practice, enormous and not very handy) reference to the evidence base and latest predictions. The IPCC set cut-off dates for including new research: one date for submission to journals, and another for acceptance after the peer-review process. The first of these dates was the 31st July this year. Translation: “try to finish and write up every piece of work you’ve ever started by this date”. Not every climate scientist chose to do this. But the project I work for, ice2sea, actually had it written into a contract with its funders, the European Union. We had no choice but to submit whatever was our current state-of-the-art in sea level predictions. I was a co-author of six papers* finished and submitted during June and July, and had several other studies on the go that didn’t make the deadline. So it was a rather intense time, and science had to take priority over talking about science.

The second was personal. I hesitated about whether to say this here. But part of my motivation for being a climate scientist in the public eye was to show the human side. And I also wanted to let you know that this blog is so important to me, has been so transformative, that it took something very big to keep me away. My husband and I separated two months ago.

I’m back, and I’m preparing for a big move. The US-based publisher and organisation PLoS (Public Library of Science) has invited me to be their climate blogger. It’s a fantastic opportunity to gain a big audience (more than 200,000 visitors per month, and a feed to Google News). I’m very happy to support PLoS because they publish open access journals, and because one of these (PLoS ONE) goes even further in its commitment to transparency in science. It will publish anything scientifically valid, whether or not it is novel. This might not sound important, or even a good idea, but it is an essential counter to the modern problem that plagues journals: that of only publishing new results, and not repeat studies. For the scientific method to work, we need studies that repeat and reproduce (or contradict) previous research. Otherwise we risk errors, chance findings, and very occasionally fraud, remaining unnoticed for years, or forever. I’m hosted at PLoS from the second week in December and will be posting twice a month.

The first post at PLoS will be a (long overdue) introduction to predicting climate change. It will probably be based around a talk I gave at the St Paul’s Way summer science school, at which I was the final speaker, which made Prof Brian Cox my warm-up act.

In other news, I talked about the jet stream and climate change live on BBC Wiltshire (9 mins), which was well received at the climate sceptic site Bishop Hill, and did a live Bristol radio show, Love and Science (1 hour). I also returned to my particle physics roots, with a Radio 4 interview about the discovery of the Higgs Boson (3 mins).

Our new(-ish) paper

Now the science bit. This is an advertisement for a paper we published in August:

Stephens E.M., Edwards T.L. and Demeritt D. (2012). Communicating probabilistic information from climate model ensembles—lessons from numerical weather prediction. WIREs Clim Change 2012, 3: 409-426.

It’s paywalled, but I can send a copy to individuals if they request it. Liz Stephens is a colleague and friend from my department at Bristol who did a great study with the UK Met Office and David Spiegelhalter on the interpretation of probability-based weather forecasts, using an online game about an ice cream man. I’ve never met David Demeritt, except in one or two Skype video calls. He’s interested in, amongst other things, how people interpret flood forecasts. I haven’t passed this post by them, but hopefully they will comment below if they have things to add or correct.

We noticed there was quite a bit of research on how well people understand and make decisions using weather forecasts, such as the probability of rainfall, and uncertainty in hurricane location, but not much on the equivalents in climate change. There have been quite a few papers, particularly in the run-up to the new IPCC report, that talk in general terms about how people typically interpret probability, uncertainty and risk, and about some of the pitfalls to avoid when presenting this information. But very few actual studies on how people interpret and make decisions from climate change predictions specifically. We thought we’d point this out, and draw some comparisons with other research areas, including forecasting of hurricanes, rain, and flooding.

Ensembles

The ‘ensembles’ in the title are a key part of predicting climate and weather. An ensemble is a group, a sample of different possibilities. Weather forecasts have been made with ensembles for many years, to help deal with the problem of our chaotic atmosphere. The most well-known explanation of chaos is the ‘butterfly effect’. If a butterfly stamps its foot in Brazil, could it cause a tornado in Illinois? Chaos means: small changes can have a big effect. A tiny change in today’s weather could lead to completely different weather next week. And in the same way, a tiny error in our measurements of today’s weather could lead to a completely different forecast of the weather next week. But errors and missing measurements are inevitable. So we try to account for chaotic uncertainty by making forecasts based on several slightly different variations on today’s weather. This is one type of ‘ensemble forecast’. It’s simply a way of dealing with uncertainty. Instead of one prediction, we make many. We hope that the ensemble covers the range of possibilities. Even better, we hope that the most common prediction in the ensemble (say, 70% of them predict a storm) is actually the most likely thing to happen. This gives us an estimate of the probability of different types of weather in the future.
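To make the idea concrete, here is a toy initial-conditions ensemble in Python. The Lorenz system is a classic three-variable caricature of a chaotic atmosphere (it is not part of our paper, just a standard teaching example): start many copies from almost identical states, run them forward, and read a probability off the ensemble.

import numpy as np

def lorenz_step(states, dt=0.005, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Advance an ensemble of Lorenz-63 states by one small Euler step."""
    x, y, z = states[:, 0], states[:, 1], states[:, 2]
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return states + dt * np.column_stack([dx, dy, dz])

rng = np.random.default_rng(1)

# An ensemble of "today's weather": one state plus tiny measurement errors.
n_members = 500
today = np.array([1.0, 1.0, 20.0])
ensemble = today + rng.normal(0.0, 1e-3, size=(n_members, 3))

# Run every member forward in time.
for _ in range(3000):
    ensemble = lorenz_step(ensemble)

# The members, once almost identical, have spread right across the attractor.
# Read a probability off the ensemble, like "70% of them predict a storm":
print("spread in x after the run:", round(ensemble[:, 0].std(), 2))
print("fraction of members with x > 0:", np.mean(ensemble[:, 0] > 0.0))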

Ensembles are at the heart of our attempts to describe how sure we are about our predictions. They are used to explore an uncertain future: what are the bounds of possibility? What is plausible, and what is implausible? Some climate prediction ensembles, like the weather forecast ensemble above, relate to the information we feed into the model. Others relate to imperfections in the models themselves. Some specific examples are in the footnotes below.**

The question we ask in our paper is: how should we express these big, complex ensemble predictions? There are too many dimensions to this problem to fit on a page or screen. Our world is three dimensional. Add in time, and it becomes four. There are very many aspects of climate to consider, such as air temperature, rainfall, air pressure, wind speed, cloud cover, and ocean temperature. We might have a prediction for each plausible input value, and a prediction for each plausible variation of the model itself. And one of these ensembles is produced for each of the different climate models around the world. Frankly, ensembles are TMI***.

To simplify or not to simplify

Scientists often think that the more information they can give, the better. So they dump all the raw ensemble predictions on the page. It’s a natural instinct: it feels transparent, honest, allows people to draw their own conclusions. The problem is, people are a diverse bunch. Even within climate science, they have different knowledge and experience, which affects their interpretation of the raw data. When you broaden the audience to other scientists, to policymakers, businesses, the general public, you run the risk of generating as many conclusions as there are people. Worse still, some can be overwhelmed by a multitude of predictions and ask “Which one should I believe?”

To avoid these problems, then, it seems the expert should interpret the ensemble of predictions and give them in a simplified form. This is the case in weather forecasting, where a meteorologist looks at an ensemble forecast and translates it based on their past experience. It works well because their interpretations are constantly tested against reality. If a weather forecaster keeps getting it wrong, they’ll be told about it every few hours.

This doesn’t work in climate science. Climate is long-term, a trend over many years, so we can’t keep testing the predictions. If we simplify climate ensembles too much, we risk hiding the extent of our uncertainty.

Our conclusions can be summed up by two sentences:

a) It is difficult to represent the vast quantities of information from climate ensembles in ways that are both useful and accurate.

b) Hardly anyone has done research into what works.

We came up with a diagram to show the different directions in which we’re pulled when putting multi-dimensional ensemble predictions down on paper. These directions are:

  1. “richness”: how much information we give from the predictions, i.e. whether we simplify or summarise them. For example, we could show a histogram of all results from the ensemble, or we could show just the maximum and minimum.
  2. “saliency”****: how easy it is to interpret and use the predictions, for a particular target audience. Obviously we always want this to be high, but it doesn’t necessarily happen.
  3. “robustness”: how much information we give about the limitations of the ensemble. For example, we can list all the uncertainties that aren’t accounted for. We can show maps in their original pixellated (low resolution) form, like the two maps shown below, rather than a more ‘realistic-looking’ smoothed version, like these examples.

Here’s the diagram:

The three ‘dimensions’ are connected with each other, and often in conflict. Where you end up in the diagram depends on the target audience, and the nature of the ensemble itself. Some users might want, or think they want, more information (richness and robustness) but this might overwhelm or confuse them (saliency). On the other hand, climate modellers might reduce the amount of information to give a simpler representation, hoping to improve understanding, but this might not accurately reflect the limitations of the prediction.

In some cases it is clear how to strike a balance. I think it’s important to show the true nature of climate model output (blocky rather than smoothed maps), even if they are slightly harder to interpret (you have to squint to see the overall patterns). Otherwise we run the risk of forgetting that – cough – all models are wrong.

But in other cases it’s more difficult. Giving a map for every individual prediction in the ensemble, like this IPCC multi-model example, shows the extent of the uncertainty. But if this is hundreds or thousands of maps, is this still useful? Here we have to make a compromise: show the average map, and show the uncertainty in other ways. The IPCC deals with this by “stippling” maps in areas where the ensemble predictions are most similar; perhaps the unstippled areas still look quite certain to the hasty or untrained eye. I like the suggestion of Neil Kaye, fading out the areas where the ensemble predictions disagree (examples of both below).
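Here is a rough sketch of how that fading could be done, on synthetic data rather than any real model output. The “agreement” measure (the fraction of ensemble members whose change has the same sign as the ensemble mean) and the colours are my own simple choices for illustration.

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm

rng = np.random.default_rng(2)

# Synthetic ensemble of "temperature change" maps: 20 members, 60 x 120 pixels,
# with a made-up west-to-east warming pattern plus member-to-member noise.
pattern = np.tile(np.linspace(-1.5, 3.0, 120), (60, 1))
ensemble = pattern + rng.normal(0.0, 1.0, size=(20, 60, 120))

mean_change = ensemble.mean(axis=0)

# Agreement: fraction of members whose change has the same sign as the mean.
agreement = (np.sign(ensemble) == np.sign(mean_change)).mean(axis=0)

# Colour each pixel by the mean change, then fade it out where members disagree
# (fully faded at a 50/50 split, fully opaque when they are unanimous).
rgba = cm.RdBu_r(plt.Normalize(-3.0, 3.0)(mean_change))
rgba[..., 3] = np.clip((agreement - 0.5) / 0.5, 0.0, 1.0)

plt.imshow(rgba, origin="lower")
plt.title("Ensemble mean change, faded where members disagree")
plt.show()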


This brings us to the second point of our conclusions. The challenge is to find the right balance between these three dimensions: to understand how the amount of information given, including the limitations of the ensemble, affects the usefulness for various audiences. Do people interpret raw ensemble predictions differently to simplified versions of the same data? Do full ensemble predictions confuse people? Do simplifications lead to overconfidence?

There is very little research on what works. In forecasting rainfall probabilities and hurricanes, there have been specific studies to gather evidence, like workshops to find out how different audiences make decisions when given different representations of uncertainty. People have published recommendations for how to represent climate predictions, but these are based on general findings from social and decision sciences. We need new studies that focus specifically on climate. These might need to be different to those in weather-related areas for two reasons. First, people are given weather forecasts every day and interpret them based on their past experiences. But they are rarely given climate predictions, and have no experience of their successes and failures because climate is so long-term. Second, people’s interpretation of uncertain predictions may be affected by the politicisation of the science.

To sum up: we can learn useful lessons from weather forecasting about the possible options for showing multi-dimensional ensembles on the page, and about ways to measure what works. But the long-term nature of climate creates extra difficulties in representing predictions, just as it does in making them.

 

* Papers submitted for the IPCC Fifth Assessment Report deadline:

  • Ritz, C., Durand, G., Edwards, T.L., Payne, A.J., Peyaud, V. and Hindmarsh, R.C.A. Bimodal probability of the dynamic contribution of Antarctica to future sea level. Submitted to Nature.
  • Shannon, S.R., A.J. Payne, I.D. Bartholomew, M.R. van den Broeke, T.L. Edwards, X. Fettweis, O. Gagliardini, F. Gillet-Chaulet, H. Goelzer, M. Hoffman, P. Huybrechts, D. Mair, P. Nienow, M. Perego, S.F. Price, C.J.P.P Smeets, A.J. Sole, R.S.W. van de Wal and T. Zwinger. Enhanced basal lubrication and the contribution of the Greenland ice sheet to future sea level rise. Submitted to PNAS.
  • Goelzer, H., P. Huybrechts, J.J. Fürst, M.L. Andersen, T.L. Edwards, X. Fettweis, F.M. Nick, A.J. Payne and S. Shannon. Sensitivity of Greenland ice sheet projections to model formulations. Submitted to Journal of Glaciology.
  • Nick, F.M., Vieli, A., Andersen, M.L., Joughin, I., Payne, A.J., Edwards, T.L., Pattyn, F. and Roderik van de Wal. Future sea-level rise from Greenland’s major outlet glaciers in a warming climate. Submitted to Nature.
  • Payne, A.J., S.L. Cornford, D.F. Martin, C. Agosta, M.R. van den Broeke, T.L. Edwards, R.M. Gladstone, H.H. Hellmer, G. Krinner, A.M. Le Brocq, S.M. Ligtenberg, W.H. Lipscomb, E.G. Ng, S.R. Shannon , R. Timmerman and D.G. Vaughan. Impact of uncertainty in climate forcing on projections of the West Antarctic ice sheet over the 21st and 22nd centuries. Submitted to Earth and Planetary Science Letters.
  • Barrand, N.E., R.C.A. Hindmarsh, R.J. Arthern, C.R. Williams, J. Mouginot, B. Scheuchl, E. Rignot, S. R.M. Ligtenberg, M, R. van den Broeke, T. L. Edwards, A.J. Cook, and S. B. Simonsen. Computing the volume response of the Antarctic Peninsula ice sheet to warming scenarios to 2200. Submitted to Journal of Glaciology.

** Some types of ensemble are:

  1. ‘initial conditions’: slightly different versions of today’s weather, as in the weather forecasting example above
  2. ‘scenarios’: different possible future storylines, e.g. of greenhouse gas emissions
  3. ‘parameters’: different values for the control dials of the climate model, which affect the behaviour of things we can’t include as specific physical laws
  4. ‘multi-model’: different climate models from the different universities and meteorological institutes around the world

*** Too Much Information

**** Yes, we did reinvent a word, a bit. 

Push button to talk to a scientist

My apologies for the lack of posts recently. I have plenty of topics planned, but no free time right now. Service will resume shortly (ish).

 

Limitless possibilities

Mark Maslin and Patrick Austin at University College London have just had a comment published in Nature called “Climate models at their limit?”. This builds on the emerging evidence that the latest, greatest climate predictions, which will be summarised in the next assessment report of the Intergovernmental Panel on Climate Change (IPCC AR5, 2013) are not going to tell us anything too different from the last report (AR4, 2007) and in fact may have larger uncertainty ranges.

I’d like to discuss some of the climate modelling issues they cover. I agree with much of what they say, but not all…

1. Models are always wrong

“Why do models have a limited capability to predict the future? First of all, they are not reality…. models cannot capture all the factors involved in a natural system, and those that they do capture are often incompletely understood.”

A beginning after my own heart! This is the most important starting point for discussing uncertainty about the future.

Climate modellers, like any other modellers, are usually well aware of the limits of their simulators*. The George Box quote from which this blog is named is frequently quoted in climate talks and lectures. But sometimes simulators are implicitly treated as if they were reality: this happens when a climate modeller has made no attempt to quantify how wrong it is, or does not know how to, or does not have the computing power to try out different possibilities, and throws their hands up in the air. Or perhaps their scientific interest is really in testing how the simulator behaves, not in making predictions.

For whatever reason, this important distinction might be temporarily set aside. The danger of this is memorably described by Jonty Rougier and Michel Crucifix**:

One hears “assuming that the simulator is correct” quite frequently in verbal presentations, or perceives the presenter sliding into this mindset. This is so obviously a fallacy that he might as well have said “assuming that the currency of the US is the jam doughnut.”

Models are always wrong, but what is more important is to know how wrong they are: to have a good estimate of the uncertainty about the prediction. Mark and Patrick explain that our uncertainties are so large because climate prediction is a chain of very many links. The results of global simulators are fed into regional simulators (for example, covering only Europe), and the results of these are fed into another set of simulators to predict the impacts of climate change on sea level, or crops, or humans. At each stage in the chain the range of possibilities branches out like a tree: there are many global and regional climate simulators, and several different simulators of impacts, and each simulator may be used to make multiple predictions if they have parameters (which can be thought of as “control dials”) for which the best settings are not known. And all of this is repeated for several different “possible futures” of greenhouse gas emissions, in the hope of distinguishing the effect of different actions.

2. Models are improving

“The climate models…being used in the IPCC’s fifth assessment make fewer assumptions than those from the last assessment…. Many of them contain interactive carbon cycles, better representations of aerosols and atmospheric chemistry and a small improvement in spatial resolution.”

Computers are getting faster. Climate scientists are getting a better understanding of the different physical, chemical and biological processes that govern our climate and the impacts of climate change, like the carbon cycle or the response of ice in Greenland and Antarctica to changes in the atmosphere and oceans. So there has been a fairly steady increase in resolution***, in how many processes are included, and in how well those processes are represented. In many ways this is closing the gap between simulators and reality. This is illustrated well in weather forecasting: with a model resolution of 1km instead of 12km, the UK Met Office might have predicted the Boscastle flood in 2004 (page 2 of this presentation).

But the other side of the coin is, of course, the “unknown unknowns” that become “known unknowns”. The things we hadn’t thought of. New understanding that leads to an increase in uncertainty because the earlier estimates were too small.

Climate simulators are slow: it can take a day of computing to simulate two or three model years, and several months for long simulations. So modellers and their funders must decide where to spend their money: high resolution, more processes, or more replications (such as different parameter settings). Many of those of us who spend our working hours, and other hours, thinking about uncertainty, strongly believe the climate modelling community must not put resolution and processes (to improve the simulator) above generating multiple predictions (to improve our estimates of how wrong the simulator is). Jonty and Michel again make this case**:

Imagine being summoned back in the year 2020, to re-assess your uncertainties in the light of eight years of climate science progress. Would you be saying to yourself, “Yes, what I really need is an ad hoc ensemble of about 30 high-resolution simulator runs, slightly higher than today’s resolution.” Let’s hope so, because right now, that’s what you are going to get.

But we think you’d be saying, “What I need is a designed ensemble, constructed to explore the range of possible climate outcomes, through systematically varying those features of the climate simulator that are currently ill-constrained, such as the simulator parameters, and by trying out alternative modules with qualitatively different characteristics.”

Higher resolution and better processes might close the gap between the simulator and reality, but if it means you can only afford the computing power to run one simulation then you are blind as to how small or large that gap may be. Two examples of projects that do place great importance on multiple replications and uncertainty are the UK Climate Projections and ClimatePrediction.net.

3. Models agree with each other

“None of this means that climate models are useless…. Their vision of the future has in some ways been incredibly stable. For example, the predicted rise in global temperature for a doubling of CO2 in the atmosphere hasn’t changed much in more than 20 years.”

This is the part of the modelling section I disagree with. Mark and Patrick argue that consistency in predictions through the history of climate science (such as the estimates of climate sensitivity in the figure below) is an argument for greater confidence in the models. Of course inconsistency would be a pointer to potential problems. If changing the resolution or adding processes to a GCM wildly changed the results in unexpected ways, we might worry about whether they were reliable.

But consistency is only necessary, not sufficient, to give us confidence. Does agreement imply correctness? I think instinctively most of us would say no. The majority of my friends might have thought the Manic Street Preachers were a good band, but it doesn’t mean they were right.

In my work with Jonty and Mat Collins, we try to quantify how similar a collection of simulators are to reality. This is represented by a number we call ‘kappa’, which we estimate by comparing simulations of past climate to reconstructions based on proxies like pollen. If kappa equals one, then reality is essentially indistinguishable from the simulators. If kappa is greater than one, then it means the simulators are more like each other than they are like reality. And our estimates of kappa so far? Are all greater than one. Sometimes substantially.
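Kappa itself comes out of a proper statistical model and proxy reconstructions, which is beyond the scope of a blog post, but the flavour of “more like each other than like reality” can be shown with a toy calculation. This is purely my illustration of the idea, not our actual statistic: give the simulators a shared bias that the (toy) reality does not have, and compare the distances.

import numpy as np

rng = np.random.default_rng(3)

# Toy "simulations of past climate": 10 simulators, 50 locations each.
# The simulators share a common bias, so they cluster together...
shared_bias = rng.normal(0.0, 1.0, 50)
simulators = shared_bias + rng.normal(0.0, 0.3, size=(10, 50))

# ...while the toy "reconstruction of reality" does not share that bias.
reality = rng.normal(0.0, 0.3, 50)

def rms_distance(a, b):
    return np.sqrt(np.mean((a - b) ** 2))

pair_distances = [rms_distance(simulators[i], simulators[j])
                  for i in range(10) for j in range(i + 1, 10)]
reality_distances = [rms_distance(sim, reality) for sim in simulators]

ratio = np.mean(reality_distances) / np.mean(pair_distances)
print(f"reality is {ratio:.1f} times further from the simulators "
      f"than they are from each other")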

The authors do make a related point earlier in the article:

Paul Valdes of Bristol University, UK, argues that climate models are too stable, built to ‘not fail’ rather than to simulate abrupt climate change.

Many of the palaeoclimate studies by BRIDGE (one of my research groups) and others show that simulators do not respond much to change when compared with reconstructions of the past. They are sluggish, and stable, and not moved easily from the present day climate. This could mean that they are underestimating future climate change.

In any case, either sense of the word ‘stability’ – whether consistency of model predictions or the degree to which a simulator reacts to being prodded – is not a good indicator of model reliability.

Apart from all this, the climate sensitivity estimates (as shown in their Figure) mostly have large ranges, so I would argue that in this case consistency does not mean much…

Figure 1 from Maslin and Austin (2012), Nature.

Warning: here be opinions

“Despite the uncertainty, the weight of scientific evidence is enough to tell us what we need to know. We need governments to go ahead and act… We do not need to demand impossible levels of certainty from models to work towards a better, safer future.”

This being a science and not a policy blog, I’m not keen to discuss this last part of the article and would prefer your comments below not to be dominated by this either. I would only like to point out, to those that have not heard of them, the existence (or concept) of “no-regrets” and “low-regrets” options. Chapter 6 of the IPCC Special Report on ‘Managing the Risks of Extreme Events and Disasters to Advance Climate Change Adaptation (SREX)’ describes them:

Options that are known as ‘no regrets’ and ‘low regrets’ provide benefits under any range of climate change scenarios…and are recommended when uncertainties over future climate change directions and impacts are high.

Many of these low-regrets strategies produce co-benefits; help address other development goals, such as improvements in livelihoods, human well-being, and biodiversity conservation; and help minimize the scope for maladaptation.

No-one could argue against the aim of a better, safer future. Only (and endlessly) about the way we get there. Again I ask: please try to stay on-topic and discuss the science below the line.

Update 14/6/12: The book editors are happy for Jonty to make their draft chapter public: http://www.maths.bris.ac.uk/~mazjcr/climPolUnc.pdf

 

*I try to use ‘simulator’, because it is a more specific word than ‘model’. I will also refer to climate simulators by their most commonly-used name: GCMs, for General Circulation Models.

**”Uncertainty in climate science and climate policy”, chapter contributed to “Conceptual Issues in Climate Modeling”, Chicago University Press, E. Winsberg and L. Lloyd eds, forthcoming 2013. See link above.

***Just like the number of pixels of a digital camera, the resolution of a simulator is how much detail it can ‘see’. In the climate simulator I use, HadCM3, the pixels are about 300km across, so the UK is made of just a few. In weather simulators, the pixels are approaching 1km in size.

 

 

A Sensitive Subject

Grab yourself a cup of tea. This is a long one.

Yesterday was historic for me. It was the first time I presented a result about ‘climate sensitivity’ (more on this later). This is how it felt to get that result two weeks ago:

In June 2006, we were just a bunch of bright-eyed, bushy-tailed researchers, eager to make a difference in the brave new world of “using palaeodata to reduce uncertainties in climate prediction”. Little did we know the road would be so treacherous, so winding, and so, so long….

Many years later, when we were all old and grey, we finally reached the first of our goals: a preliminary result. On Thursday 12th April 2012, Dr Jonathan C. Rougier produced a plot named, simply, ‘sensitivity.pdf’…

I was presenting these results in a session at the big (11,000 participants) annual European Geosciences Union conference in Vienna. The first speaker was James Hansen, who is rather big in climate circles, and the second was David Stainforth, first author of the first big climateprediction.net result (they dish out climate models for people to run in the background on their computers). I was third, slightly rattled from finishing my slides only just before the session and running through them only once.

If any of you were at the session, I’d prefer you not to talk about our final result, at least for now…it is so preliminary, and I’d prefer this to be a ‘black box’ discussion of our work without prejudice or assumptions from our preliminary numbers.

A little history…

Our project (my first climate job since leaving particle physics) had the rather lovely name of PalaeoQUMP * and the aim of reducing uncertainty about climate sensitivity. By ‘reducing uncertainty’ I mean making the error bars smaller, pinning down the range in which we think the number lies. Climate sensitivity is the global warming you would get if you doubled the concentrations of carbon dioxide in the atmosphere. The earth is slow at reacting to change, so you have to wait until the temperature has stopped changing. Svante Arrhenius (Swedish scientist, 1859-1927) had a go at this in 1896. He did “tedious calculations” by hand and came up with 5.5degC. He added that this was probably too high, and in 1901 revised it to 4degC.
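(If it helps to see that definition as arithmetic, here is a minimal sketch in Python. It uses the textbook approximation that warming scales with the logarithm of the CO2 concentration ratio; the 280 ppm reference and the sensitivity values are illustrative numbers, not results from our work.)

import math

def equilibrium_warming(co2_ppm, co2_ref_ppm=280.0, sensitivity_degC=3.0):
    """Equilibrium warming (degC) for a CO2 concentration co2_ppm, relative
    to a reference concentration, using the approximation that warming
    scales with the logarithm of the concentration ratio.
    'sensitivity_degC' is the climate sensitivity: the warming per doubling."""
    return sensitivity_degC * math.log2(co2_ppm / co2_ref_ppm)

# Doubling CO2 gives exactly one 'sensitivity' of warming:
print(equilibrium_warming(560.0))                        # 3.0 degC
# Arrhenius's revised estimate of roughly 4 degC per doubling:
print(equilibrium_warming(560.0, sensitivity_degC=4.0))  # 4.0 degC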

The idea was to reproduce the method of the original lovely-named project QUMP, the internal name given to the Met Office Hadley Centre research into Quantifying Uncertainty in Model Predictions. They compared a large group of climate model simulations with observations of recent climate, to see which were the most realistic and therefore which were more likely to be reliable for predicting the future. QUMP was the foundation for the UK Climate Projections, which provide “information designed to help those needing to plan how they will adapt to a changing climate”. We planned to repeat their work, but looking much further back in time – using what knowledge we have of the climate 6000 years ago (the ‘Mid-Holocene’) and 21 000 years ago (the height of the last ice age, or ‘Last Glacial Maximum’), instead of the recent past.

Fairly early into this project I wrote, with Michel Crucifix and Sandy Harrison, a review paper about people’s efforts to estimate climate sensitivity, which I’ve just put on arxiv.org because I support open science.

PalaeoQUMP ended in 2010 without us publishing any scientific results, for a variety of reasons: ambitious aims, loss of collaborators from the project, and my own personal reasons. Two of the original members – Jonty Rougier (statistician) and Mat Collins (climate modeller, formerly at the Met Office Hadley Centre) – and I continued to work with our climate simulations when we found time. We got distracted along the way from the original goal of climate sensitivity by interesting questions about how best to learn about past climates, but pootled along happily.

But late last year a group of scientists led by Andreas Schmittner published a result that was very similar to our original plan: comparing a large number of climate model simulations to information about the Last Glacial Maximum to try and reduce the uncertainty in climate sensitivity. Their result certainly had a small uncertainty, and it was also much lower than most people had found previously: a 90% probability of being in the range 1.4 to 2.8 degC. This sent a mini-ripple around people interested in climate sensitivity, palaeoclimate and future predictions. The authors were quite critical of their own work, making the possible weak points clear. One of the main weaknesses was that their method needed a very large number of simulations, so they had to use a climate model with a very simple representation of the atmosphere (because it is faster to run). They invited others to repeat their method and test it.

So we took up the gauntlet…

We have a group, an ensemble, of 17 versions of a climate model. The model is called HadCM3, which is a fairly old (but therefore quite fast and well-understood) version of the Hadley Centre climate model. It has a much better representation of the atmosphere than the one used by Andreas Schmittner. In this case “better” is not too controversial: we have atmospheric circulation, they don’t.

We created the different model versions by changing the values of the ‘input parameters’. These are control dials that change the way the model behaves. Unfortunately we don’t know the correct setting for these dials, for lots of reasons: we don’t have the right observations to test them with, or a setting that gives good simulations of temperature might be a bad setting for rainfall. So these are uncertain parameters and we use lots of different settings to create a group of model versions which are all plausible. This group is known as a perturbed parameter ensemble.
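For the curious, here is roughly what ‘designing’ such an ensemble looks like, as a hedged Python sketch. The dial names and ranges are invented for illustration (they are not the real HadCM3 parameters), and a real design would be chosen more carefully than simple random sampling, for example with a Latin hypercube.

import numpy as np

rng = np.random.default_rng(2012)

# Hypothetical dials and plausible ranges (invented for illustration):
param_ranges = {
    "entrainment_coefficient":    (0.6, 9.0),
    "ice_fall_speed":             (0.5, 2.0),
    "critical_relative_humidity": (0.65, 0.95),
}

n_members = 17  # one setting of every dial per ensemble member
ensemble_design = [
    {name: rng.uniform(low, high) for name, (low, high) in param_ranges.items()}
    for _ in range(n_members)
]

# Each dictionary is one plausible 'version' of the model: running the
# simulator once per dictionary gives the perturbed parameter ensemble.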

We use the ensemble to simulate the Last Glacial Maximum (LGM), the preindustrial period (as a reference), and a climate the same as the preindustrial but with double the CO2 concentrations (to calculate climate sensitivity). We can then compare the LGM simulations to reconstructions of the LGM climate. These reconstructions are based on fossilised plants and animals: by looking at the kinds of species that were fossilised (e.g. something that likes cold climates) and where they lived (e.g. further south than they live today), it is possible to get a surprisingly consistent picture of climates of the past. Reconstructing past climates is difficult, and it’s even harder to estimate the uncertainty, the error bars. I won’t discuss these difficulties in this particular post, and generalised attacks on you know who will not be tolerated in the comments! We used reconstructions of air temperature based on pollen ** and reconstructions of sea surface temperatures based on numerous bugs and things. Andreas Schmittner and his group used the same.

We’re using a shiny new statistical method from Jonty Rougier and two collaborators, which has not yet been published (still in review) but is available online if you want to deluge yourself with charmingly written but quite tricky statistics. It’s a general and simple (compared with previous approaches) way to deal with the very title of this blog: the wrongness of models. The description below is full of ‘saying that’, ‘judge’, ‘reckon’ and so on. Statistics, and science, are full of ‘judgements’: yes, subjectivity. We have to simplify, approximate, and guess-to-the-best-of-our-abilities-and-knowledge. A lot of the statements below are not “This Is The Truth” but more “This Is What We Have Decided To Do To Get An Answer And In Future Work These Decisions May Change”. Please bear this in mind!

Think of an ensemble of climate simulations of temperature. These might be from one model with lots of different values for the control parameters, or they might be completely different models from different research institutes. Most of them look vaguely similar to each other. One is a bit of an oddity. Two look nearly identical. Here is a slightly abstract picture of this:

The crosses in the picture are mostly the same sort of distance from the centre spot, but in different places. One is quite a lot further out. Two are practically on top of each other.

How should we combine all these together to estimate the real temperature? A simple average of everything? Do we give the odd-one-out a smaller contribution? Do we give the near-identical ones smaller contributions too? What if a different model is an oddity for rainfall? Even if we come up with different contributions, different weightings, for each model, the real problem is often relating these back to the original “design” of the ensemble. If our model only has one uncertain parameter, it’s easy. We can steadily increase that control dial for each of the different simulations. Then we compare all the simulations to the real world, find the “best” setting for that parameter, and use this for predicting future climate. This is easy because we know the relationship between each version of the model: each one has a slightly higher setting of the parameter. But if we have a lot of uncertain parameters, it is much harder to find the best settings for all of them at once. It is even worse if we have an ensemble of models from different research institutes, each of which has a lot of different uncertain parameters, making it impossible to work out a relationship between all the models.
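To make the easy, one-dial case concrete, here is a caricature in Python: sweep the dial, score each simulation against the observations, keep the setting that scores best. The root-mean-square score is just an example of a comparison, not a description of what QUMP actually did.

import numpy as np

def best_dial_setting(dial_values, simulations, observations):
    """One uncertain parameter: score each simulation by its root-mean-square
    difference from the observations, and return the dial value that scores
    best. 'simulations' is a list of arrays, one per dial setting."""
    errors = [np.sqrt(np.mean((sim - observations) ** 2)) for sim in simulations]
    return dial_values[int(np.argmin(errors))]

With many dials, or an ensemble of entirely different models, there is no single axis to sweep along and no obvious “best” to pick out.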

These problems have given statisticians headaches for several years. We like statisticians, so we want to give them a nice cup of tea and an easier life.

Jonty and Michael and Leanna’s method tries to make life easier, and begins by asking the question the other way round. Can we throw out some of the models so that the ones that are left are all similar to each other? Then we can stop worrying about how to give them different contributions: we can stop using the individual crosses and just use the average of the rest (the centre spot).

We also don’t need to know the relationship between different models. Instead of using observations of the real world to pick out the “best” model, we will take the average of all of them and let the observations “drag” this average towards reality (I will explain this part later).

How do you decide which models to throw out? This is basically a judgement call. One way is to look at the difference between a model and the average of the others. If any are very far away from the average, chuck them. Another is to squint and look at the simulations and see if any look very different from the others. Yes, really! The point is that it is easier to do this, to justify the decisions, and to use the average, than to decide what contribution to give each model.
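If you would like something more concrete than squinting, here is one way the ‘distance from the average of the others’ idea could be written down, as a Python sketch. The distance measure and the threshold are illustrative judgement calls, not our actual screening rule.

import numpy as np

def screen_ensemble(members, threshold=3.0):
    """Flag ensemble members that sit far from the average of the *other*
    members. 'members' has shape (n_members, n_gridpoints)."""
    members = np.asarray(members, dtype=float)
    keep = []
    for i in range(len(members)):
        others = np.delete(members, i, axis=0)
        centre = others.mean(axis=0)
        spread = others.std(axis=0) + 1e-12    # avoid dividing by zero
        distance = np.sqrt(np.mean(((members[i] - centre) / spread) ** 2))
        keep.append(distance < threshold)
    return np.array(keep)                      # True = keep, False = chuck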

The next part of their cunning is reckoning that all the models are equally good – or equally bad, depending on the emptiness or fullness of your glass – at simulating reality. In other words, the models are closer to the ensemble average than reality is. We can add a red star for “reality” outside the cluster of models:

(Notice I’ve now thrown away the outlier model and one of the two near-identical ones.) This is saying that models are probably more like each other than they are like the real world. I think most visitors to this blog would agree…

There is one more decision. The difficulty is not just in combining models but also interpreting the spread in results. Does the ensemble cover the whole range of uncertainty? We think it probably doesn’t, no matter how many models you have or how excellent and cunning your choices in varying the uncertain input parameters. We will say that it does have the same kind of “shape”: maybe the ensemble spread is bigger for Arctic temperatures than for tropical temperatures, so we’ll take that as useful information for the model uncertainty. But we think it should be scaled up, should be multiplied by a number. How much should we scale it up? More on this later…

All of this was just to turn the ensemble into a prediction of LGM temperatures (from the LGM ensemble) and climate sensitivity (from the doubled CO2 ensemble), with uncertainties for each. We will now compare and then combine the LGM temperatures with the reconstructions.

Here is the part where we inflate – actually the technical term, like a balloon – the ensemble spread to give us model uncertainty. How far? The short answer is: until the prediction agrees with the reconstruction. The long answer is a slightly bizarre analogy that comes to mind. Imagine you and a friend are standing about 10 feet apart. You want to hold hands, but you can’t reach. This is what happens if your uncertainties are too small. The prediction and the reconstruction just can’t hold hands; they can’t be friends. Now imagine that you so much want to hold their hand that your arms start growing…growing…growing… until you can reach their hand, perhaps even far enough for a cuddle. You are the model ensemble, and we have just inflated your arms / uncertainty. Your friend is the reconstruction. Your friend’s arms don’t change, because we (choose to) believe the estimates of uncertainty that the reconstruction people give us. But luckily we can inflate your arms, so that now you “agree” with each other. [ For those who want more detail, the hand-holding criterion is a histogram of standardised predictive errors that looks sensible: centred at zero, with most of the mass between -3 and 3. ]
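For the statistically minded, here is a cartoon of the arm-growing in Python. It is not our exact procedure: the grid of inflation factors and the ‘95% within [-3, 3]’ cut-off are stand-ins for a judgement we make by looking at the histogram itself.

import numpy as np

def standardised_errors(recon, recon_var, ens_mean, ens_var, k):
    """Standardised differences between the reconstruction and the ensemble
    mean, with the ensemble variance inflated by a factor k**2."""
    return (recon - ens_mean) / np.sqrt(k ** 2 * ens_var + recon_var)

def inflate_until_friends(recon, recon_var, ens_mean, ens_var):
    """Grow the arms: increase k until nearly all standardised errors fall
    within [-3, 3]."""
    for k in np.arange(1.0, 20.0, 0.1):
        errors = standardised_errors(recon, recon_var, ens_mean, ens_var, k)
        if np.mean(np.abs(errors) < 3.0) > 0.95:
            return k
    return np.inf   # arms still too short, even after a lot of inflation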

Now we combine the reconstructions with the ensemble “prediction” of the LGM. This gives us the best of both worlds. The reconstructions give us information from the real world (albeit more indirect than we would like). The model gives us the link between LGM temperatures and climate sensitivity. The model ensemble and reconstructions are combined in a “fair” way, by taking into account the uncertainties on each side. If the model ensemble has a small uncertainty and the reconstructions have a large uncertainty, then the combined result is closer to the model prediction, and vice versa. This is a weighted average of two things, which is easier than a weighted average of many things (the approach I described earlier). [ For those who want more detail: this is essentially a Kalman filter, but in this context it is known as Bayes Linear updating. ]
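Here is that weighted average of two things in its simplest, scalar form, as a Python sketch. The real calculation is multivariate and done over spatial fields, but the idea is the same: weight each side by the inverse of its variance.

def combine(model_mean, model_var, recon_mean, recon_var):
    """Precision-weighted average of the model prediction and the
    reconstruction: the combination sits closer to whichever side has the
    smaller variance, and is more certain than either on its own."""
    w_model = 1.0 / model_var
    w_recon = 1.0 / recon_var
    mean = (w_model * model_mean + w_recon * recon_mean) / (w_model + w_recon)
    var = 1.0 / (w_model + w_recon)
    return mean, var

# e.g. a model prediction of -5.0 +/- 2.0 degC against a reconstruction of
# -3.0 +/- 1.0 degC: the combination (-3.4 degC) sits closer to the
# reconstruction, because it is the more certain of the two.
print(combine(-5.0, 2.0 ** 2, -3.0, 1.0 ** 2))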

To recap:

Reconstructions – we use plant- and bug-based reconstructions of LGM temperatures.

Model prediction – after throwing out models that aren’t very similar, we take the average of the others as our “prediction” of Last Glacial Maximum (LGM) temperatures and climate sensitivity.

Model uncertainty – we multiply the spread of the ensemble by a scaling factor so that the LGM prediction agrees with the reconstructions.

LGM “prediction” – we combine the model prediction with the reconstructions. The combination is closer to whichever has the smallest uncertainty, model or reconstruction.

Now for climate sensitivity. The climate sensitivity gets “dragged” by the reconstructions in the same way as the LGM temperatures. (For this we have to assume that the model uncertainty is the same in the past as in the future: this is not at all guaranteed, but inconveniently we don’t have any observations of the future to check.) If the LGM “prediction” is generally colder than the LGM reconstructions, it gets dragged to a less-cold LGM and the climate sensitivity gets dragged to a less-warm temperature. And that’s…*jazz hands*…a joint Bayes Linear update of a HadCM3 perturbed parameter ensemble by two LGM proxy-based reconstructions under judgements of ensemble exchangeability and co-exchangeability of reality.
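For completeness, here is a scalar caricature of that last step in Python: the ensemble covariance between climate sensitivity and LGM temperature is what does the ‘dragging’. The global-mean framing is a simplification for illustration; the real update uses spatial fields and the inflated covariances described above.

import numpy as np

def updated_sensitivity(sens, lgm, recon_mean, recon_var, inflation=1.0):
    """Drag the ensemble-mean climate sensitivity towards reality using the
    (inflated) ensemble covariance between sensitivity and LGM temperature.
    'sens' and 'lgm' hold one value per screened ensemble member."""
    sens = np.asarray(sens, dtype=float)
    lgm = np.asarray(lgm, dtype=float)
    lgm_var = inflation ** 2 * lgm.var(ddof=1)
    cross_cov = inflation ** 2 * np.cov(sens, lgm, ddof=1)[0, 1]
    gain = cross_cov / (lgm_var + recon_var)
    # If the ensemble is colder than the reconstruction at the LGM, and
    # colder LGMs go with higher sensitivities across the ensemble, then
    # the sensitivity is dragged downwards (and vice versa).
    return sens.mean() + gain * (recon_mean - lgm.mean())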

I’m afraid the result itself is going to be a cliffhanger. As I said at the top, I want to talk about the method without being distracted by our preliminary result. But if you’ve got this far…thank you for persevering through my exploratory explanations of some state-of-the-art statistics in climate prediction.

Just as I post this, I am beginning my travels home from Vienna, so apologies if comments get stuck in moderation while I am offline.

Update: I’ve fixed the link to the Rougier et al. manuscript.

Caveat 1. Please note that my descriptions may be a bit over-simplified or, to use the technical term, “hand-wavy”. Our method is slightly different from the statistics manuscript I linked to above, but near enough to be worth reading if you want the technical details. If anyone is keen to see my incomprehensible and stuffed-to-bursting slides, I’ve put them on my Academia.edu page. I’ve hidden the final result of climate sensitivity (and the discussion of it)…

Caveat 2. This work is VERY PRELIMINARY, so don’t tell anyone, ok? Also please be kind – I stayed up too late last night writing this, purely because I am all excited about it.

* Not listed on the PalaeoQUMP website is Ben Booth (who has commented here about his aerosol paper), an honorary member who helped me a lot with the climate modelling.

** N.B. if you want to use the pollen data, contact Pat “Bart” Bartlein for a new version because the old files have a few points with “screwed up missing data codes”, as he put it. These are obvious because the uncertainties are something like 600 degrees.***

*** No jokes about palaeoclimate reconstruction uncertainties please.

How to be Engaging

I’ve started writing my promised post on models used in climate science, but thought I’d get this more topical post out first.

I went to an interesting conference session yesterday on communicating climate science, convened by Asher Minns (Tyndall Centre), Joe Smith (Open University), and Lorraine Whitmarsh (Cardiff University). A few people presented their research into different practices, and the speakers and convenors discussed audience questions afterwards. Paul Stapleton has also blogged about the session here.

A good stand-out point was presented by Mathieu Jahnich: research has found that, in communicating climate science, the public prefer hopeful campaigns to shocking images or negative, hopeless ones. I think most of us instinctively know this.

Hebba Haddad, a PhD student from the University of Exeter, spoke on topics close to my heart: the effect of communicating uncertainties in climate science, and the effect of the ‘voice’ in which it is presented. The first relates to the amount of information given about the uncertainty in a prediction: for example, saying “60-80% probability” rather than “70% probability”. The second relates to the phrasing: for example, using the warmer, more friendly and open phrasing of “We…” on an institute website, rather than the cooler, more distant “The centre…”.

She pointed out that scientists, of course, often attempt to transfer as much information as possible (the deficit model: a view that if only enough information were given, people would make rational decisions…), highlight the uncertainties, and use technical language. Science communicators, on the other hand, are more likely to understand their audience, understate uncertainties, convey simpler messages, and use a warmer, friendlier style.

Hebba carried out a study on 152 psychology students. The standout results for me were that:

  1. greater communication of uncertainty reduced belief in climate science;
  2. if little uncertainty is communicated, then the tone makes little difference to the level of engagement;
  3. if a lot of uncertainty is communicated, then a warm tone leads to much greater engagement than a distant tone.

This makes sense: if there is a lot of uncertainty, people use heuristics (short-cuts) to determine their trust in information. These particular students responded well to a personal, friendly tone. And in a later session, someone made the distinction between “relational trust”, which is based on similarity of intentions or values, and “calculative trust”, or “confidence”, which is based on past behaviour. They said that in everyday situations people tend to make decisions based on calculative trust, but in unfamiliar situations they use relational trust: another heuristic in times of uncertainty.

But this is interesting, because I think a large part of the audience who visit this blog (thank you) contradict these findings. Your trust in the science increases the more I talk about uncertainty! And I think you place greater importance in “calculative” rather than “relational” trust. In other words, you use the past behaviour of the scientist as a measure of trust, not similarity in values. I’ve found that whenever I talk about limitations of modelling, or challenge statements about climate science and impacts that I believe are not robust, my “trust points” go up because it demonstrates transparency and honesty. (See previous post for squandering of some of those points…). Using a warm, polite tone helps a lot, which supports Hebba’s findings. But I would wager that the degree of similarity to my audience is much less important than my ability to demonstrate trustworthiness.

Lorraine commented that Hebba’s finding of the importance of a warm tone is a challenge for scientists, who are used to talking (particularly writing) in a passive tone: “It was found that…” rather than “We found…”. To combat this, and increase public trust, Joe urged climate scientists to be “energetic digital scholars”, “open” and “public.” He thought we should not try to present climate science as “fact” but as “ambitious, unfolding, and uncertain”.

A US scientist in the audience asked for advice on how to engage online in such a polarised debate, and another audience member asked if giving simple messages (without all uncertainties) might compromise public trust in scientists. Joe kindly invited me to comment on these social media and uncertainty aspects. I speedily dumped the contents of my brain onto the room about how this blog and related efforts, giving a transparent, warts-and-all view of science as an unfolding process, had been very successful in increasing trust. In fact I had so much to say that I was asked to stop, would you believe (er, perhaps you would…).

For those of you that don’t trust the IPCC too much, I merely note that Jean-Pascal van Ypersele tapped me on the shoulder after I spoke about the importance of communicating uncertainties transparently, and asked me to email him the blog link…

Some tweeting about the session led to some lovely supportive messages from across the spectrum of opinions (thank you) and also some criticisms by people you might expect to be supportive. I’ve Storified these below.

And finally, Leo Hickman welcomes our ‘Rapunzel’ approach to communication. I was one of the invited palaeoclimate scientists at that meeting (actually, I probably invited myself), and can confirm it was very civil and productive.


Storify of the post-session Twitter conversation:

http://storify.com/flimsin/engaging