Thanks for a comprehensive summary—that was helpful.
It seems that A&G contacted working climate scientists to identify papers which (in the scientists’ view) contained the most credible climate forecasts. Not many responded, but 30 referred to the then-recent IPCC WG1 report, which in turn referenced and attempted to summarize over 700 primary papers. A number of other papers were apparently cited by the surveyed scientists as well, but the site has lost the list. So we’re somewhat at a loss to decide which primary sources climate scientists find most credible/authoritative. (Which is a pity, because those would surely be worth rating?)
However, A&G did their rating/scoring on IPCC WG1, Chapter 8. But they didn’t contact the climate scientists to help with this rating (or they did, but none of them answered?). They didn’t attempt to dig into the 700 or so underlying primary papers, identify which of them contained climate forecasts and/or had been singled out by the scientists as containing the most credible forecasts, and then rate those. Or even pick a random sample and rate those. All that does sound just a tad superficial.
What I find really bizarre is their site’s conclusion that because IPCC got a low score by their preferred rating principles, then a “no change” forecast is superior, and more credible! That’s really strange, since “no change” has historically done much worse as a predictor than any of the IPCC models.
We sent out general calls for experts to use the Forecasting Audit Software to conduct their own audits and we also asked a few individuals to do so. At the time of writing, none have done so.
It’s not clear how much effort they put into this step, or whether, for example, they offered the Forecasting Audit Software for free to the people they asked (if they were trying to sell the software, which they themselves created, that would have looked bad).
My guess is that most of the climate scientists they contacted just labeled them mentally along with the numerous “cranks” they usually have to deal with, and didn’t bother engaging.
I’m also skeptical of some aspects of Armstrong and Green’s exercise. But a first outside-view analysis that doesn’t receive much useful engagement from insiders can only go so far. It would have been interesting if, after Armstrong and Green published their analysis and it was clear that their critique would receive attention, climate scientists had offered a clearer and more direct response to the specific criticisms, and perhaps even read up on the forecasting principles and the evidence cited for them. I don’t think all climate scientists should have done so; I just think at least a few should have been interested enough to do it. Even something similar to Nate Silver’s response would have been nice. And maybe that did happen (if so, I’d like to see links). Schmidt’s response, on the other hand, seems downright careless and bad.
My focus here is the critique of insularity, not so much its effect on the factual conclusions. Basically: did climate scientists carefully consider forecasting principles (or statistical methods, or software engineering principles) and then reject them? Had they never heard of the relevant principles? Did they hear about the principles but dismiss them as unworthy of investigation? Armstrong and Green’s audit may have been sloppy (though perhaps a first pass shouldn’t be expected to be better than sloppy), but even if the audit itself wasn’t much use, did it raise questions or general directions of inquiry worth investigating (or worth a simple response pointing to past investigation)? Schmidt’s reaction seems like evidence in favor of the dismissal hypothesis. In this particular instance maybe he was right, but it does fit the general pattern of insularity.
(Your quote is mangled: you probably have four spaces at the beginning, which makes the rendering engine interpret it as needing to be formatted like code, i.e. no line breaks.)
Actually, it’s somewhat unclear whether the IPCC scenarios did better than a “no change” model—it is certainly true over the short time period, but perhaps not over a longer time period where temperatures had moved in other directions.
Co-author Green later wrote a paper claiming that the IPCC models did not do better than the no-change model when tested over a broader time period:

http://www.kestencgreen.com/gas-improvements.pdf

But it’s just a draft, and I don’t know whether the author plans to clean it up or get it published.
I would really like to see more calibrations and scorings of the models from a pure outside view approach over longer time periods.
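To make the outside-view scoring concrete, here is a toy sketch of the kind of exercise I mean: a rolling out-of-sample comparison of a no-change forecast against a simple linear-trend extrapolation. The data here is synthetic and purely illustrative (a real audit would use an observed temperature series), and the fit-window and horizon choices are arbitrary assumptions.

```python
import math

# Sketch: outside-view scoring of a "no change" forecast against a simple
# linear-trend extrapolation. The series below is synthetic; a real audit
# would use an observed temperature record.

def mae(errors):
    return sum(abs(e) for e in errors) / len(errors)

def score(series, horizon, fit_window=30):
    """Rolling out-of-sample test: at each origin, forecast `horizon` steps
    ahead with (a) no change and (b) a least-squares trend fitted to the
    preceding `fit_window` points; return the MAE of each."""
    nc_err, tr_err = [], []
    for t in range(fit_window, len(series) - horizon):
        window = series[t - fit_window:t]
        # (a) no-change forecast: the last observed value persists
        nc_err.append(series[t + horizon] - window[-1])
        # (b) ordinary least-squares slope over the fit window
        n = fit_window
        xbar = (n - 1) / 2
        ybar = sum(window) / n
        slope = sum((x - xbar) * (y - ybar) for x, y in enumerate(window)) / \
                sum((x - xbar) ** 2 for x in range(n))
        tr_err.append(series[t + horizon] - (window[-1] + slope * (horizon + 1)))
    return mae(nc_err), mae(tr_err)

# Synthetic series: a steady warming trend plus a small cyclic wiggle.
temps = [0.02 * t + 0.02 * math.sin(t / 5.0) for t in range(120)]

nc, tr = score(temps, horizon=10)
# On a trending series, the trend model beats no-change at long horizons.
```

This is only the "broadest brush" kind of test, of course; the interesting question is how the real models and the real record score under it over long periods.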
Armstrong was (perhaps wrongly) confident enough in his views that he made a public bet that the No Change scenario would beat the IPCC scenario. The bet is described at:

http://www.theclimatebet.com/
Overall, I have high confidence that models informed by some knowledge of climate should beat the No Change model, though a lot depends on the details of how the competition is framed (Armstrong’s climate bet may have been rigged in favor of No Change). That said, it’s not clear how well climate models do relative to simple time-series forecasting approaches, or to simple (linear trend from radiative forcing + cyclic trend from ocean currents) type approaches. There don’t seem to be enough independent out-of-sample validations, and the marginal predictive power of complex models over simple curve-fitting models seems low (possibly even negative). So I think arguments of the form “our most complex, sophisticated models show X” should be treated with suspicion, and should not necessarily be given more credence than arguments relying on simple models and historical observations.
Actually, it’s somewhat unclear whether the IPCC scenarios did better than a “no change” model—it is certainly true over the short time period, but perhaps not over a longer time period where temperatures had moved in other directions.
There are certainly periods when temperatures moved in a negative direction (1940s-1970s), but then the radiative forcings over those periods (combination of natural and anthropogenic) were also negative. So climate models would also predict declining temperatures, which indeed is what they do “retrodict”. A no-change model would be wrong for those periods as well.
Your most substantive point is that the complex models don’t seem to be much more accurate than a simple forcing model (e.g. calculate net forcings from solar and various pollutant types, multiply by best estimate of climate sensitivity, and add a bit of lag since the system takes time to reach equilibrium; set sensitivity and lags empirically). I think that’s true on the “broadest brush” level, but not for regional and temporal details e.g. warming at different latitudes, different seasons, land versus sea, northern versus southern hemisphere, day versus night, changes in maximum versus minimum temperatures, changes in temperature at different levels of the atmosphere etc. It’s hard to get those details right without a good physical model of the climate system and associated general circulation model (which is where the complexity arises). My understanding is that the GCMs do largely get these things right, and make predictions in line with observations; much better than simple trend-fitting.
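The "simple forcing model" described above can be sketched in a few lines: equilibrium response equals climate sensitivity times net forcing, approached with a first-order lag. The sensitivity and lag values, and the stylized forcing path, are placeholder assumptions for illustration, not fitted estimates.

```python
# Minimal sketch of the simple forcing model described above. The numbers
# (sensitivity, lag, forcing path) are illustrative placeholders.

def simple_forcing_model(forcings, sensitivity=0.8, lag_years=10.0):
    """forcings: net radiative forcing per year (W/m^2).
    sensitivity: degrees C per (W/m^2) at equilibrium (placeholder).
    lag_years: e-folding time to approach equilibrium.
    Returns modeled temperature anomalies, one per year."""
    temp = 0.0
    out = []
    for f in forcings:
        equilibrium = sensitivity * f             # where temperature is headed
        temp += (equilibrium - temp) / lag_years  # relax toward equilibrium
        out.append(temp)
    return out

# Stylized 20th-century forcing path: flat, then negative mid-century
# (aerosols), then rising (greenhouse gases). Purely illustrative numbers.
forcing = [0.0] * 40 + [-0.3] * 30 + [0.05 * k for k in range(1, 31)]
anomalies = simple_forcing_model(forcing)
# Mid-century cooling followed by late-century warming: the "retrodiction"
# point above, which a no-change model gets wrong in both directions.
```

Even this toy version reproduces the qualitative mid-century dip and late-century rise, which is exactly why it is hard to beat on the broadest-brush level; the regional and seasonal details are where the GCMs earn their complexity.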
P.S. If I draw one supportive conclusion from this discussion, it is that long-range climate forecasts are very likely to be wrong, simply because the inputs (radiative forcings) are impossible to forecast with any degree of accuracy.
Even if we’d had perfect GCMs in 1900, forecasts for the 20th century would likely have been very wrong: no one could have predicted the relative balance of CO2, other greenhouse gases, and sulfates/aerosols (e.g. no one could have guessed the pattern of sudden sulfate growth after the 1940s, followed by levelling off after the 1970s). And natural factors like solar cycles, volcanoes, and El Niño/La Niña wouldn’t have been predictable either.
Similarly, changes in the 21st century could be very unexpected. Perhaps some new industrial process creates brand new pollutants with negative radiative forcing in the 2030s; but then the Amazon dies off in the 2040s, followed by a massive methane belch from the Arctic in the 2050s; then emergency geo-engineering goes into fashion in the 2070s (and out again in the 2080s); then in the 2090s there is a resurgence in coal, because the latest generation of solar panels has been discovered to be causing a weird new plague. Temperatures could be up and down like a yo-yo all century.
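A toy simulation makes the point: hold the response model fixed ("perfect") and vary only the forcing path, and the century-end forecasts still diverge widely. Both ingredients here (the first-order response model and the random-walk forcing process) are illustrative assumptions, not calibrated to anything.

```python
import random

# Sketch: even with a fixed response model, unpredictable forcing paths
# alone spread century-end temperature forecasts widely.

def respond(forcings, sensitivity=0.8, lag_years=10.0):
    """First-order lagged response to a net-forcing path (toy model)."""
    temp, out = 0.0, []
    for f in forcings:
        temp += (sensitivity * f - temp) / lag_years
        out.append(temp)
    return out

def random_forcing_path(years=100, step=0.1, seed=None):
    """Random-walk forcing: stand-in for unpredictable emissions,
    volcanoes, new pollutants, geo-engineering fashions, etc."""
    rng = random.Random(seed)
    f, path = 0.0, []
    for _ in range(years):
        f += rng.uniform(-step, step)
        path.append(f)
    return path

# Century-end temperatures across 200 equally plausible forcing scenarios.
finals = [respond(random_forcing_path(seed=s))[-1] for s in range(200)]
spread = max(finals) - min(finals)
# The spread comes entirely from the inputs, not from model error.
```

The spread across scenarios dwarfs any plausible gain from refining the response model itself, which is the sense in which long-range forecasts are hostage to their inputs.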