Since you are using Real Climate and Skeptical Science as sources, did you read what they had to say about the Armstrong and Green paper and about Nate Silver’s chapter?
Gavin Schmidt’s post was short, funny but rude; however ChrisC’s comment looks much more damning if true. Is it true?
Here is Skeptical Science on Nate Silver.
It seems the main cause of error in Hansen’s 1988 forecast was an assumed climate sensitivity higher than that of more recent models and calculations (4.2 degrees per CO2 doubling rather than 3 degrees). The IPCC’s 1990 forecast, by contrast, had problems predicting the inputs to global warming (the amount of emissions, or the radiative forcing for a given level of emissions) rather than the outputs (the resulting warming). Redoing the accounting for these factors removes nearly all of the discrepancy.
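As a rough back-of-the-envelope illustration of that sensitivity rescaling (the trend figure below is made up for illustration, not taken from the Skeptical Science post):

```python
# Rescale a projected warming trend by the ratio of climate sensitivities.
hansen_sensitivity = 4.2   # deg C per CO2 doubling assumed in Hansen's 1988 runs
modern_sensitivity = 3.0   # deg C per doubling, closer to more recent estimates

projected_trend = 0.30     # hypothetical projected warming, deg C per decade
rescaled_trend = projected_trend * (modern_sensitivity / hansen_sensitivity)
print(round(rescaled_trend, 2))   # about 0.21 deg C per decade, roughly 30% lower
```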
In light of the portions I quoted from Armstrong and Green’s paper, I’ll look at Gavin Schmidt’s post:
Principle 1: When moving into a new field, don’t assume you know everything about it because you read a review and none of the primary literature.
Score: −2
G+A appear to have only read one chapter of the IPCC report (Chap 8), and an un-peer reviewed hatchet job on the Stern report. Not a very good start…
The paper cites many sources besides the IPCC report and the “hatchet job” on the Stern Report, including sources that evaluate climate models and their quality in general. ChrisC notes that the authors fail to cite the ~788 references for IPCC Chapter 8. The authors claim to have a bibliography on their website that includes the full list of references given to them by all academics who suggested references. Unfortunately, as I noted in my earlier comment, the link to the bibliography from http://www.forecastingprinciples.com/index.php?option=com_content&view=article&id=78&Itemid=107 is broken. This doesn’t reflect well on the authors (the site as a whole is a mess, with many broken links). Assuming, however, that the authors had put up the bibliography and that it was available as promised in the paper, this critique seems off the mark (though I’d have to see the bibliography to know for sure).
Principle 2: Talk to people who are doing what you are concerned about.
Score: −2
Of the roughly 20 climate modelling groups in the world, and hundreds of associated researchers, G+A appear to have talked to none of them. Strike 2.
This seems patently false given the contents of the paper as I quoted it, and the list of experts that they sought. In fact, it seems like such a major error that I have no idea how Schmidt could have made it if he’d read the paper. (Perhaps he had a more nuanced critique to offer, e.g., that the authors’ survey didn’t ask enough questions, or they should have tried harder, or contacted more people. But the critique as offered here smacks of incompetence or malice). [Unless Schmidt was reading an older version of the paper that didn’t mention the survey at all. But I doubt that even if he was looking at an old version of the paper, it omitted all references to the survey.]
Principle 3: Be humble. If something initially doesn’t make sense, it is more likely that you’ve mis-understood than the entire field is wrong.
Score: −2
For instance, G+A appear to think that climate models are not tested on ‘out of sample’ data (they gave that a ‘-2′). On the contrary, the models are used for many situations that they were not tuned for, paleo-climate changes (mid Holocene, last glacial maximum, 8.2 kyr event) being a good example. Similarly, model projections for the future have been matched with actual data – for instance, forecasting the effects of Pinatubo ahead of time, or Hansen’s early projections. The amount of ‘out of sample’ testing is actually huge, but the confusion stems from G+A not being aware of what the ‘sample’ data actually consists of (mainly present day climatology). Another example is that G+A appear to think that GCMs use the history of temperature changes to make their projections since they suggest leaving some of it out as a validation. But this is just not so, as we discussed more thoroughly in a recent thread.
First off, retrospective “predictions” of things that people already tacitly know are not that reliable a test, even when those things aren’t explicitly used in tuning the models.
Secondly, it’s possible (and likely) that Armstrong and Green missed some out-of-sample tests and validations that had been performed in the climate science arena. While part of this can be laid at their feet, part of it also reflects poor documentation by climate scientists of exactly how they were going about their testing. I did read the same IPCC AR4 chapter that Armstrong and Green did, and I found it quite unclear on the forecasting side of things (compared to other papers I’ve read that judge forecast skill in weather and short-term climate forecasting, macroeconomic forecasting, and business forecasting). This is similar to the sloppy code problem.
Thirdly, the climate scientists whom Armstrong and Green attempted to engage could have engaged more in return (not Gavin Schmidt’s fault; he wasn’t included in the list, and the response rate appears to have been low from mainstream scientists as well as skeptics, so it’s not just a problem of the climate science mainstream).
Overall, I’d like to know more details of the survey responses and Armstrong and Green’s methodology, and it would be good if they combined their proclaimed commitment to openness with actually having working links on their websites. But Schmidt’s critique doesn’t reflect too well on him, even if Armstrong and Green were wrong.
Now, to ChrisC’s comment:
Call me crazy, but in my field of meteorology, we would never head to popular literature, much less the figgin internet, in order to evaluate the state of the art in science. You head to the scientific literature first and foremost. Since meteorology and climatology are not that different, I would struggle to see why it would be any different.
The authors also seem to put a large weight on “forecasting principles” developed in different fields. While there may be some valuable advice, and cross-field cooperation is to be encouraged, one should not assume that techniques developed in say, econometrics, port directly into climate science.
The authors also make much of a wild goose chase on google for sites matching their specific phrases, such as “global warming” AND “forecast principles”. I’m not sure what a lack of web sites would prove. They also seem to have skiped most of the literature cited in AR4 ch. 8 on model validation and climatology predictions.
Part of the authors’ criticism was that the climate science mainstream hadn’t paid enough attention to forecasting, or to formal evaluations of forecasting. So it’s natural that they didn’t find enough mainstream stuff to cite that was directly relevant to the questions at hand for them.
As for the Google search and Google Scholar search, these are standard tools for initiating an inquiry. I know, I’ve done it, and so has everybody else. It would be damning if the authors had relied only on such searches. But they surveyed climate scientists and worked their way through the IPCC Working Group Report. This may have been far short of full due diligence, but it isn’t anywhere near as sloppy as Gavin Schmidt and ChrisC make it sound.
Thanks for a comprehensive summary—that was helpful.
It seems that A&G contacted the working scientists to identify papers which (in the scientists’ view) contained the most credible climate forecasts. Not many responded, but 30 referred to the recent (at the time) IPCC WG1 report, which in turn referenced and attempted to summarize over 700 primary papers. There also appear to have been a bunch of other papers cited by the surveyed scientists, but the site has lost them. So we’re somewhat at a loss to decide which primary sources climate scientists find most credible/authoritative. (Which is a pity, because those would be worth rating, surely?)
However, A&G did their rating/scoring on the IPCC WG1 report, Chapter 8, and they didn’t contact the climate scientists to help with this rating (or they did, but none of them answered?). They didn’t attempt to dig into the 700 or so underlying primary papers, identify which of them contained climate forecasts (or had been identified by the scientists as containing the most credible forecasts), and then rate those. They didn’t even pick a random sample and rate that. All of that does sound just a tad superficial.
What I find really bizarre is their site’s conclusion that because the IPCC got a low score by their preferred rating principles, a “no change” forecast is therefore superior and more credible! That’s really strange, since “no change” has historically done much worse as a predictor than any of the IPCC models.
We sent out general calls for experts to use the Forecasting Audit Software to conduct their own audits and we also asked a few individuals to do so. At the time of writing, none have done so.
It’s not clear how much effort they put into this step, and whether e.g. they offered the Forecasting Audit Software for free to people they asked (if they were trying to sell the software, which they themselves created, that might have seemed bad).
My guess is that most of the climate scientists they contacted just labeled them mentally along with the numerous “cranks” they usually have to deal with, and didn’t bother engaging.
I also am skeptical of some aspects of Armstrong and Green’s exercise. But a first outside-view analysis that doesn’t receive much useful engagement from insiders can only go so far. What would have been interesting is if, after Armstrong and Green published their analysis and it was somewhat clear that their critique would receive attention, climate scientists had offered a clearer and more direct response to the specific criticisms, and perhaps even read up more on the forecasting principles and the evidence cited for them. I don’t think all climate scientists should have done so; I just think at least a few should have been interested enough to do it. Even something similar to Nate Silver’s response would have been nice. And maybe that did happen—if so, I’d like to see links. Schmidt’s response, on the other hand, seems downright careless and bad.
My focus here is the critique of insularity, not so much the effect it had on the factual conclusions. Basically, did climate scientists carefully consider forecasting principles (or statistical methods, or software engineering principles) and then reject them? Had they never heard of the relevant principles? Did they hear about the principles, but dismiss them as unworthy of investigation? Armstrong and Green’s audit may have been sloppy (though perhaps a first pass shouldn’t be expected to be better than sloppy), but even if the audit itself wasn’t much use, did it raise questions or general directions of inquiry worthy of investigation (or of a simple response pointing to past investigation)? Schmidt’s reaction seems like evidence in favor of the dismissal hypothesis. And in this particular instance, maybe he was right, but it does seem to fit the general pattern of insularity.
(Your quote is mangled; you probably have four spaces at the beginning, which makes the rendering engine interpret it as needing to be formatted like code, i.e. no line breaks.)
Thanks, fixed!
Actually, it’s somewhat unclear whether the IPCC scenarios did better than a “no change” model—it is certainly true over the short time period, but perhaps not over a longer time period where temperatures had moved in other directions.
Co-author Green later wrote a paper claiming that the IPCC models did not do better than the no change model when tested over a broader time period:
http://www.kestencgreen.com/gas-improvements.pdf
But it’s just a draft paper and I don’t know if the author ever plans to clean it up or have it published.
I would really like to see more calibrations and scorings of the models from a pure outside view approach over longer time periods.
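To make that concrete, here is a minimal sketch of the kind of outside-view scoring I have in mind: compare a no-change forecast against a simple trend extrapolation at a fixed horizon, over many forecast origins. The function name, the linear-trend comparator, and the made-up data are my own illustration, not anything from Green’s paper or the IPCC.

```python
import numpy as np

def rolling_forecast_errors(temps, horizon=10, window=30):
    """Score a 'no change' forecast against a simple linear-trend forecast.

    temps: 1-D array of annual temperature anomalies.
    At each forecast origin, both methods use only the preceding `window`
    years and are scored on the value `horizon` years ahead.
    Returns (mean abs error of no-change, mean abs error of trend).
    """
    nc_errors, trend_errors = [], []
    for origin in range(window, len(temps) - horizon + 1):
        history = temps[origin - window:origin]
        actual = temps[origin + horizon - 1]
        nc_forecast = history[-1]  # no change: the last observed value persists
        slope, intercept = np.polyfit(np.arange(window), history, 1)
        trend_forecast = intercept + slope * (window - 1 + horizon)
        nc_errors.append(abs(actual - nc_forecast))
        trend_errors.append(abs(actual - trend_forecast))
    return float(np.mean(nc_errors)), float(np.mean(trend_errors))

# Example with made-up data: a noisy upward trend.
rng = np.random.default_rng(0)
fake_temps = 0.01 * np.arange(150) + rng.normal(0, 0.1, 150)
print(rolling_forecast_errors(fake_temps))
```

The interesting question is how real models and real observations fare under this kind of scoring when the window and horizon are stretched to many decades.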
Armstrong was (perhaps wrongly) confident enough of his views that he decided to make a public bet claiming that the No Change scenario would beat out the IPCC warming scenario. The bet is described at:
http://www.theclimatebet.com/
Overall, I have high confidence in the view that forecasting models informed by some knowledge of climate should beat the No Change model, though a lot depends on the details of how the competition is framed (Armstrong’s climate bet may have been rigged in favor of No Change). That said, it’s not clear how well climate models can do relative to simple time series forecasting approaches or simple (linear trend from radiative forcing + cyclic trend from ocean currents) type approaches. The number of independent out-of-sample validations does not seem sufficient, and the marginal predictive power of complex models over simple curve-fitting models seems to be low (probably negative). So I think that arguments of the form “our most complex, sophisticated models show X” should be treated with suspicion and should not necessarily be given more credence than arguments that rely on simple models and historical observations.
Actually, it’s somewhat unclear whether the IPCC scenarios did better than a “no change” model—it is certainly true over the short time period, but perhaps not over a longer time period where temperatures had moved in other directions.
There are certainly periods when temperatures moved in a negative direction (1940s-1970s), but then the radiative forcings over those periods (combination of natural and anthropogenic) were also negative. So climate models would also predict declining temperatures, which indeed is what they do “retrodict”. A no-change model would be wrong for those periods as well.
Your most substantive point is that the complex models don’t seem to be much more accurate than a simple forcing model (e.g. calculate net forcings from solar and various pollutant types, multiply by best estimate of climate sensitivity, and add a bit of lag since the system takes time to reach equilibrium; set sensitivity and lags empirically). I think that’s true on the “broadest brush” level, but not for regional and temporal details e.g. warming at different latitudes, different seasons, land versus sea, northern versus southern hemisphere, day versus night, changes in maximum versus minimum temperatures, changes in temperature at different levels of the atmosphere etc. It’s hard to get those details right without a good physical model of the climate system and associated general circulation model (which is where the complexity arises). My understanding is that the GCMs do largely get these things right, and make predictions in line with observations; much better than simple trend-fitting.
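For concreteness, the kind of simple forcing model described above could be sketched like this. It is a toy illustration only; the function name and parameter values are my assumptions, not fitted estimates from any paper.

```python
import numpy as np

def simple_forcing_response(net_forcing, sensitivity=0.8, lag_years=5.0):
    """Toy forcing-times-sensitivity model with a lag for ocean thermal inertia.

    net_forcing: annual net radiative forcing in W/m^2 (sum of solar, greenhouse
                 gas and aerosol terms, relative to some baseline).
    sensitivity: deg C of eventual warming per W/m^2 of sustained forcing
                 (0.8 corresponds very roughly to 3 deg C per CO2 doubling).
    lag_years:   e-folding time of an exponential smoother, standing in for the
                 slow ocean response.
    Returns the modelled global-mean temperature anomaly for each year.
    """
    alpha = 1.0 / lag_years
    lagged = np.zeros(len(net_forcing))
    for t in range(1, len(net_forcing)):
        # Relax the "effective" forcing toward the actual forcing with time constant lag_years.
        lagged[t] = lagged[t - 1] + alpha * (net_forcing[t] - lagged[t - 1])
    return sensitivity * lagged

# Example with made-up forcing: a slow ramp plus a brief volcanic dip.
forcing = np.linspace(0.0, 2.5, 120)
forcing[60:63] -= 2.0
print(simple_forcing_response(forcing)[-1])
```

Getting the regional and seasonal details right is exactly what a sketch like this cannot do, which is where the GCMs earn their complexity.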
P.S. If I draw one supportive conclusion from this discussion, it is that long-range climate forecasts are very likely to be wrong, simply because the inputs (radiative forcings) are impossible to forecast with any degree of accuracy.
Even if we’d had perfect GCMs in 1900, forecasts for the 20th century would likely have been very wrong: no one could have predicted the relative balance of CO2, other greenhouse gases and sulfates/aerosols (e.g. no-one could have guessed the pattern of sudden sulfates growth after the 1940s, followed by levelling off after the 1970s). And natural factors like solar cycles, volcanoes and El Niño/La Nina wouldn’t have been predictable either.
Similarly, changes in the 21st century could be very unexpected. Perhaps some new industrial process creates brand new pollutants with negative radiative forcing in the 2030s; but then the Amazon dies off in the 2040s, followed by a massive methane belch from the Arctic in the 2050s; then emergency geo-engineering goes into fashion in the 2070s (and out again in the 2080s); then in the 2090s there is a resurgence in coal, because the latest generation of solar panels has been discovered to be causing a weird new plague. Temperatures could be up and down like a yo-yo all century.
Here’s a full list of the scientists that Armstrong and Green contacted—the ones who sent a “useful response” are noted parenthetically. Note that 51 of the 240 experts contacted sent useful responses, 42 of whom provided references to what they regarded as credible sources of forecasts.
IPCC Working Group 1
Myles Allen, Richard Alley, Ian Allison, Peter Ambenje, Vincenzo Artale, Paulo Artaxo, Alphonsus Baede, Roger Barry, Terje Berntsen, Richard A. Betts, Nathaniel L. Bindoff, Roxana Bojariu, Sandrine Bony, Kansri Boonpragob, Pascale Braconnot, Guy Brasseur, Keith Briffa, Aristita Busuioc, Jorge Carrasco, Anny Cazenave, Anthony Chen (useful response), Amnat Chidthaisong, Jens Hesselbjerg Christensen, Philippe Ciais (useful response), William Collins, Robert Colman (useful response), Peter Cox, Ulrich Cubasch, Pedro Leite Da Silva Dias, Kenneth L. Denman, Robert Dickinson, Yihui Ding, Jean-Claude Duplessy,
David Easterling, David W. Fahey, Thierry Fichefet (useful response), Gregory Flato, Piers M. de F. Forster (useful response), Pierre Friedlingstein, Congbin Fu, Yoshiyuki Fuji, John Fyfe, Xuejie Gao, Amadou Thierno Gaye (useful response), Nathan Gillett (useful response), Filippo Giorgi, Jonathan Gregory (useful response), David Griggs, Sergey Gulev, Kimio Hanawa, Didier Hauglustaine, James Haywood, Gabriele Hegerl (useful response), Martin Heimann (useful response), Christoph Heinze, Isaac Held (useful response), Bruce Hewitson, Elisabeth Holland, Brian Hoskins, Daniel Jacob, Bubu Pateh Jallow, Eystein Jansen (useful response), Philip Jones, Richard Jones, Fortunat Joos, Jean Jouzel, Tom Karl, David Karoly (useful response),
Georg Kaser, Vladimir Kattsov, Akio Kitoh, Albert Klein Tank, Reto Knutti, Toshio Koike, Rupa Kumar Kolli, Won-Tae Kwon, Laurent Labeyrie, René Laprise, Corrine Le Quéré, Hervé Le Treut, Judith Lean, Peter Lemke, Sydney Levitus, Ulrike Lohmann, David C. Lowe, Yong Luo, Victor Magaña Rueda, Elisa Manzini, Jose Antonio Marengo, Maria Martelo, Valérie Masson-Delmotte, Taroh Matsuno, Cecilie Mauritzen, Bryant Mcavaney, Linda Mearns, Gerald Meehl, Claudio Guillermo Menendez, John Mitchell, Abdalah Mokssit, Mario Molina, Philip Mote (useful response), James Murphy, Gunnar Myhre, Teruyuki Nakajima, John Nganga, Neville Nicholls, Akira Noda, Yukihiro Nojiri, Laban Ogallo, Daniel Olago, Bette Otto-Bliesner, Jonathan Overpeck (useful response), Govind Ballabh Pant, David Parker, Wm. Richard Peltier, Joyce Penner (useful response),
Thomas Peterson (useful response), Andrew Pitman, Serge Planton, Michael Prather (useful response), Ronald Prinn, Graciela Raga, Fatemeh Rahimzadeh, Stefan Rahmstorf, Jouni Räisänen, Srikanthan (S.) Ramachandran, Veerabhadran Ramanathan, Venkatachalam Ramaswamy, Rengaswamy Ramesh, David Randall (useful response), Sarah Raper, Dominique Raynaud, Jiawen Ren,
James A. Renwick, David Rind, Annette Rinke, Matilde M. Rusticucci, Abdoulaye Sarr, Michael Schulz (useful response), Jagadish Shukla, C. K. Shum, Robert H. Socolow (useful response), Brian Soden, Olga Solomina (useful response), Richard Somerville (useful response), Jayaraman Srinivasan, Thomas Stocker, Peter A. Stott (useful response), Ron Stouffer, Akimasa Sumi, Lynne D. Talley, Karl E. Taylor (useful response), Kevin Trenberth (useful response), Alakkat S. Unnikrishnan, Rob Van Dorland, Ricardo Villalba, Ian G. Watterson (useful response), Andrew Weaver (useful response), Penny Whetton, Jurgen Willebrand, Steven C. Wofsy, Richard A. Wood, David Wratt, Panmao Zhai, Tingjun Zhang, De’er Zhang, Xiaoye Zhang, Zong-Ci Zhao, Francis Zwiers (useful response)
Union of Concerned Scientists
Brenda Ekwurzel, Peter Frumhoff, Amy Lynd Luers
Channel 4 “The Great Global Warming Swindle” documentary (2007)
Bert Bolin, Piers Corbyn (useful response), Eigil Friis-Christensen, James Shikwati, Frederick Singer,
Carl Wunsch (useful response)
Wikipedia’s list of global warming “skeptics”
Khabibullo Ismailovich Abdusamatov (useful response), Syun-Ichi Akasofu (useful response), Sallie Baliunas, Tim Ball, Robert Balling (useful response), Fred Barnes, Joe Barton, Joe Bastardi, David Bellamy, Tom Bethell, Robert Bidinotto, Roy Blunt, Sonja Boehmer, Andrew Bolt, John Brignell (useful response), Nigel Calder, Ian Castles (useful response), George Chilingarian, John Christy (useful response), Ian Clark, Philip Cooney, Robert Davis, David Deming (useful response), David Douglass, Lester Hogan, Craig Idso, Keith Idso, Sherwood Idso, Zbigniew Jaworowski, Wibjorn Karlen, William Kininmonth, Nigel Lawson, Douglas Leahey, David Legates, Richard Lindzen (useful response), Ross McKitrick (useful response),
Patrick Michaels, Lubos Motl (useful response), Kary Mullis, Tad Murty, Tim Patterson, Benny Peiser (useful response), Ian Plimer, Arthur Robinson, Frederick Seitz, Nir Shaviv, Fred Smith, Willie Soon, Thomas Sowell, Roy Spencer, Philip Stott, Hendrik Tennekes, Jan Veizer, Peter Walsh, Edward Wegman
Other sources
Daniel Abbasi, Augie Auer, Bert Bolin, Jonathan Boston, Daniel Botkin (useful response), Reid Bryson, Robert Carter (useful response), Ralph Chapman, Al Gore, Kirtland C. Griffin (useful response), David Henderson, Christopher Landsea (useful response), Bjorn Lomborg, Tim Osborn, Roger Pielke (useful response), Henrik Saxe, Thomas Schelling (useful response), Matthew Sobel, Nicholas Stern (useful response), Brian Valentine (useful response), Carl Wunsch (useful response), Antonio Zichichi.
This comment was getting a bit long, so I decided to just post relevant stuff from Armstrong and Green first and then offer my own thoughts in a follow-up comment.
We surveyed scientists involved in long-term climate forecasting and policy makers. Our primary concern was to identify the most important forecasts and how those forecasts were made. In particular, we wished to know if the most widely accepted forecasts of global average temperature were based on the opinions of experts or were derived using scientific forecasting methods. Given the findings of our review of reviews of climate forecasting and the conclusion from our Google search that many scientists are unaware of evidence-based findings related to forecasting methods, we expected that the forecasts would be based on the opinions of scientists. We sent a questionnaire to experts who had expressed diverse opinions on global warming. We generated lists of experts by identifying key people and asking them to identify others. (The lists are provided in Appendix A.) Most (70%) of the 240 experts on our lists were IPCC reviewers and authors. Our questionnaire asked the experts to provide references for what they regarded as the most credible source of long-term forecasts of mean global temperatures. We strove for simplicity to minimize resistance to our request. Even busy people should have time to send a few references, especially if they believe that it is important to evaluate the quality of the forecasts that may influence major decisions. We asked: “We want to know which forecasts people regard as the most credible and how those forecasts were derived… In your opinion, which scientific article is the source of the most credible forecasts of global average temperatures over the rest of this century?” We received useful responses from 51 of the 240 experts, 42 of whom provided references to what they regarded as credible sources of long-term forecasts of mean global temperatures. Interestingly, eight respondents provided references in support of their claims that no credible forecasts exist. Of the 42 expert respondents who were associated with global warming views, 30 referred us to the IPCC’s report. A list of
the papers that were suggested by respondents is provided at publicpolicyforecasting.com in the “Global Warming” section.
Unfortunately, the Forecasting Principles website seems to be a mess. Their Global Warming Audit page:
http://www.forecastingprinciples.com/index.php?option=com_content&view=article&id=78&Itemid=107
does link to a bibliography, but the link is broken (as is their global warming audit link, though the file is still on their website).
(This is another example where experts in one field ignore best practices—of maintaining working links to their writing—so the insularity critique applies to forecasting experts).
Continuing:
Based on the replies to our survey, it was clear that the IPCC’s Working Group 1 Report contained the forecasts that are viewed as most credible by the bulk of the climate forecasting community. These forecasts are contained in Chapter 10 of the Report and the models that are used to forecast climate are assessed in Chapter 8, “Climate Models and Their Evaluation” (Randall et al. 2007). Chapter 8 provided the most useful information on the forecasting process used by the IPCC to derive forecasts of mean global temperatures, so we audited that chapter.
We also posted calls on email lists and on the forecastingprinciples.com site asking for help from those who might have any knowledge about scientific climate forecasts. This yielded few responses, only one of which provided relevant references.
Trenberth (2007) and others have claimed that the IPCC does not provide forecasts but rather presents “scenarios” or “projections.” As best as we can tell, these terms are used by the IPCC authors to indicate that they provide “conditional forecasts.” Presumably the IPCC authors hope that readers, especially policy makers, will find at least one of their conditional forecast series plausible and will act as if it will come true if no action is taken. As it happens, the word “forecast” and its derivatives occurred 37 times, and “predict” and its derivatives occurred 90 times in the body of Chapter 8. Recall also that most of our respondents (29 of whom were IPCC authors or reviewers) nominated the IPCC report as the most credible source of forecasts (not “scenarios” or “projections”) of global average temperature. We conclude that the IPCC does provide forecasts.
In order to audit the forecasting processes described in Chapter 8 of the IPCC’s report, we each read it prior to any discussion. The chapter was, in our judgment, poorly written. The writing showed little concern for the target readership. It provided extensive detail on items that are of little interest in judging the merits of the forecasting process, provided references without describing what readers might find, and imposed an incredible burden on readers by providing 788 references. In addition, the Chapter reads in places like a sales brochure. In the three-page executive summary, the terms, “new” and “improved” and related derivatives appeared 17 times. Most significantly, the chapter omitted key details on the assumptions and the forecasting process that were used. If the authors used a formal structured procedure to assess the forecasting processes, this was not evident.
[...]
Reliability is an issue with rating tasks. For that reason, it is desirable to use two or more raters. We sent out general calls for experts to use the Forecasting Audit Software to conduct their own audits and we also asked a few individuals to do so. At the time of writing, none have done so.