I don’t believe it is true that ‘temperatures have been basically flat’ for the last 15 years: I see a net gain of 0.1 to 0.2 Kelvin, depending on the data set (HadCRUT3 v. GISTEMP v. UAH v. RSS).
So the standard is “net gain,” and a net gain greater than (or less than) 0.1 Kelvin means not basically flat?
And it looks to me like temperatures have only been ‘flat’ for the last 10 years in the sense that a short enough snippet of a noisy time series will always look ‘flat.’
That may be true, but so what? characterization of evidence != interpretation of evidence. Agreed?
I think the house price boom & crash is a little too big to characterize like that.
Why not? It’s a short term detour in a larger overall trend. If you happened to buy a house at the top of the market, there is still an excellent chance that some day the market price will exceed your purchase price.
So the standard is “net gain,” and a net gain greater than (or less than) 0.1 Kelvin means not basically flat?
Any net gain (or net loss), however small, means not flat, if you are confident enough that it’s not an artefact or noise. (Adding the adverb ‘basically’ muddies things a bit, because it implies that you’re not interested in small deviations from flatness.) So: am I quite confident that there has been a deviation from flatness since 1995, and that the deviation is neither artefact nor noise? Yes. But you knew that already, so I’ll go deeper.
You earlier referred to the Phil Jones interview where he stated that the warming since 1995 is ‘only just’ statistically insignificant. I don’t know enough about testing autocorrelated time series to check that, but I’m willing to pretty much trust him on this point.
OK, so every so often on Less Wrong you see a snippet of Jaynes or a popular science article presented in the context of a frequentism vs. Bayesianism comparison. I’ve gone to bat before (see that first link’s discussion) to explain why pitting the two against each other seems wrongheaded to me. I’ve yet to see an example where frequentist methods necessarily have to give a different result to Bayesian methods, just by virtue of being frequentist rather than Bayesian. I see the two as two sides of the same coin.
Still, there are certain techniques that are more associated with the frequentist school than the Bayesian. One of them is statistical significance testing. That particular technique gets a lot of heat from statisticians of all sorts (not just Bayesians!), and arguably rightly so. People are liable to equate statistical significance with practical significance, which is simply wrong, and to dogmatically reject any null hypothesis that doesn’t clear a particular p-value bar. On this point, I have to agree with the critics. As far as I can tell, there are too many people who fundamentally misunderstand significance tests, and as someone who does understand them (or I think I do—maybe that’s just the Dunning-Kruger effect talking) and finds them useful, that disappoints me.
In the end, you have to exercise judgment in interpreting significance tests, like any other tool. Just because a test limps over the magic significance level with a p-value of 0.049 doesn’t mean you should immediately shitcan your null hypothesis, and just because your test falls a hair short with an 0.051 p-value doesn’t mean there’s nothing there.
To get more specific: that the net warming since 1995 has been only ‘just’ statistically insignificant does not mean there was no warming. It means that under a particular model, the null hypothesis of no overall trend cannot be rejected. It could be because there really is no trend. Or there might be a true trend, but your data are too noisy and too few. Or the test could be cherry-picked. You have to exercise judgment and decide which is most likely. I believe the last two possibilities are most likely: I can see the noise with my own eyes, and apparently 1995 is the earliest year where warming since that year is statistically insignificant, which would be consistent with cherry-picking the year 1995.
Which is why I reject the null hypothesis of no net temperature change since 1995, even though the p-value of Phil Jones’ test is presumably a bit higher than 0.05.
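For concreteness, here is a minimal R sketch of the kind of start-year scan that would reveal that sort of cherry-picking: fit a trend from each candidate start year and see where the slope’s p-value first crosses 0.05. The file and column names are hypothetical, and a bare lm() fit ignores the autocorrelation a proper test would account for, so this is illustrative only.

```r
# Hypothetical annual series: columns 'year' and 'anom' (temperature anomaly, K).
temps <- read.table("annual_anomalies.txt", header = TRUE)

# Fit a linear trend from each candidate start year through the end of the
# record, recording the p-value on the slope.
p_vals <- sapply(1985:2005, function(y0) {
  fit <- lm(anom ~ year, data = subset(temps, year >= y0))
  summary(fit)$coefficients["year", "Pr(>|t|)"]
})
names(p_vals) <- 1985:2005

# Earliest start year whose trend fails significance at the 5% level.
names(p_vals)[p_vals > 0.05][1]
```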
That may be true, but so what? characterization of evidence != interpretation of evidence. Agreed?
They are distinct concepts.
I get the feeling that you think calling the last decade of temperatures ‘flat’ is characterization and not interpretation, and I would disagree. When I say temperatures have risen overall, that’s an interpretation. When you say they have not, that’s an interpretation. Either interpretation is defensible, though I believe mine is more accurate (but of course I would believe that).
Why not? It’s a short term detour in a larger overall trend.
Right, but if you compare the housing price detour to the noise in the house price data, it’s relatively way, way bigger than the El Niño deviation compared to the noise in the temperature data.
I pulled the temperature data behind this plot and regressed temperature on year. Then I calculated the standard deviation of the residuals from the start of the time series up to 1998 (when the El Niño kicked in). The peak in the data (at ‘year’ 1998.08, with a value of 0.6595 degrees) is then 3.9 sigmas above the regression line.
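For anyone who wants to reproduce that arithmetic, here is a minimal sketch of the computation in R (the file and column names are hypothetical placeholders for the data behind the plot):

```r
# Hypothetical monthly series: columns 'year' (fractional) and 'temp' (degrees).
d <- read.table("temperature_series.txt", header = TRUE)

fit <- lm(temp ~ year, data = d)                # linear trend over the whole series
pre_en_sd <- sd(residuals(fit)[d$year < 1998])  # residual scatter before the El Niño

# Height of the 1998 peak above the regression line, in sigmas.
peak <- d[which.max(d$temp), ]
(peak$temp - predict(fit, newdata = peak)) / pre_en_sd
```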
Look back at the home price graph—maybe that particular graph’s been massively smoothed, but the post-peak drop looks like way more than a 4 sigma decline: I’d eyeball it as on the order of 10-20 sigmas—and even that is probably an underestimate, because the standard deviation is going to be inflated by what looks like a seasonal fluctuation (the yearly-looking spikes). The El Niño is big and bold, no doubt about it, but it’s a puppy compared to the housing price crash.
Adding the adverb ‘basically’ muddies things a bit, because it implies that you’re not interested in small deviations from flatness.
Of course it muddies things and we should not be interested in small deviations. That’s the basic point of your argument. The only question is how small is small.
When I say temperatures have risen overall, that’s an interpretation. When you say they have not, that’s an interpretation
Well, can you give me an example of a statement about temperature in the last 10 years which is not an “interpretation”?
Right, but if you compare the housing price detour to the noise in the house price data, it’s relatively way, way bigger than the El Niño deviation compared to the noise in the temperature data.
The El Niño is big and bold, no doubt about it, but it’s a puppy compared to the housing price crash.
So what? In 1998, would it have been wrong to say that global surface temperatures had risen (relatively) rapidly over the previous few years?
Of course it muddies things and we should not be interested in small deviations. That’s the basic point of your argument.
?!
The point I was making in the first 550 words of the grandparent comment is that one shouldn’t automatically disregard a small deviation from flatness merely because it’s (barely) statistically insignificant. I am not sure how you interpreted it to mean that ‘we should not be interested in small deviations.’
Well, can you give me an example of a statement about temperature in the last 10 years which is not an “interpretation”?
A statement that’s a few written words or sentences? I doubt it. Trying to summarize a complicated time series in a few words is inevitably going to mean not mentioning some features of the time series, and your editorial judgment of which features not to mention means you’re interpreting it.
So what?
You should know, you asked me ‘Why not?’ in the first place.
In 1998, would it have been wrong to say that global surface temperatures had risen (relatively) rapidly over the previous few years?
Practically, yes, because that claim carries the implication that the El Niño spike is representative of the warming ‘over the previous few years.’
So the standard is “net gain,” and a net gain greater than (or less than) 0.1 Kelvin means not basically flat?
My linear regressions based on NOAA data (I was stupid and lost the citation for where I downloaded it) have 0.005-0.007 K/year since 1880; 0.1 to 0.2 K in a decade is beating the trend.
What are the uncertainties on each of these?
I took the liberty of downloading the GISTEMP data, which I suspect are very similar to the NOAA data (because the GISTEMP series also starts at 1880, and I dimly remember reading somewhere that the GISS gets land-based temperature data from the NOAA). Regressing anomaly on year, I get an 0.00577 K/year increase since 1880, consistent with Robin’s estimate. R tells me the standard error on that estimate is 0.00011 K/year.
However, that standard error estimate should be taken with a pinch of salt for two reasons: the regression’s residuals are correlated, and it is unlikely that a linear model is wholly appropriate because global warming was reduced mid-century by sulphate emissions. Caveat calculator!
(ETA: I just noticed you wrote ‘these,’ so I thought you might be interested in the trend for the past decade as well. Regressing anomaly on year for the past 120 monthly GISTEMP temperature anomalies gives a trend of 0.0167 ± 0.0023 K/year, but the same warning about that standard error applies.)
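For concreteness, the regressions above amount to something like the following R sketch (the file and column names are hypothetical, and the raw GISTEMP file needs some reshaping before it looks like this). The last lines gesture at one rough way to allow for the correlated residuals mentioned above:

```r
# Hypothetical tidied GISTEMP series: monthly rows with fractional 'year'
# and 'anom' (temperature anomaly, K).
g <- read.table("gistemp_monthly.txt", header = TRUE)

# Trend since 1880: slope estimate and its (naive) standard error.
summary(lm(anom ~ year, data = g))$coefficients["year", 1:2]

# The same for the most recent 120 months.
recent <- tail(g, 120)
summary(lm(anom ~ year, data = recent))$coefficients["year", 1:2]

# The naive standard errors assume independent residuals; generalized least
# squares with an AR(1) error structure is one rough correction.
library(nlme)
summary(gls(anom ~ year, data = recent, correlation = corAR1()))$tTable
```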
I have no idea. Varying the starting point from ten to thirty years ago with Feb 2010 as the endpoint puts the slope anywhere in the range [-0.0001, 0.2], so the uncertainty must be fairly large on the scale of a decade.
Your regression package doesn’t report uncertainties? (Ideally this would be in the form of a covariance matrix.)
My regression package is a tab-delimited data file, a copy of MATLAB, and least-squares.
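For what it’s worth, that setup already has everything needed for uncertainties: with ordinary least squares the parameter covariance matrix is s^2 (X'X)^(-1), where s^2 is the residual variance. A minimal sketch in R (the same linear algebra carries straight over to MATLAB):

```r
# Ordinary least squares by hand, returning the coefficients and their
# covariance matrix cov(beta) = s^2 * (X'X)^(-1).
ols_with_cov <- function(X, y) {
  XtX  <- t(X) %*% X
  beta <- solve(XtX, t(X) %*% y)              # least-squares coefficients
  r    <- y - X %*% beta                      # residuals
  s2   <- sum(r^2) / (length(y) - ncol(X))    # residual variance
  list(beta = beta, cov = s2 * solve(XtX))
}

# Standard errors are the square roots of the covariance diagonal, e.g.
#   X <- cbind(1, year)   # intercept column plus the years
#   sqrt(diag(ols_with_cov(X, anom)$cov))
```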