moyhu: GWPF's "statistical forecast"

Tuesday, March 1, 2016

GWPF's "statistical forecast"

There was a flurry of articles in the usual Press places about a statistical forecast of future temperatures sponsored by the GWPF (Murdoch, Express, Daily Mail). Blog reaction tended to focus on the report's total disregard of physics, or, more frequently, on the observation that:
"The GWPF paid Terence Mills, professor of applied statistics at Loughborough University, £3,000 to write the report."

Well, those are cogent criticisms. But time series analysis is a respectable enough endeavour, so I read the report to see if the GWPF got its money's worth.

The answer is no, or maybe from their point of view, yes. There are about 33 quite well-written pages, borrowing I suspect from lecture notes (nothing wrong with that). But the forecasts themselves are worthless. Gavin Schmidt tweeted the following test of the HADCRUT forecasts, which start from 2014, after just one more year of data:

The thin blue line is 2015 information added by Gavin, and as you see, it is already outside the confidence intervals.

Now it might be said that briefly going beyond the CIs is not a complete refutation, though it very likely won't be brief. The real reason that the forecasts are worthless is that in every case they simply foretell that the expectation for the future is no change at all. The forecast is just constant after a few months, based on a very recent weighted average of data points. It has to be - that is built into the models. In fact, the author actually says this:
"The central aim of this report is to emphasise that, while statistical forecasting appears highly applicable to climate data, the choice of which stochastic model to fit to an observed time series largely determines the properties of forecasts of future observations and of measures of the associated forecast uncertainty, particularly as the forecast horizon increases."
It would be good if he had emphasised that more; it's true (my bold). But in fact what is emphasised is the forecast. And in the press, the slow growth of the forecast. But, as said, that is built in. His models are simply incapable of forecasting change. I'll show why.

You can see from the time series plot of HADCRUT 4:

that especially over the last fifty years, forecasting on the basis of nothing changing would have been very unsuccessful. And there is no reason to expect it to be better in the future. So what gives?

Mills does two HADCRUT forecasts, as shown in Gavin's annotated plot above. One is a fit to the whole period, and one is a piecewise fit, in segments of which the last starts in 2002 (diagram below). For the whole period fit, he chooses an ARIMA(0,1,3) model:

The a_t terms are iid random variables with the stated σ. I see no benefit in the extra complexity, but that is what he did. Now the key is the words "Omitting this from the model...". The constant term (drift), 0.0005 °C per month, is the term that provides the trend. It is about 0.6 °C/Century (0.0005 × 1200 months) - about what most people find for the trend since 1850. And if you use just about any linear regression model, it will say that the trend is highly significant. I got a t-value, with AR(1), of 51. Yet he says it is not significant, and sets it to zero. And it is that setting to zero which guarantees a zero trend of the prediction. If he had left it at 0.6 °C/Century, that would have been the predicted future trend. He could have set it to 1.2; that is at least as likely as zero.
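To make the role of the drift term concrete, here is a minimal Python sketch of point forecasts from an ARIMA(0,1,3) model. It is an illustration, not Mills' code. The MA coefficients are rounded from the gretl fit quoted in the comments; the function name `ma_diff_forecast`, the last observed level and the last three innovations are made-up values chosen only to show the shape of the forecast; the monthly time step is assumed.

```python
import numpy as np

def ma_diff_forecast(last_level, last_innovations, theta, drift, horizon):
    """Point forecasts for x_t when
        x_t - x_{t-1} = drift + a_t + theta_1 a_{t-1} + ... + theta_q a_{t-q}.

    last_innovations holds [a_T, a_{T-1}, ...], most recent first.
    """
    q = len(theta)
    level = last_level
    out = []
    for h in range(1, horizon + 1):
        # Future innovations have expectation zero, so only the already
        # observed a_T, a_{T-1}, ... contribute, and only while h <= q.
        step = drift
        for j in range(h, q + 1):
            step += theta[j - 1] * last_innovations[j - h]
        level += step
        out.append(level)
    return np.array(out)

theta = [-0.518, -0.080, -0.120]      # rounded from the gretl fit in the comments
last_level = 0.5                      # hypothetical last anomaly, deg C
last_innovations = [0.1, -0.05, 0.02] # hypothetical recent innovations

flat = ma_diff_forecast(last_level, last_innovations, theta, 0.0, 72)
with_drift = ma_diff_forecast(last_level, last_innovations, theta, 0.0005, 72)

# Convert the monthly drift to deg C per century.
trend_per_century = 0.0005 * 12 * 100

print("no drift, months 1-6:  ", np.round(flat[:6], 4))
print("with drift, months 1-6:", np.round(with_drift[:6], 4))
print("drift in deg C/century:", round(trend_per_century, 3))
```

With the drift set to zero the forecast is constant from the third month onward, whatever the recent innovations were. With 0.0005 retained, the same forecast rises steadily at 0.6 °C/Century.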

So why does he get so little significance? The reason is that he differences, and then allows a different distribution (ARIMA) for the residuals. When summed again, the errors are now modelled as a random walk (with ARIMA modelled steps), as he explains. And it is far easier for a random walk to emulate a trend. His model therefore has far less ability to discriminate trend value.
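A small simulation may help show how easily a random walk emulates a trend. This is a sketch under assumed settings (1000 synthetic series of 200 steps with unit-variance Gaussian steps, and a hypothetical helper `ols_trend_t`), not a reanalysis of HADCRUT. It fits a straight line to series with no trend at all and counts how often the naive OLS t-value exceeds 2.

```python
import numpy as np

def ols_trend_t(y):
    """Naive OLS t-statistic for the slope of y against time (iid errors assumed)."""
    n = len(y)
    t = np.arange(n, dtype=float)
    tc = t - t.mean()
    slope = np.dot(tc, y) / np.dot(tc, tc)
    resid = y - y.mean() - slope * tc
    se = np.sqrt(np.dot(resid, resid) / (n - 2) / np.dot(tc, tc))
    return slope / se

rng = np.random.default_rng(0)
n_series, n_steps = 1000, 200
steps = rng.normal(size=(n_series, n_steps))

white_noise = steps                       # no trend, no memory
random_walks = np.cumsum(steps, axis=1)   # no drift, but the noise is summed

frac_noise = np.mean([abs(ols_trend_t(y)) > 2 for y in white_noise])
frac_walk = np.mean([abs(ols_trend_t(y)) > 2 for y in random_walks])

print("fraction with |t| > 2, white noise:", frac_noise)
print("fraction with |t| > 2, random walk:", frac_walk)
```

For white noise the fraction stays near the nominal 5%, while for driftless random walks most series show an apparently significant trend. So once the noise is allowed to be a random walk, an observed trend is weak evidence of drift. That explains the low significance, but it says nothing about whether the random walk assumption is reasonable.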

There are two things badly wrong with this:

A random walk is unbounded, and is unphysical. Over time, temperatures can go anywhere, and will with eventual probability 1. They will surely go below 0 K, and above 100 °C. That hasn't happened. Now some may say, well, maybe the random walk is only for a finite period. But that is no use for prediction, if you don't know when the period is. And anyway it is highly implausible that the mechanism would suddenly change.
A random walk is a solution to a stochastic differential equation. The future is determined principally by the current state. But there is no reason to think that random fluctuations of weather would behave that way. A heat wave doesn't establish a new base point. We know that there is physics that brings it back to equilibrium (e.g. Stefan-Boltzmann radiation).
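The contrast between the two kinds of noise can be sketched numerically. The following Python snippet uses assumed values (step standard deviation 0.1, AR(1) coefficient 0.9, 2000 simulated paths) that are not taken from the report. It compares the spread of a random walk with that of a mean-reverting AR(1) process driven by the same shocks.

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_steps = 2000, 1000
sigma, phi = 0.1, 0.9          # assumed step size and AR(1) coefficient
shocks = rng.normal(scale=sigma, size=(n_paths, n_steps))

# Random walk: every shock permanently shifts the base point.
walk = np.cumsum(shocks, axis=1)

# AR(1): each shock decays, so the series is pulled back towards zero.
ar1 = np.zeros_like(shocks)
ar1[:, 0] = shocks[:, 0]
for t in range(1, n_steps):
    ar1[:, t] = phi * ar1[:, t - 1] + shocks[:, t]

# Spread across paths after 100 and after 1000 steps.
walk_ratio = walk[:, 999].std() / walk[:, 99].std()
ar1_ratio = ar1[:, 999].std() / ar1[:, 99].std()

print("random walk spread ratio (1000 vs 100 steps):", round(walk_ratio, 2))
print("AR(1) spread ratio (1000 vs 100 steps):      ", round(ar1_ratio, 2))
```

The spread of the random walk keeps growing, roughly as the square root of the number of steps, so given enough time it exceeds any bound. The AR(1) spread settles at a fixed value, which is the behaviour you would expect when physics restores equilibrium.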

But what is then very bad is that, by omitting the constant, he treats it as zero with no uncertainty. So it goes into the prediction as certainly zero, even though we know it was positive.

Also bad is that he uses one model for the whole range to predict from 2014. But clearly things have changed since 1850. To remedy this, perhaps, Mills also offers a segmented model:

This time he uses an autoregressive model, AR(4), for the residuals of a linear regression. So at least the residuals aren't modelled as a random walk. He uses the final segment 2002-2014 for the prediction, and says of it that the slope is 0.8 °C/Century, and the t-value relative to 0 is 1.26. So again, on this basis he omits the trend term. This again has the effect of treating it as certainly zero. But because of the short interval, the trend was both positive (best estimate) and very uncertain. If he had made it the observed 0.8, this slope would have carried through to the prediction, and within the 95% limits it could have been as high as 2.0 °C/Century, which is about what GCMs predict for the next few decades.
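The 95% limits quoted here follow from the slope and t-value alone. A quick check in Python, assuming a normal approximation (multiplier 1.96) rather than whatever exact distribution the report used:

```python
slope = 0.8        # deg C/century, final segment 2002-2014
t_value = 1.26     # reported t-value relative to zero

std_err = slope / t_value          # implied standard error of the slope
z95 = 1.96                         # normal approximation, an assumption
lower = slope - z95 * std_err
upper = slope + z95 * std_err

print(f"standard error: {std_err:.3f} deg C/century")
print(f"95% interval:   {lower:.2f} to {upper:.2f} deg C/century")
```

The interval runs from about -0.4 to about 2.0 °C/Century. It includes zero, but zero is nowhere near its centre, and it extends just as far above 0.8 as below it.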

Conclusion

I come back to what he said:
"the choice of which stochastic model to fit to an observed time series largely determines the properties of forecasts"
It does. And the trendless future is entirely determined by his decision to take the observed positive slopes and replace them with zero, with no uncertainty. This arbitrary and baseless decision is the entire basis for his no-growth forecasts. The forecast method itself is primitive - it simply projects to the future from a weighted average of a few of the most recent data points.

Update. I've been able to emulate the HADCRUT forecast process. The ARIMA forecasts, with no "omissions", are virtually identical to what you would get by projecting a straight line OLS regression slope over the same interval, shifted to start from some smoothed value of the endpoint. The only "improvement" here is that that linear extrapolation has its slope set arbitrarily to zero. I hope to blog the calculations tomorrow.

37 comments:

Windchasers - March 1, 2016 at 12:09 PM
TL;DR: "If we remove the past trend from our statistical analysis, the resulting forecast is flat!"

It doesn't add up... - March 1, 2016 at 1:54 PM
I see you are no longer making some of the accusations you levelled at Prof Mills at WUWT. However, you completely mistake the purpose of his work, which is to explore the statistical behaviour of the HADCRUT and RSS data over their respective histories (up to end 2014). There is no attempt at a physical model at all - any more than there is in oil blending equations that have been developed empirically - e.g. http://www.lube-media.com/documents/contribute/Lube-Tech093-ViscosityBlendingEquations.pdf - any more than your linear model is based on anything physical. Indeed, back projecting your linear model at 0.6 degrees per century implies temperatures below absolute zero less than 50,000 years ago.

Using the gretl package, I find a linear model on the data from Woodfortrees for HADCRUT4

Model 10: OLS, using observations 1850:01-2014:12 (T = 1980)
Dependent variable: HADCRUT4

             coefficient     std. error    t-ratio   p-value
  -----------------------------------------------------------
  const      −0.507785       0.00891402    −56.96    0.0000   ***
  time        0.000399381    7.79480e-06    51.24    0.0000   ***

Mean dependent var   −0.112198     S.D. dependent var    0.302357
Sum squared resid     77.74100     S.E. of regression    0.198249
R-squared             0.570300     Adjusted R-squared    0.570083
F(1, 1978)            2625.213     P-value(F)            0.000000
Log-likelihood        395.5964     Akaike criterion     −787.1928
Schwarz criterion    −776.0111     Hannan-Quinn         −783.0850
rho                   0.753234     Durbin-Watson         0.493833

where the time increment is 1 per month - it suffers from serious levels of autocorrelation, as indicated by the DW value, and the standard error is over half the interquartile range of the data. The high t values for the regression coefficients are thus spurious.

For comparison, the ARIMA(0,1,3) model produces

Model 7: ARIMA, using observations 1850:02-2014:12 (T = 1979)
Estimated using Kalman filter (exact ML)
Dependent variable: (1-L) HADCRUT4
Standard errors based on Hessian

             coefficient     std. error     z         p-value
  --------------------------------------------------------------
  const       0.000534454    0.000783477     0.6822   0.4951
  theta_1    −0.517842       0.0225032     −23.01     3.54e-117  ***
  theta_2    −0.0802394      0.0259927      −3.087    0.0020     ***
  theta_3    −0.119860       0.0232144      −5.163    2.43e-07   ***

Mean dependent var    0.000673     S.D. dependent var    0.139316
Mean of innovations   0.000238     S.D. of innovations   0.123333

Covariance matrix of regression coefficients:

        const       theta_1       theta_2       theta_3
  6.13837e-07   5.53465e-08   2.39223e-08   1.18355e-07   const
                5.06394e-04  -2.86922e-04  -1.95562e-05   theta_1
                              6.75618e-04  -2.29751e-04   theta_2
                                            5.38909e-04   theta_3

Removing the constant produces

Model 12: ARIMA, using observations 1850:02-2014:12 (T = 1979)
Estimated using Kalman filter (exact ML)
Dependent variable: (1-L) HADCRUT4
Standard errors based on Hessian

             coefficient     std. error     z         p-value
  --------------------------------------------------------------
  theta_1    −0.517562       0.0224786     −23.02     2.64e-117  ***
  theta_2    −0.0800007      0.0258845      −3.091    0.0020     ***
  theta_3    −0.119490       0.0231740      −5.156    2.52e-07   ***

Mean dependent var    0.000673     S.D. dependent var    0.139316
Mean of innovations   0.002123     S.D. of innovations   0.123348

These models both account better for the variations in the data, and are thus superior descriptors of it. The exclusion of the constant trend term is not to be decried - it is an act of parsimony which has no substantive effect on the model error. Many climate models omit variables that have a clear physical basis for influencing climate because there are difficulties in calibrating them with the rest of the model framework: they add nothing to the framework.

It doesn't add up... - March 1, 2016 at 1:56 PM
Similar considerations apply to the model broken into sub periods. Of course, most of the sub periods have significant trend constants - only the most recent period shows no trend. Similar conclusions can be reached by examining an 8 year centred moving average.

The forecast periods presented are modest - only out to 2020, and presume no change in behaviour. However, the segmented analysis demonstrates that there are turning points. The fact that we already have a couple of subsequent data points that lie outside the band within which 95% of observations are expected to lie is not a useful criticism of the forecast - at least until there is much more data in that region. After all, the linear model has rather more points outside its confidence bands already.

Ned W - March 1, 2016 at 10:45 PM
I have to express a certain grudging admiration for IDAU's willingness to vigorously defend a "forecast" that is simply laughable.

@whut - March 2, 2016 at 12:26 PM
The more I research these climate indices, the more
I think they can be analyzed as deterministic rather
than statistical time-series. That would be a great
advance for no other reason than to get the
uncertainty monkey off our backs.

I have been harping on the QBO index, which is a
highly deterministic signal, in spite of what
has been written on it.

http://imageshack.com/a/img922/2119/ea0m2v.gif

Lindzen must have studied the QBO for 50 years
without finding the synchronous forcing mechanism
that drives it. The fact that he spent that
long should afford us to spend a few months talking
it up and getting others excited about it. And that
should promote interest in ENSO which I am also
convinced has a similar synchronous forcing
mechanism, albeit obscured by other noisy factors.

If that happens, these forecasts will become much
more solid, and everything will add up .. much
to the chagrin of those that don't want it to add up.


Keri - March 2, 2016 at 6:39 PM
Incredible forecast. It looks like an extrapolation from a flat trend... This is nonsense. Here is a forecast from Met Office : http://www.metoffice.gov.uk/news/releases/archive/2016/decadal-forecast.

It doesn't add up... - March 3, 2016 at 2:00 AM
ehak:

Isn't it a little early to tell? The MO forecast was only published on 1 February, and benefits from using a further year of data, and in any case is only projecting moving annual average temperature anomalies.

Carl Greeff - March 3, 2016 at 4:48 AM
Arguing the fine points of statistics with someone whose personal Bayesian prior is independent of whether energy is conserved is unlikely to be fruitful.

Layzej - March 3, 2016 at 7:04 AM
Mills wouldn't put money on his forecast. Would you? http://julesandjames.blogspot.com/2016/02/no-terence-mills-does-not-believe-his.html

@whut - March 3, 2016 at 12:38 PM
Wagering on your science or promising prize money to refute it is a tell-tale sign of a crackpot.

One should occasionally revisit John Carlos Baez's Crackpot Index,
especially if you are going to crow about some achievement:
http://math.ucr.edu/home/baez/crackpot.html

I think James Annan made a mistake in offering a bet.
