## Monday, September 9, 2013

### More on global temperature spectra and trends.

In my previous post, I looked at ARIMA models and their effect on uncertainty of temperature trend, using autocorrelation functions. I was somewhat surprised by the strength of apparent periodicity with a period of 3-4 years.

It is plausible to connect this with ENSO, so in this post I want to probe a bit with FFT's, removing ENSO and other forcings in the spirit of Foster and Rahmstorf. But first I'd like to say something about stat significance and what we should be looking for.

Statistical significance has to do with sampling: if you derive a statistic from a sample of a population, how much would it vary if you re-sampled? How this applies to time series statistics isn't always clear.

If you measure the average weight of people on a boat, and you are interested in whether that boat will float, then you don't need statistical significance. You only need to measure accurately. But if you want to make inferences about a class of people on boats, then you do. And you need to define your population carefully, and decide whether to worry about variables like age and sex.

A trend is just a statistic; it is a measure of growth. You can quote a trend (for a fixed period) for something that is growing exponentially, not of course expecting it to be constant over other times. A trend is a weighted average, and the same issue of statistical significance applies. You can quote a trend for 20th century temperature anomaly without worrying about its properties as a sample, or that it was not very linear. But if you want to make inferences about climate, you do need to treat it as a sample, and form some model of randomness. That involves the study of residuals, or deviation from linearity, and so apparently non-random deviation from linearity also becomes important, because you have to distinguish the two.
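The weighted-average view can be made concrete: the OLS slope is exactly a weighted average of the data, with weights proportional to the centred time values. A minimal numpy sketch with invented numbers (not any of the temperature series discussed here):

```python
import numpy as np

# The OLS trend is a weighted average of the data: beta = sum(w_i * y_i),
# with weights proportional to (t_i - mean(t)).
rng = np.random.default_rng(4)
t = np.arange(120) / 12.0                 # ten years of monthly time points
y = 0.017 * t + 0.1 * rng.standard_normal(120)   # made-up trend plus noise

w = (t - t.mean()) / np.sum((t - t.mean()) ** 2)
beta_weighted = w @ y                     # trend as a weighted average
beta_polyfit = np.polyfit(t, y, 1)[0]     # identical to the usual fit
```

The two slopes agree to rounding; the weighted-average form makes clear why trend inherits the sampling issues of any other average.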

The autocorrelation functions of the previous post illustrate (and help to solve) the issue. Short term lags form the basis of the stochastic models we use (ARIMA). But once we get beyond lag ten or so, there is behaviour that doesn't look like those models. Instead it is due to periodicities, and quadratic terms etc.

Now in that case, the stochastic and secular lag regions were not well separated, and it is likely that the secular effects are messing with the ARIMA models.

So, coming back to what "sampling" means here. It's a notion of: what if we could rerun the 20th century? Of course, with GCM's we can. But would the secular effects repeat too? ENSO is a case in point. It has aspects that can be characterised - periods etc. But they aren't exactly predictable. Phase certainly isn't. Should they have some kind of stochastic representation?

We can defer that by using what we know of that instance of ENSO forcing (and volcanic and solar) to try to focus more on the ARIMA part of the acf's. I'll do that here by looking at the Foster and Rahmstorf residuals.

#### Power spectra

But first I'll revisit the previous analysis and its apparent periodicity. Here are the acf's:

And here are the corresponding power spectra, which are the discrete Fourier transforms of the acf's:

You can see the prominent peak at about 0.22 yr⁻¹ (45 months), which is likely ENSO related.
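As a sketch of the connection between the acf's and the spectra (not the script behind the figures), here is how the spectrum might be computed as the DFT of the acf, with a synthetic 45-month cycle standing in for the temperature data:

```python
import numpy as np

def acf(x, nlags):
    """Sample autocorrelation function out to nlags."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    c = np.correlate(x, x, mode="full")[len(x) - 1:]   # one-sided lags
    return c[:nlags + 1] / c[0]

# synthetic monthly series: a 45-month cycle plus white noise
rng = np.random.default_rng(0)
t = np.arange(12 * 33)                   # 33 years of monthly values
x = np.sin(2 * np.pi * t / 45) + 0.5 * rng.standard_normal(len(t))

r = acf(x, 200)
# power spectrum as the discrete Fourier transform of the acf
# (Wiener-Khinchin); frequencies in cycles per year for monthly data
p = np.abs(np.fft.rfft(r))
f = np.fft.rfftfreq(len(r), d=1 / 12)
peak = f[np.argmax(p[1:]) + 1]           # skip the zero-frequency bin
```

The recovered peak lands near 12/45 ≈ 0.27 cycles per year, the frequency of the planted oscillation.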

#### Removal of forcings

Foster and Rahmstorf (2011) used known forcings (ENSO, volcanic and solar) to regress against temperature, and showed that the residuals seemed to have a much more even trend. I did my own blog version here. But for this exercise I used the F&R residuals described by Tamino.

First, here are the power spectra. The marked frequency at about 0.22 yr⁻¹ (or its harmonic at 0.44 yr⁻¹) is still often dominant, but its magnitude is reduced considerably:

And here are the acf's. The critical thing is that the oscillation near zero is reduced, and intrudes less on the peak at zero, which is smaller, and should be freer of secular influences.

Here are the acf's with models fitted. It is no longer the case that ARMA(1,1) is clearly better than AR(1), and both taper close to zero before the acf goes negative. For the satellite indices, there is little difference between the models. In all cases, the central peaks of the acf's are much less spread, indicating much tighter CIs.
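For reference, the theoretical acfs implied by the fitted coefficients can be written down directly. A small numpy sketch using the HADCRUT4 values from the table of results; the heavier tail of the ARMA(1,1) acf relative to AR(1) is what spreads the central peak:

```python
import numpy as np

def ar1_acf(rho, nlags):
    """Theoretical acf of an AR(1) process: geometric decay from lag 0."""
    return rho ** np.arange(nlags + 1)

def arma11_acf(rho, theta, nlags):
    """Theoretical acf of ARMA(1,1), AR coefficient rho, MA coefficient theta."""
    r = np.empty(nlags + 1)
    r[0] = 1.0
    r[1] = (1 + rho * theta) * (rho + theta) / (1 + 2 * rho * theta + theta ** 2)
    for k in range(2, nlags + 1):
        r[k] = rho * r[k - 1]            # geometric decay beyond lag 1
    return r

# HADCRUT4 fitted coefficients from the table of results
ar = ar1_acf(0.4802, 12)
arma = arma11_acf(0.6901, -0.2744, 12)
# the ARMA(1,1) acf starts lower at lag 1 than its AR coefficient,
# but its tail dies away more slowly than the fitted AR(1)
```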

Finally, here is the table of model fit results:

| Result | HADCRUT4 | GISS | NOAA | UAH | MSU-RSS |
|---|---|---|---|---|---|
| OLS trend β | 1.7020 | 1.7092 | 1.7500 | 1.4086 | 1.5676 |
| OLS s.e. β | 0.0463 | 0.0651 | 0.0514 | 0.0665 | 0.0641 |
| AR(1) trend β | 1.6984 | 1.7084 | 1.7449 | 1.4161 | 1.5678 |
| AR(1) s.e. β | 0.0785 | 0.0936 | 0.0765 | 0.1101 | 0.0978 |
| AR(1) Quen s.e. | 0.0781 | 0.0933 | 0.0762 | 0.1096 | 0.0974 |
| AR(1) ρ | 0.4802 | 0.3445 | 0.3735 | 0.4613 | 0.3948 |
| ARMA(1,1) trend β | 1.6953 | 1.7025 | 1.7423 | 1.4188 | 1.5687 |
| ARMA(1,1) s.e. β | 0.0946 | 0.1132 | 0.0936 | 0.1189 | 0.1027 |
| ARMA(1,1) Quen s.e. | 0.0940 | 0.1123 | 0.0930 | 0.1182 | 0.1022 |
| ARMA(1,1) ρ | 0.6901 | 0.6433 | 0.6672 | 0.5696 | 0.4849 |
| ARMA(1,1) MA coef | -0.2744 | -0.3378 | -0.3447 | -0.1353 | -0.1058 |

As Foster and Rahmstorf found, the trends are not only quite large, but have much reduced standard errors, whatever model is chosen.
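As a check on the table, the AR(1) "Quen s.e." values are well approximated by inflating the OLS standard errors by the familiar factor √((1+ρ)/(1−ρ)). This sketch just reproduces that arithmetic from the tabulated numbers:

```python
import numpy as np

def ar1_inflation(rho):
    """Standard-error inflation factor for a trend under AR(1) residuals."""
    return np.sqrt((1 + rho) / (1 - rho))

# OLS s.e. of the trend and fitted AR(1) rho, from the table above
ols_se = {"HADCRUT4": 0.0463, "GISS": 0.0651, "NOAA": 0.0514,
          "UAH": 0.0665, "MSU-RSS": 0.0641}
rho = {"HADCRUT4": 0.4802, "GISS": 0.3445, "NOAA": 0.3735,
       "UAH": 0.4613, "MSU-RSS": 0.3948}

adjusted = {k: ols_se[k] * ar1_inflation(rho[k]) for k in ols_se}
# e.g. HADCRUT4: 0.0463 * 1.69 ~ 0.078, close to the tabulated 0.0781
```

All five inflated values land within about 0.0001 of the corresponding "AR(1) Quen s.e." row.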

#### Conclusion

Subtracting out the regression contributions of the forcings (ENSO, volcanic and solar) does reduce the long-lag oscillatory contributions to the acf, though they are not eliminated. This enables a better separation between those and the stochastic effects.

1. Interesting follow-up, thanks.

> I'll do that here by looking at the Foster and Gregory residuals.

F+R?

1. Thanks, William, and for the suggestion. Yes, F&R (but Gregory is easier to spell).

2. Hi Nick,

Interesting work. First one gentle suggestion -- add a zero pad factor of four to your Fourier transforms. The signal content in the transform is more readable by humans not endowed with Fourier interpolation skills when you zero pad. ;-)
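Carrick's zero-padding suggestion is easy to illustrate: padding adds no information, but it interpolates the spectrum onto a finer frequency grid, so peaks are easier to locate by eye. A small numpy sketch with a synthetic 45-month cycle:

```python
import numpy as np

# a short record: 64 monthly samples of a ~45-month cycle
t = np.arange(64)
x = np.sin(2 * np.pi * t / 45)
x = x - x.mean()

# raw transform: frequency bins every 12/64 = 0.1875 cycles/yr
f_raw = np.fft.rfftfreq(64, d=1 / 12)
p_raw = np.abs(np.fft.rfft(x)) ** 2

# zero pad by a factor of 4: bins every 12/256 ~ 0.047 cycles/yr;
# same information, but the peak shape is interpolated
f_pad = np.fft.rfftfreq(256, d=1 / 12)
p_pad = np.abs(np.fft.rfft(x, n=256)) ** 2

peak_raw = f_raw[np.argmax(p_raw)]
peak_pad = f_pad[np.argmax(p_pad)]
```

With the coarse grid the peak snaps to the nearest bin; the padded estimate falls much closer to the true 12/45 ≈ 0.27 cycles per year.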

Secondly, it's not surprising that F&R (or any other smoothing algorithm) is going to reduce variance by subtracting some portion of it off. The question is by how much the trend that remains is affected by the smoothing algorithm. Reduced variance does not imply reduced absolute uncertainty.

Three related posts to look at are:

Another “reconstruction” of underlying temperatures from 1979-2012 by Troy Masters.

16 more years of global warming by KevinC.

Estimating the Underlying Trend in Recent Warming by SteveF.

I think the warning by Kevin C is germane here:

Update 21/02/2013: Troy Masters is doing some interesting analysis on the methods employed here and by Foster and Rahmstorf. On the basis of his results and my latest analysis I now think that the uncertainties presented here are significantly underestimated, and that the attribution of short term temperature trends is far from settled. There remains a lot of interesting work to be done on this subject.

1. Carrick,
I mainly wanted to show that the 3-4 yr cycle was real (I was surprised by the acf), and that it came down with F&R.

What I'm mainly trying to get at with F&R is separating stochastic and other variation. In a way they both contribute to trend uncertainty, but the forcings part can't properly be modelled by stochastics.

I've been trying to tease out whether AR(1) really is inadequate - it looks so, but the acf goes negative, which then makes it look more reasonable. However, that acf behaviour is probably due to the periodics etc. Actually, KC's warning is unclear, because I think he was already using ARMA(1,1). It may refer to something in the video, but I now can't see that. But yes, there remains a lot of interesting work to do.

3. Hi Nick,

Interesting work. I have done some simple comparisons of residual variability in the temperature trends for CRUT4 after removing (as best possible) ENSO+solar and volcanic effects. I used centered moving averages of different lengths (3, 5, 7, 9 ... 61 months) to try to see where there is significant periodic influence (when the length of the moving average filter is near the period of an oscillation, the variability drops sharply). It looks to me like there are a few significant periodic contributions: annual, ~21-23 months, and ~47-49 months, which seem to correspond to some of the peaks in your graphic above. On further looking, the ~21-23 month contribution is ~100% in the northern hemisphere temperatures, is more common in winter than summer, and looks like it may be due to the stability of the polar vortex (the negative part of the oscillation seems to happen when there is a sudden stratospheric warming and loss of vortex stability, and the positive part when the polar vortex is more stable than usual). The ~4 year oscillation seems to be mainly a southern hemisphere contribution; I have no idea of the physical cause.
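One way to read Steve's probe: the variance of a centered moving average drops sharply once the window length matches the period of an oscillatory component, because the filter averages that component to near zero. A synthetic sketch (invented series with a planted 45-month cycle, not the CRUT4 data):

```python
import numpy as np

def smoothed_var(x, win):
    """Variance of a centered moving average of odd length win."""
    k = np.ones(win) / win
    return np.convolve(x, k, mode="valid").var()

rng = np.random.default_rng(3)
t = np.arange(12 * 34)
x = np.sin(2 * np.pi * t / 45) + 0.3 * rng.standard_normal(len(t))

# remaining variability for each filter length: it collapses once the
# window length matches the period of the oscillatory component
vars_ = {w: smoothed_var(x, w) for w in range(3, 62, 2)}
```

The variance at a 45-month window is a small fraction of that at shorter windows, which is the sharp drop Steve describes.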

Anyway, applying your analysis separately to the northern and southern hemisphere data might be interesting. If you can grab the CRUT4 data by latitude and eliminate the tropics (look only above 30N and below 30S) nearly all the ENSO influence disappears, and the non-ENSO contributions may be clearer.

SteveF (Steve Fitzpatrick)

1. Thanks, Steve,
So far I haven't really been trying to identify and explain the periodic effects, but just to separate them from the stochastics to get a better estimate of trend uncertainty - or at least the part of uncertainty that can be attributed fairly clearly to randomness. But a hemisphere separation would be interesting - I'll try it. I'm currently looking at trying to separate them in the frequency domain - it may work better.

I think the periodicities you describe are indeed what I am seeing, and your suggested causes sound very interesting.

4. I do hope to see evidence of your work Steve F.

Tamino's original work didn't click with me until I saw Kevin C, Icarus, and Kosaka & Xie show how well the SOI fluctuations lined up with the global temperature time series, and also how well it worked all the way back to 1880!

The SOI signal appears very close to a red noise source as it has strong regression to the mean and a mean that is stably near zero over the last 130 years. It may be straightforward to duplicate a random SOI-like signal by applying an Ornstein-Uhlenbeck process with a hopping rate and drag term. The SOI is unique in that it's a linear combination of two other random processes, the Tahiti barometric pressure time series and the Darwin BP time series. If these two are not individually autocorrelated any more strongly than red noise, then the combination won't be either, as the two time series will get out of phase quickly.
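The Ornstein-Uhlenbeck suggestion is easy to sketch: discretized with an Euler step, an O-U process is just AR(1), with mean reversion set by the drag term. The rate and noise parameters below are illustrative only, not fitted to the SOI:

```python
import numpy as np

# Euler-Maruyama discretization of an Ornstein-Uhlenbeck process:
#   dX = -theta * X dt + sigma dW
# With monthly steps this is AR(1) with rho = 1 - theta*dt.
rng = np.random.default_rng(2)
theta, sigma, dt = 1.5, 1.0, 1 / 12      # illustrative rates, per year
n = 12 * 130                             # ~130 years of monthly values
x = np.zeros(n)
for i in range(1, n):
    x[i] = x[i - 1] - theta * x[i - 1] * dt \
           + sigma * np.sqrt(dt) * rng.standard_normal()

# strong reversion to a mean near zero, like the SOI
xm = x - x.mean()
rho1 = xm[:-1] @ xm[1:] / (xm @ xm)      # lag-1 autocorrelation
```

The simulated series stays centred near zero with lag-1 autocorrelation close to 1 − θ·dt, the red-noise behaviour described above.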

I posted my more complete analysis here:
http://contextearth.com/2013/10/04/climate-variability-and-inferring-global-warming/

Thanks Nick for pointing out Tamino again. Hats off to Tamino, who has been pushing this idea of extracting the signal from the noise.

1. WHT, thanks again for the link - I think we're looking at similar things. I hadn't heard of Eureqa, but it looks very powerful. It may be just what I need.

5. WHT,

The ENSO is neither a red noise process nor an Ornstein–Uhlenbeck process; it is a pseudo-oscillation, and a causal, not random process. A strongly positive current ENSO state means that the ENSO state at a characteristic time in the future (corresponding roughly to the average "half-period" of the pseudo-oscillation) is most likely going to be negative (and vice-versa, for a strongly negative current ENSO state, of course).

WRT Kosaka & Xie: they fix the ocean temperature in the tropical Pacific, and (lo and behold) their model better tracks the real world data; but considering that lots of people have shown very strong correlation between tropical Pacific temperature variation and global temps and rainfall patterns, I don't see there is much of a surprise there; it would be a surprise if forcing the model's tropical Pacific to match the Earth didn't make the model much better match the Earth. What Kosaka & Xie do not account for is the effective removal of heat from the model world when they force the tropical Pacific to cool (on average) to match the real world. Yup, take a bunch of heat out of the system, and the model temperature does not rise so quickly... I'm underwhelmed.

SteveF

6. SteveF, I will meet you half-way and call it a semi-Markov process. It is not quite memory-less but the randomness is strong enough that there are no identifiable periods. In other words, it is highly dispersive yet retains strong reversion to the mean properties. Take an FFT of the SOI process and it has a weak hump of a semi-Markov process but that's about it for structure. Here is the FFT along with a power spectral profile of a semi-Markov process that has a significant dispersion about a mean value.

http://img822.imageshack.us/img822/4835/d6y6.gif

The lack of structure is not too surprising as the majority of ocean waves are also highly dispersive (except for such types as ripple waves and cnoidal waves), and don't show identifiable periods. This dispersion is all too frequent in nature.

But all that really does not matter because the important point is that the SOI noise signal can be subtracted from the global temperature time series. This allows us to improve the SNR and estimate the AGW trend much more easily.

Tamino and all the smart dudes are very excited by what Kosaka & Xie are doing. But it's not exactly overwhelming either, as it is so easy to do the compensation if one just wants to work with the SOI time series !

And no more pause or hiatus, which is nice to be able to get rid of.

1. I'm rather agreeing with Steve here. The acf's show strong periodicity, which suggests "teleconnection" in the time domain - apparent correlation between well separated points. I agree that it can be subtracted as a fix, but that doesn't entirely help with the uncertainty estimate. For future climate, you could say that we'd do the same calc, but there is uncertainty associated with identifying the ENSO component.

7. Nick, If I am reading your autocorrelation charts correctly, you only ran it for data from 1980. Any strong periodicities you see are likely merely artifacts.

I spent the last year doing power spectral analysis on terrain features, and explored the whole range of Markov to semi-Markov disorder.
http://entroplet.com/context_select/navigate?category=fine_terrain
http://entroplet.com/ref/foundation/B-terrain_characterization.pdf

I want to warn you about chasing phantoms on the temperature time series. Any predictive power will be low because there really is not a lot of data to generate a quality spectrum from, and like I said, the dispersion is high.

The SOI is already detrended, so use that as a trial and see what you can find over the entire range.

1. Yes, only since 1980 - basically for the reason Tamino gave. If you go much further back, there is an obvious non-linear component. But I'm happy to look at longer periods when either the acf or power spectrum (my current focus) gives a clear separation between deterministic (approx) and stochastic deviation.

As I've said, I was really looking at periodicity as something that confounds the stochastic modelling. That is up to about 4 years period. So there are plenty of periods in 33 years.

2. This is an example of an autocorrelation of a semi-Markov series plotted in comparison to the SOI autocorrelation over the span of 130 years. The semi-Markov is completely stochastic, pulling random draws from a dispersive distribution, weakly centered about a mean period.

http://img513.imageshack.us/img513/1846/7sg4.gif

The model above has a mean of about 5.5 years but lots of dispersive variance about that value.
This dispersion is high enough that it makes it difficult to predict the duration to the next peak or valley except perhaps probabilistically.