moyhu: Trends, breakpoints and derivatives

Wednesday, January 21, 2015

Trends, breakpoints and derivatives

This post partly follows up a comment by Carrick on acceleration in time series. We talk a lot about trends, using them in effect as an estimate of the derivative. They are a pretty crude estimate, and I have long thought we could do better. Acceleration is of course the second derivative.

Carrick cited Savitzky-Golay filters. I hadn't paid these much attention, but I see the relevant feature here is something that I had been using for a long time. If you want a linear convolution filter to return a derivative, or second derivative etc, just include test equations applying to some basis of powers and solve for the coefficients.

I've been writing a post on this for a while, and it has grown long, so I'll split in two. The first will be mainly on the familiar linear trends - good and bad points. The second will be on more general derivatives, with application to global temperature series.

Trends

We spend a lot of time talking about linear trends, as a measure of rate of warming (or pause etc). I've made a gadget here to facilitate that, even though I think a lot of the talk is misguided. Sometimes, dogmatic folk insist that trends should not be calculated without some prior demonstration of linear behaviour.

A silly example of this came with Lord Donoughue, in the House of Lords, monstering a Minister and the MetOffice, with Doug Keenan pulling the strings. The question was a haggle over significant rise, with Keenan badgering the MO to calculate it his pet way, the MO saying (reasonably) that they don't talk much about trends, accusations of the MO using inappropriate models etc. It really isn't that hard. A trend is just a weighted sum of readings. It has the same status as any other kind of average, and it has uncertainty like that of the standard error of the mean.

Trend as derivative

But a time series trend β can be seen as just a weighted average of derivatives. To see this in integral form:
β = ∫xy dx / ∫x² dx
where x is from -x0 to x0. Integrating by parts:

β = ∫W(x0,x) y'(x) dx, where W = (x0²-x²) / (2∫x² dx)
W is a (Welch) taper which is zero at the ends of the integration range, and it integrates to 1, so β is a true weighted average of y'. I'll be using it more. But while it damps high frequencies, with roll-off O(1/f²) in the frequency domain, the differentiation itself brings back a factor of f, so the net effect is O(1/f).
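
The identity is easy to check numerically. Here is a minimal Python sketch, assuming NumPy and a made-up smooth test signal (the signal, the grid and the helper name `integrate` are illustrative choices, not part of the original working). It uses the weight normalised to integrate to 1, W = (x0²-x²)/(2∫x² dx), and compares the trend with the W-weighted average of the exact derivative.

```python
import numpy as np

def integrate(f, x):
    """Trapezoidal rule for samples f on the grid x."""
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x)))

x0 = 5.0
x = np.linspace(-x0, x0, 20001)

# Hypothetical smooth test signal and its exact derivative
y = np.sin(1.3 * x) + 0.2 * x + 0.05 * x**3
dy = 1.3 * np.cos(1.3 * x) + 0.2 + 0.15 * x**2

sxx = integrate(x**2, x)

# Trend in integral form: beta = int(x y) / int(x^2)
beta_trend = integrate(x * y, x) / sxx

# Same number as a Welch-weighted average of the derivative
W = (x0**2 - x**2) / (2.0 * sxx)
beta_deriv = integrate(W * dy, x)

print(beta_trend, beta_deriv, integrate(W, x))
```

The two estimates should agree to within the quadrature error, and the last number printed should be very close to 1, which is what makes β an average of y' rather than just a weighted sum.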

In the next post I'm looking at the effect of better noise suppression.

Trend as Savitzky operator

In my version of the Savitzky process for a time series on points x, I take a filter W and ask for the polynomial P that satisfies the constraints
∑_x P(x) W(x) xⁱ = c_i,  i = 0, 1, ...
where the number of coefficients of P equals the number of constraints. This is a linear system in the coefficients of P. P(x) W(x) will be the operator. For a differentiation operator, c=(0,1): the operator returns 0 for a constant and 1 for x. You can go to higher order with c=(0,1,0,0) etc. For the second derivative, c=(0,0,2), since the second derivative of x² is 2.
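
A minimal sketch of that linear system in Python, assuming NumPy. The function name `sg_operator`, the 21-point grid and the particular Welch taper are my own illustrative choices. The matrix A holds the moments ∑W xⁱ⁺ʲ, so solving A p = c gives the coefficients of P.

```python
import numpy as np

def sg_operator(x, W, c):
    """Weights P(x)*W(x) such that sum_x P(x) W(x) x**i = c[i] for each i.

    P has len(c) coefficients, so the constraints form a square linear system.
    """
    x = np.asarray(x, dtype=float)
    W = np.asarray(W, dtype=float)
    c = np.asarray(c, dtype=float)
    powers = np.vander(x, len(c), increasing=True)  # columns x**0, x**1, ...
    A = powers.T @ (W[:, None] * powers)            # A[i, j] = sum W x**(i+j)
    p = np.linalg.solve(A, c)                       # coefficients of P
    return (powers @ p) * W

# Centred grid and a Welch taper that reaches zero just outside the ends
x = np.arange(-10, 11, dtype=float)
W = 1.0 - (x / 11.0) ** 2

d1 = sg_operator(x, W, [0, 1])      # first-derivative operator
d2 = sg_operator(x, W, [0, 0, 2])   # second-derivative operator

# Test on a quadratic, where the derivatives at x = 0 are known exactly
y = 3.0 + 2.0 * x + 0.5 * x**2
print(d1 @ y, d2 @ y)
```

For this quadratic the two numbers should come out at y'(0) = 2 and y''(0) = 1, to rounding error. The first-derivative operator gets the quadratic right only because W is symmetric, which makes the operator odd.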

Symmetry helps. If W is symmetric (x is centered),
P(x) = x / ∑x²W
and the trend coefficient of series y is
β = ∑xyW / ∑x²W
If W is the boxcar filter, value 1 on a range, this is the OLS regression formula.
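
As a quick check of the boxcar case, here is a small Python sketch, assuming NumPy. The synthetic series, the seed and the helper name `weighted_trend` are illustrative only. It compares the weighted formula with W = 1 against the slope from `np.polyfit`.

```python
import numpy as np

def weighted_trend(x, y, W):
    """beta = sum(x*y*W) / sum(x^2*W), for centred x and symmetric W."""
    return np.sum(x * y * W) / np.sum(x**2 * W)

rng = np.random.default_rng(0)
x = np.arange(-60, 61, dtype=float)           # centred time index
y = 0.02 * x + rng.normal(size=x.size)        # made-up trend plus white noise

boxcar = np.ones_like(x)
beta_boxcar = weighted_trend(x, y, boxcar)
beta_ols = np.polyfit(x, y, 1)[0]             # ordinary least squares slope

welch = 1.0 - (x / 61.0) ** 2
beta_welch = weighted_trend(x, y, welch)      # tapered alternative

print(beta_boxcar, beta_ols, beta_welch)
```

The first two numbers should be identical to rounding. The agreement relies on x being centered, so that the mean of y drops out of the sum.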

OLS trend as minimum variance estimator

A useful property to remember about the ordinary mean is that it is the number which, when subtracted, minimises the sum of squares. There is a corresponding property for OLS trend. It is the operator which, of all those V_i satisfying
V_ix_i=1 (summation convention)
has minimum sum of squares V_iV_i. That is just an orthogonality property. And since for any time series y, V_iy_i is the trend estimate β, the variance of that estimate is (V_iV_i)* var(y), taking the readings y_i as independent with equal variance. So of all eligible V, the OLS trend has minimum variance.
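
The orthogonality argument can be illustrated with a short Python sketch, assuming NumPy and unit-variance white noise. The perturbation size, the seed and the variable names are arbitrary choices of mine. Any other eligible V is the OLS operator plus a vector orthogonal to x, which can only add to the sum of squares.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.arange(-10, 11, dtype=float)    # centred time index

V_ols = x / (x @ x)                    # OLS trend operator, V.x = 1

# Build another operator with V.x = 1 by adding something orthogonal to x
d = rng.normal(size=x.size)
d -= (d @ x) / (x @ x) * x             # remove the component along x
V_other = V_ols + 0.01 * d

# Sum of squares: Pythagoras says V_other can only be larger
ss_ols = V_ols @ V_ols
ss_other = V_other @ V_other

# Monte Carlo check that the variance of the estimate is (V.V) * var(y)
noise = rng.normal(size=(20000, x.size))
var_ols = np.var(noise @ V_ols)
var_other = np.var(noise @ V_other)

print(ss_ols, var_ols)
print(ss_other, var_other)
```

In each printed pair the sampled variance should sit within a few percent of the sum of squares, and the OLS pair should be the smaller one.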

The good and bad

So trend is a minimum variance estimate of derivative, but with poor noise damping. I'll compare next with operators where W is a quadratic taper, coming down to 0 at the ends, so continuous (Welch window). As a smoother, it thus gives high frequency roll-off O(1/f²). W² (Parzen window) then has a continuous derivative, and roll-off O(1/f³).

So here is a plot of the spectra, tapers centered so the spectrum is pure imaginary. The OLS trend operator is colored red, and given a 10-year period (width).

Some points:

Each of the operators is linear for low frequencies, as differentiation requires. As the frequency (1/width) = 10/Cen (cycles per century) is approached, the response starts to taper. This is the effect of smoothing at higher frequencies. The smoother tapers have a later cut-off, because they are effectively narrower in the time domain.
Each operator then has some band-pass character. This will show in their behaviour. It is an inevitable consequence of combining a linear start with a hf roll-off.
For the OLS trend, you can see the 1/f roll-off at high frequencies, which is slow compared to the other operators. This is the bad feature of trend as a derivative. If you have decided on a cut-off, you want it to be observed. Beyond the cut-off the operator is no longer differentiating properly (linear), and so is unhelpful at higher frequencies. It is best if it fades quickly.
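
These points can be reproduced without the plot. The following is a rough Python sketch, assuming NumPy, monthly sampling and a 10-year window. The frequency bands and all names are illustrative choices, not the settings used for the figure. Because each operator is odd, its response to exp(2πift) is pure imaginary, and the sketch computes that imaginary part for comparison with the ideal derivative response 2πf.

```python
import numpy as np

def derivative_operator(x, W):
    """Trend-type operator x*W / sum(x^2 W) for centred x, symmetric W."""
    return x * W / np.sum(x**2 * W)

def response(op, x, f):
    """Imaginary part of the frequency response of an odd operator."""
    return np.sin(2.0 * np.pi * np.outer(f, x)) @ op

T = 5.0                                  # half-width in years (10-year window)
x = np.arange(-60, 61) / 12.0            # monthly steps, in years
welch = 1.0 - (x / T) ** 2

ops = {
    "ols": derivative_operator(x, np.ones_like(x)),
    "welch": derivative_operator(x, welch),
    "welch2": derivative_operator(x, welch**2),
}

f_low = np.array([0.01])                 # cycles/year, well below 1/width
f_high = np.linspace(1.0, 3.0, 401)      # cycles/year, well above 1/width

# Ratio to the ideal derivative response at low frequency (should be near 1)
low_ratio = {k: response(op, x, f_low)[0] / (2.0 * np.pi * f_low[0])
             for k, op in ops.items()}

# Largest leakage in the high-frequency band
high_peak = {k: np.max(np.abs(response(op, x, f_high)))
             for k, op in ops.items()}

for k in ops:
    print(k, low_ratio[k], high_peak[k])
```

All three ratios should be close to 1, which is the linear start. The high-frequency peaks should shrink in the order OLS, Welch, Welch², in line with the 1/f, 1/f² and 1/f³ roll-offs of the operators.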

In the next post I'll show the effect of the filters on temperature series, and discuss matters like acceleration and the identification of breakpoints.

8 comments:

Everett F Sargent, January 21, 2015 at 11:18 PM
Nick,

The filter you mention is an example of a FIR filter, as such it 'should' have a finite response function.

http://en.wikipedia.org/wiki/Savitzky%E2%80%93Golay_filter#Treatment_of_first_and_last_points

The above suggested approach 'mirror image' is not a very good idea (at least for an end condition that is 'steep'). I usually try a double flip 'mirror image', but even there it becomes rather tricky (removing and adding a low order polynomial helps there).

I've used FFT (removing a lower polynomial first, and forcing the polynomial through both end points), but this has issues at the lowest and transition band frequencies.

I've also used an IIR filter (two-pass Butterworth with zero padding, again removing a lower polynomial first, then adding it back afterwards, running at quad precision up to N = 80 pole count), this becomes really difficult for short time series as the filter literally rings forever (I'm 'still' working on that one).

For the IIR filter, I use a 9-point FD stencil for the 1st and 2nd derivatives.

Similarly, a FIR LOESS (n = 2 quadratic) 'can' be used (just don't do interpolation as the response is 'lumpy'), again I use a 9-point FD stencil for the 1st and 2nd derivatives.

AFAIK, all filters suffer from the end point problem inherent in finite time series (you only have half the information at the end as you do in the middle).

Don't have the necessary math skills, I always test with a very long white noise time series, chop the series up, and look at the various filter end effects.

Carrick, January 22, 2015 at 7:52 AM
Nick, I should mention that the problem for derivative filters is slightly easier with climate (and many related signal types) than the general problem, because the spectrum already has a 1/f^nu, nu > 1, character to it.

Like you, I don't stick with a rectangular window for general data (I typically find rectangular windows work well enough for temperature data usually). Instead, I use window functions that include Welch, Hann, Blackman and modified Gaussian (this is a tunable window, which has advantages).

JCH, January 26, 2015 at 8:20 AM
I think I mean the re-trended PDO, which looks pretty much like the SAT until around 1980 to 1985, at which time it makes a prolonged excursion in another direction.

Greg Goodman, March 4, 2015 at 5:48 AM
"A useful property to remember about the ordinary mean is that it is the number which, when subtracted, minimises the sum of squares. There is a corresponding property for OLS trend. "

Isn't this the same as saying that OLS trend is identical to mean of dT/dt ?