Carrick cited Savitzky-Golay filters. I hadn't paid these much attention, but I see the relevant feature here is something that I had been using for a long time. If you want a linear convolution filter to return a derivative, or second derivative etc, just include test equations applying to some basis of powers and solve for the coefficients.
I've been writing a post on this for a while, and it has grown long, so I'll split in two. The first will be mainly on the familiar linear trends - good and bad points. The second will be on more general derivatives, with application to global temperature series.
Trends
We spend a lot of time talking about linear trends, as a measure of rate of warming (or pause etc). I've made a gadget here to facilitate that, even though I think a lot of the talk is misguided. Sometimes, dogmatic folk insist that trends should not be calculated without some prior demonstration of linear behaviour.
A silly example of this came with Lord Donoughue, in the House of Lords, monstering a Minister and the Met Office, with Doug Keenan pulling the strings. The question was a haggle over whether the rise was significant, with Keenan badgering the MO to calculate it his pet way, the MO saying (reasonably) that they don't talk much about trends, accusations of the MO using inappropriate models etc. It really isn't that hard. A trend is just a weighted sum of readings. It has the same status as any other kind of average, and it has uncertainty like that of the standard error of the mean.
Trend as derivative
But a time series trend β can be seen as just a weighted average of derivatives. To see this in integral form:
β=∫xy dx/∫x² dx
where x is from -x0 to x0. Integrating by parts:
β=∫W(x0,x)y'(x) dx where W=(x0²-x²)/(2∫x² dx), so that ∫W dx=1
W is a (Welch) taper which is zero at the ends of the integration range. I'll be using it more. But while it damps high frequencies, with roll-off O(1/f²) in the frequency domain, the differentiation itself brings back a factor of f, so net effect is O(1/f).
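The integration by parts can be checked numerically. This is a minimal sketch, not part of the original calculation: the test series `y`, the half-width `x0 = 5` and the helper `integrate` are all assumptions made for illustration. The taper is normalised to unit integral, which is what makes β a weighted average of y'.

```python
import numpy as np

x0 = 5.0                                  # assumed half-width of the interval
x = np.linspace(-x0, x0, 20001)           # centred grid

def integrate(f):
    # trapezoidal rule on the fixed grid x (hypothetical helper)
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x)))

# hypothetical test series: offset + linear trend + a cycle
y = 2.0 + 0.3 * x + np.sin(1.7 * x)
dy = 0.3 + 1.7 * np.cos(1.7 * x)          # its analytic derivative

# regression form of the trend
beta_regression = integrate(x * y) / integrate(x**2)

# derivative form: Welch taper, zero at the ends, unit integral
W = (x0**2 - x**2) / (2.0 * integrate(x**2))
beta_derivative = integrate(W * dy)

print(beta_regression, beta_derivative, integrate(W))
```

The two estimates should agree to within the quadrature error, and the taper should integrate to 1.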
In the next post I'm looking at the effect of better noise suppression.
Trend as Savitzky operator
In my version of the Savitzky process for time series (x), I take a filter W and ask for the polynomial P that satisfies constraints
∑ P(x) W(x) xⁱ = cᵢ (summed over x, for i = 0, 1, ...)
where the number of coefficients of P equals the number of constraints. This is a linear system in the coefficients of P. P(x) W(x) will be the operator. For a differentiation operator, c=(0,1). You can go to higher order with c=(0,1,0,0) etc. For second derivative, c=(0,0,2).
Symmetry helps. If W is symmetric (x is centered), the odd moments of W vanish, so for c=(0,1) the polynomial is P(x) = x/∑x²W,
and the trend coefficient of series y is β = ∑xyW/∑x²W
If W is the boxcar filter, value 1 on a range, this is the OLS regression formula.
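The constraint system is small enough to solve directly. Below is a minimal sketch under stated assumptions: the function name `sg_operator`, the monthly 10-year grid and the synthetic series are mine, chosen only for illustration. It builds the moment matrix, solves for the coefficients of P, and checks that a boxcar W reproduces the OLS slope.

```python
import numpy as np

def sg_operator(W, x, c):
    """Weights P(x)*W(x) satisfying sum_x P(x) W(x) x^i = c_i."""
    c = np.asarray(c, dtype=float)
    powers = np.vander(x, len(c), increasing=True)   # columns x^0, x^1, ...
    # moment matrix M[i, j] = sum_x W(x) x^(i+j)
    M = powers.T @ (W[:, None] * powers)
    a = np.linalg.solve(M, c)                        # coefficients of P
    return (powers @ a) * W

# assumed grid: 121 monthly points, centred, in years
x = (np.arange(121) - 60) / 12.0

# hypothetical noisy series with a small trend
rng = np.random.default_rng(0)
y = 0.02 * x + rng.normal(size=x.size)

# boxcar filter with c=(0,1): should be the OLS trend
boxcar = np.ones_like(x)
beta_op = sg_operator(boxcar, x, [0, 1]) @ y
beta_ols = np.polyfit(x, y, 1)[0]

# Welch taper: still returns the exact slope of a straight line
W_welch = 1.0 - (x / x[-1])**2
beta_line = sg_operator(W_welch, x, [0, 1]) @ (0.02 * x + 3.0)

# second derivative operator, c=(0,0,2), applied to x^2
d2 = sg_operator(W_welch, x, [0, 0, 2])

print(beta_op, beta_ols, beta_line, d2 @ x**2)
```

Any operator built this way is exact on polynomials up to the constrained order, whatever W is; the choice of W only changes how noise is treated.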
OLS trend as minimum variance estimator
A useful property to remember about the ordinary mean is that it is the number which, when subtracted, minimises the sum of squares. There is a corresponding property for OLS trend. It is the operator which, of all those Vᵢ satisfying
Vᵢxᵢ=1 (summation convention)
has minimum sum squares VᵢVᵢ. That is just an orthogonality property. And since for any time series y, Vᵢyᵢ is the trend estimate β, the variance of that estimate is (VᵢVᵢ)* var(y). So of all eligible V, the OLS trend has minimum variance.
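The orthogonality argument can be seen in a few lines. This is a sketch only: the grid, the Welch-tapered comparison operator and the Monte Carlo settings are assumptions, and the variance formula is checked for white noise (independent, unit variance), which is the case the formula above describes.

```python
import numpy as np

# assumed grid: 121 monthly points, centred, in years
x = (np.arange(121) - 60) / 12.0

# OLS trend operator and a Welch-tapered alternative; both satisfy V.x = 1
V_ols = x / np.sum(x**2)
W = 1.0 - (x / x.max())**2
V_welch = x * W / np.sum(x**2 * W)

ss_ols = V_ols @ V_ols
ss_welch = V_welch @ V_welch

# the difference is orthogonal to V_ols (which is proportional to x),
# so ss_welch = ss_ols + |diff|^2 >= ss_ols
diff = V_welch - V_ols
print(ss_ols, ss_welch, diff @ V_ols)

# Monte Carlo check under the white-noise assumption: var(beta) ~ V.V
rng = np.random.default_rng(1)
noise = rng.normal(size=(20000, x.size))
mc_var = (noise @ V_ols).var()
print(mc_var, ss_ols)
```

The same decomposition applies to any other V with Vᵢxᵢ=1, so the extra variance is exactly the squared length of its difference from the OLS operator.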
The good and bad
So trend is a minimum variance estimate of derivative, but with poor noise damping. I'll compare next with operators where W is a quadratic taper, coming down to 0 at the ends, so continuous (Welch window). As a smoother, it thus gives high frequency roll-off O(1/f²). W² (Parzen window) then has continuous derivative, and roll-off O(1/f³).
So here is a plot of the spectra, tapers centered so the spectrum is pure imaginary. The OLS trend operator is colored red, and given a 10-year period (width).
- Each of the operators is linear for low frequencies, as differentiation requires. As the frequency 1/width = 10/Cen is approached, the response starts to taper. This is the effect of smoothing at higher frequencies. The smoother tapers have a later cut-off, because they are effectively narrower in the time domain.
- Each operator then has some band-pass character. This will show in their behaviour. It is an inevitable consequence of combining a linear start with a hf roll-off.
- You can see the slow 1/f roll-off of the OLS trend operator at high frequencies, compared to the other operators. This is the bad feature of trend as a derivative. If you have decided on a cut-off, you want it to be observed. The operator is no longer differentiating properly (linear), and so is unhelpful at higher frequencies. It is best if it fades quickly.
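For readers who want to reproduce the qualitative behaviour of these spectra, here is a rough sketch. It is not the code behind the plot: the fine grid, the frequency bands and the names `derivative_operator` and `response` are assumptions. It uses a 10-year width, measures frequency in cycles per year, and compares the boxcar (OLS), W and W² operators.

```python
import numpy as np

width = 10.0                                    # years
x = np.linspace(-width / 2, width / 2, 2001)    # fine centred grid

def derivative_operator(W):
    # symmetric taper W on centred x: weights x*W / sum(x^2 W)
    return x * W / np.sum(x**2 * W)

def response(V, freqs):
    # imaginary part of sum_x V(x) exp(2*pi*i*f*x); the real part
    # vanishes because V is odd, so the spectrum is pure imaginary
    return np.array([np.sum(V * np.sin(2 * np.pi * f * x)) for f in freqs])

welch = 1.0 - (x / (width / 2))**2
operators = {
    "OLS (boxcar)": derivative_operator(np.ones_like(x)),
    "Welch W": derivative_operator(welch),
    "W^2": derivative_operator(welch**2),
}

f_low = 0.001                                   # well below 1/width
band = np.linspace(1.0, 2.0, 201)               # well above 1/width

summary = {}
for name, V in operators.items():
    # ratio to the ideal differentiator response 2*pi*f at low frequency
    slope = response(V, [f_low])[0] / (2 * np.pi * f_low)
    # largest leakage in the high-frequency band
    peak = np.max(np.abs(response(V, band)))
    summary[name] = (slope, peak)
    print(name, slope, peak)
```

Each operator should show a low-frequency ratio close to 1 (proper differentiation), while the high-frequency leakage should be largest for the OLS operator and smallest for W².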