moyhu: Detecting periodicity

Thursday, May 29, 2014

Detecting periodicity

I've been following, and occasionally joining in, a series of threads by Willis Eschenbach at WUWT. Here is the latest, and you can follow links back.

Willis is trying to detect periodicity in various segments (century or so) of monthly or annual data. Lately he has been trying to decide whether sunspot (SSN) periodicities can be observed in climate data. This involves first establishing the major periodicities in SSN, and then looking at periodicities in signals which might be responding.

In the early part, Willis was fitting sinusoids using a least squares fit by optimisation. In this comment, I showed that the quadratic form could be expressed as a fairly orthodox discrete Fourier Transform expression, and wrote code to show how the same result could be derived. Willis then recharacterised what he was doing as a Slow Fourier Transform; here I try to put that in context.

I was not, BTW, trying to suggest that what Willis was doing was wrong; only to say that it could be done more efficiently by FFT methods (which involve padding), and that there is a long history behind it. In fact someone characterised it as a Lomb periodogram, and it certainly belongs to the class of least-squares spectral analysis (LSSA) methods.

In the most recent version, he is using regression of the data onto sinusoids, and investigating the effect of using monthly or annual time divisions. Tamino chimed in in a positive spirit (the thread was called "Well, Color Me Gobsmacked!"), and said the method was known as the Date-Compensated Discrete Fourier Transform.

I think Willis now regards the chief advantage of the method as allowing for irregularly spaced data, in his case missing months, and that is what the DCDFT is for.

The method is sometimes criticised as not being a truly orthogonal decomposition. I think that criticism is misplaced. You can look for periodicities in a Fourier decomposition, but it is not the only way. I think Willis is basically filtering. He creates the equivalent of a perfectly tuned circuit, which takes the data as input, and takes the amplitude of the response as the measure of periodicity. I think that is fine, and so I'd like in this post to say how one might assess and maybe improve the merit of that measure.

It's like what you did with the old AM radio receiver. You shifted the tuned frequency with a knob and listened to the sound level to find stations.

What I want to do in this post is to show how you can get past all the issues of missing data and orthogonality, and rephrase in terms of Fourier integrals, which exposes the sources of error. I'm expanding on this comment.

Willis' method as a filter

For any given frequency ω, Willis uses the R lm() function to fit a+b*cos(ωt)+c*sin(ωt). He then returns √(b²+c²) as the amplitude of the fitted sinusoid. He usually plots this against period, not frequency, and looks for peaks.
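To make the procedure concrete, here is a minimal sketch in Python rather than R. It is not Willis' code: the helper names solve3 and sinusoid_amplitude, and the synthetic 11-year test signal, are assumptions made for illustration. For one frequency it solves the same least-squares problem that lm() would, and returns √(b²+c²).

```python
import math

def solve3(A, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def sinusoid_amplitude(t, y, omega):
    """Least-squares fit of a + b*cos(omega*t) + c*sin(omega*t); returns sqrt(b^2 + c^2)."""
    basis = [[1.0, math.cos(omega * ti), math.sin(omega * ti)] for ti in t]
    # Normal equations (X^T X) beta = X^T y for the three regressors
    XtX = [[sum(row[i] * row[j] for row in basis) for j in range(3)] for i in range(3)]
    Xty = [sum(row[i] * yi for row, yi in zip(basis, y)) for i in range(3)]
    a, b, c = solve3(XtX, Xty)
    return math.hypot(b, c)

# Hypothetical test data: 200 annual samples of an 11-year sinusoid plus an offset
t = list(range(200))
y = [2.0 + 1.5 * math.cos(2 * math.pi * ti / 11.0 + 0.7) for ti in t]

# Scan periods from 5 to 20 years and pick the one with the largest amplitude
periods = [p / 10 for p in range(50, 201)]
amps = [sinusoid_amplitude(t, y, 2 * math.pi / p) for p in periods]
best_period = periods[amps.index(max(amps))]
print(best_period)
```

Plotting amps against periods gives the kind of curve Willis looks at; the phase, which could be recovered from b and c, is discarded here as in his method.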

This is a linear filter on the input. Normally we think of a filter as modifying one time series to another. This is a perfectly tuned filter, in that it produces a pure sinusoid as output. That is rather boring, in that it can be characterised by two numbers, amplitude and phase, and Willis drops the phase. But it is still a filter. You can think of it as a tuned circuit (or bell or cavity etc) which responds to frequencies near its resonance.

Expression as an integral

I'll assume that the data represents samples from a continuous function f(t). This assumption is basic to any continuum analysis. You have to assume that the data you have is representative of the values in between. f(t) could, for example, be a linear interpolate.

The sum Willis is evaluating in lm() can then be expressed as an integral:
∫ Ψ(t) cos(ωt+θ) f(t) dt
Here ω is the angular frequency and θ the phase. All integrals here are from -∞ to ∞, but Ψ(t) is a sum of a finite number of delta functions. I've chosen Ψ to suggest a shah (Ш) function, also called a Dirac comb of delta functions. Generally the deltas will be mostly equally spaced, but that is not a requirement. The comb is truncated.

So now we can see what kind of filter it is, by looking at the Fourier Transform. The FT of cos is just a paired delta function that shifts the spectrum of Ψ(t) to center on ±ω. So if Ψ has, say, N=200 delta functions with s=1 annual spacing, the Fourier transform (t origin at the middle) is
sinc(Nω)/sinc(sω)
where sinc(x)=sin(x)/x
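As a check on the shape of this transform, the following sketch sums the cosines directly at N equally spaced sample times centred on t=0 and compares the result with a closed form. The convention is an assumption on my part: with ω as angular frequency and x = sω/2, the sum is sin(Nx)/sin(x), which is N·sinc(Nx)/sinc(x), so it has the sinc-ratio form above up to how ω is scaled and normalised. The function names are illustrative only.

```python
import math

def comb_transform(times, omega):
    """Sum of cos(omega * t_k) over the sample times.

    For times symmetric about 0 this is the whole transform of the comb,
    because the sine (imaginary) parts cancel in pairs.
    """
    return sum(math.cos(omega * tk) for tk in times)

def closed_form(n, s, omega):
    """sin(n*x)/sin(x) with x = s*omega/2, using the limit where sin(x) vanishes."""
    x = 0.5 * s * omega
    if abs(math.sin(x)) < 1e-12:
        # L'Hopital limit of sin(n*x)/sin(x) at x = m*pi
        return n * math.cos(n * x) / math.cos(x)
    return math.sin(n * x) / math.sin(x)

N, s = 200, 1.0
times = [(k - (N - 1) / 2) * s for k in range(N)]  # -99.5, ..., 99.5 years

for f in (0.0, 0.013, 0.1, 0.37):  # frequencies in cycles per year
    omega = 2 * math.pi * f
    print(f, comb_transform(times, omega), closed_form(N, s, omega))
```

The two columns should agree to rounding error, with the value N at zero frequency.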

It looks like this in the frequency domain:

and the amplitude looks like this:

It is symmetric about 0. When combined with the sinusoid, it is displaced. Willis is looking for periods around 10 years; using that as a test frequency gives:

I think this shows a number of things. There is a peak at 1/10 yr^-1. It has side lobes, which represent the effect of the finite (200 yr) interval. These affect the ability to discriminate peaks. It is actually periodic overall, with period 0.5 yr^-1 (the FT alternates in sign and has period 1 yr^-1). That shows problems when you get up to the Nyquist frequency. But the peaks are fairly sharp, and the Nyquist troubles far away, at least if you are looking at periods of 10 years or so.

Irregular points

The sinc function ratio is actually the sum of sinusoids evaluated at the sample points. You can evaluate at a set less regularly spaced, with missing values say, and the function won't be radically different. And of course, the method still works.
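A rough numerical illustration of this point: the sketch below evaluates the normalised kernel (the magnitude of the sum of complex exponentials at the sample points, divided by the number of points) for a complete 200-point annual series and for the same series with 15 values removed. The list of missing indices is made up for the example, and the function name is illustrative.

```python
import cmath
import math

def normalised_kernel(times, freq):
    """|sum_k exp(-2*pi*i*freq*t_k)| / (number of samples); equals 1 at freq = 0."""
    total = sum(cmath.exp(-2j * math.pi * freq * tk) for tk in times)
    return abs(total) / len(times)

N = 200
full = [k - (N - 1) / 2 for k in range(N)]  # complete annual series, centred on 0

# Hypothetical gaps: 15 missing values, some isolated and some in short runs
missing = {7, 8, 23, 41, 42, 43, 77, 90, 118, 119, 150, 151, 152, 153, 181}
gappy = [tk for k, tk in enumerate(full) if k not in missing]

# Compare the two kernels from zero frequency up to 0.5 cycles per year
freqs = [j / 1000 for j in range(0, 501)]
max_diff = max(abs(normalised_kernel(full, f) - normalised_kernel(gappy, f))
               for f in freqs)
print(len(gappy), max_diff)
```

By the triangle inequality, removing 15 of 200 points cannot change the normalised kernel by more than 2×15/185 ≈ 0.16 at any frequency, and both kernels are exactly 1 at zero frequency, so the central peak survives the gaps.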

Other ideas

In recent discussion, I've put forward some other analogies. The method is actually like the way analogue spectral analysis was done: tuned circuits, acting as filters, responded selectively to the signal. If you aren't familiar with circuits, think of a bank of lossless tuning forks mounted on a sounding board. You buzz the burst of signal, and see which forks pick up the most energy.

The mathematics of the regression Willis uses is in fact the math of heterodyning. You multiply by a sinusoid and add. If the signal can reasonably be fitted with a nearby sinusoid, then the trig formula for products will yield a beat frequency, lower in pitch as the two frequencies are closer. The addition is a low pass filter, passing with amplitude inversely proportional to frequency. So its response is approximately proportional both to the strength of the periodic component and to its closeness in frequency.
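The heterodyne step can be written out explicitly. This is a sketch for the idealised continuous case, assuming a signal cos(ω₀t) observed over an interval of length T centred on t=0 and a test frequency ω; the symbols ω₀ and T are introduced here for illustration.

```latex
\cos(\omega t)\cos(\omega_0 t)
  = \tfrac{1}{2}\left[\cos\big((\omega-\omega_0)t\big) + \cos\big((\omega+\omega_0)t\big)\right]

\int_{-T/2}^{T/2} \cos(\omega t)\cos(\omega_0 t)\,dt
  = \frac{\sin\big((\omega-\omega_0)T/2\big)}{\omega-\omega_0}
  + \frac{\sin\big((\omega+\omega_0)T/2\big)}{\omega+\omega_0}
```

The first term is the beat: it tends to T/2 as ω approaches ω₀ and otherwise falls off like 1/(ω-ω₀), which is the inverse dependence on frequency mentioned above. The second term is small when ω+ω₀ is large compared with 1/T.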

What next

I'll stop here for the moment. I'd like to deal with the effect of annual averaging, which fits in nicely, and was one of Willis' topics. I'd also like to deal with the effect of windowing, with the idea of making the side lobes diminish faster. I think it will help, but the current situation isn't bad.

26 comments:

Anonymous, May 29, 2014 at 4:49 PM
Very interesting Nick.

The sign of all this tends to get overlooked because everything is presented in terms of amplitude. The negation of the sinc/sinc at Nyquist could be important.

Someone posted last year on Curry's site about the possibility that hadSST or hadCRUT is showing signs of aliasing. (I can't find it in site search just now.)

What is the _sign_ of the side lobes at 0.4 and 0.6?

They will not be 0 or 180 phase, I know, but have an equal but opposite phase shift about the central value.

But since all this is totally predictable mathematically, isn't there a clear means to test for the presence of aliasing in a processed dataset?

Greg.

William M. Connolley, May 29, 2014 at 6:42 PM
My impression of all this is that WE (and it's a general failure at WUWT, and similar) are doomed because they don't know any of the prior art.

@whut, May 30, 2014 at 12:48 AM
The CSALT model that I developed uses a similar least-squares spectral analysis method of estimating Fourier components.
http://contextearth.com/context_salt_model/

I add an oscillating factor with frequency w as an unknown amplitude A*cos(wt) + B*sin(wt) to a multivariate regression analysis and include the A, B parameters to the model if the R2 error is reduced markedly.

Of course this has the problem of over-fitting unless one is committed to restricting to only those sinusoidal factors that make physical sense.

Wondering Willis became very upset with me when I pointed out how I applied his "discovery". What a nimrod.

btw, the outcome of the model is that CSALT does pick up the TSI oscillations, but they are at the 0.05 C level in amplitude, which is well below the secular trend of 1C due to global warming.

Unknown, May 31, 2014 at 7:27 AM
William Connolley,
Is your critique of Willis so hostile because you disagree with him politically or because what he is doing is incorrect? My guess is the former. Nick, to his credit, at least acknowledges that what Willis is doing is giving accurate results. Your approach toward all 'skeptics' is symptomatic of the poisoned pool of climate science; a pool in which you seem to so very comfortably swim. I do wonder if your goal is to advance understanding or to force arrival at your preferred political outcome. All evidence indicates the policy outcome is all that matters to you; sad.

Unknown, May 31, 2014 at 7:28 AM
Nick,
You need an edit function. Really.

Unknown, May 31, 2014 at 7:46 AM
Nick,
Thanks for the obliteration of the multiple deleted comments. I will try to type much more carefully on your blog.

I was in Turkey for several days until yesterday morning. 18 million people in Istanbul, and all wanting to increase their carbon footprint. Any approach to control fossil fuel use is going to have to address the desires of the billions who are now poor (or semi-poor), but who want to be rich.

Steve

Anonymous, June 1, 2014 at 4:37 AM
Webby, you seem to have some good maths knowledge so I hope you can see the following point.

A change in radiative forcing, whatever its origin, will produce an in-phase response that is a change in dT/dt, not a change in T. Flux is a power term, not energy. That means that changes in F and T are orthogonal.

Once the system has had time to settle, at least nearly, to equilibrium you can look for a final change of T (delta T) that matches the new level of radiative forcing.

Now since F and T are orthogonal the correlation between them is ZERO.

What any regression analysis seeks to do is to find the correlation between the variables. That is not possible if the two quantities are orthogonal. Indeed this is sometimes used in an attempt to separate the two responses. eg Forster & Gregory.

Once surface temperature starts to change ( the integral of the induced dT/dt change ) there will be a radiative feedback. This will be in-phase with the instantaneous T.

So if you get a finite result when regressing T and Rad terms what it reflects is the climate feedback responses NOT the long term temperature increase delta T induced by changes in the forcing.

Do a multivariate regression and you have the same problem in spades.

Now not all your CSALT terms are rad forcing. ENSO is temperature but likely has non zero phase with global averages so you will not get the correct magnitude there either. Anyway, most are rad terms, so the whole exercise is doomed from the start.

Yes, it sort of looks OK because you have lots of variables with about the right mix of frequencies, but that does not mean it's physically meaningful. It's just like regressing a bunch of suitable AR1 series, which would also "explain" most of the temp. record.

You are not alone, there are several published papers based on equally incorrect correlation analyses.

Lagged regression is similarly flawed because it fits neither one nor the other, and as soon as we depart from orthogonality with one or the other, the correlation gets diluted and the derived regression coeff is an under-estimation.

Spencer & Braswell note the problem without finding a solution for the correct result.

I attempt to derive a more correct correlation by anticipating a simple "relaxation to the mean" response here:

http://climategrog.wordpress.com/?attachment_id=884

Perhaps you could adapt that to CSALT

There is also the much overlooked problem of regression dilution due to significant error and non-linear variation in the x-variable. This means most cases of regressing one climate variable against another one will be fundamentally wrong also, again under-estimating the correlation:

http://climategrog.wordpress.com/2014/03/08/on-inappropriate-use-of-ols/

Once you get into multivariate regression you have the problem in spades, again. The regression will be incorrectly fitting one variable and falsely attributing correlations to one or more others to minimise the overall residual.

Any resemblance of the result to the physical reality is purely fortuitous.

===========

Best regards, Greg.

@whut, June 3, 2014 at 6:45 AM
Goodman, Your criticism is only precautionary. Here is a very simple proof. If somehow the sun doubled the radiative forcing from what it currently is, I would EASILY be able to pick up that correlation in CSALT. However, it wouldn't be accurate because the resultant temperature is a nonlinear function of radiative forcing -- S-B law. Now consider that since the actual radiative forcing is incremental, I can always linearize the result as a Taylor series approximation. Add in a simple lag and voila, we have an estimate of the transient effect. We all realize that the fat tails of the equilibrium response are buried in the huge heat sink of the ocean, so we can safely ignore that for the time being.

So the proof is that the limiting case makes sense and the incremental view for transient response makes sense.

The physical view is that this is no different than a variational thermodynamics approach, which is well accepted.

I do realize that you are creating a "just-so" story about how volcanic eruptions have long term effects on the climate but I ain't buying that. The effects of the volcano condense out within a few years. CSALT picks that up amazingly well.

The fact that CSALT is a transient analysis should be pretty clear and it thus shows a lower-bound estimate to how much warming has occurred immediately due to the thermodynamic factors.