tag:blogger.com,1999:blog-7729093380675162051.comments2015-03-04T19:00:58.615+11:00moyhuNick Stokeshttp://www.blogger.com/profile/06377413236983002873noreply@blogger.comBlogger4803125tag:blogger.com,1999:blog-7729093380675162051.post-31635794285701182382015-03-04T19:00:58.615+11:002015-03-04T19:00:58.615+11:00Oh, no, it's my error. You were applying W to ...Oh, no, it's my error. You were applying W to dT/dt not to TS Care needs to be taken with shifting the kernel up or down since this will be adding a constant to each element and hence adding a fixed dT/dt to the whole series. Gregnoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-77048930294274851812015-03-04T18:39:15.710+11:002015-03-04T18:39:15.710+11:00Consider the following: running mean of diff is s...Consider the following: running mean of diff is same as diff of RM. The kernel that achieves that has just two non zero values of opposite sign, one at each end of the kernel. Its the two point diff of the rectangular RM kernel.<br /><br />( This is in fact the same thing as the pairs of points Pekka referred to in the other thread, I'll come back to that ). <br /><br />As you have pointed out the OLS regression slope is the correlation of the data with a kernel of unit slope. If that is centred to be zero mean, the diff of that kernel is a positive rectangular window with a negative point as first and last element. These points are not unimportant and it cannot be shifted up to make it a simple rectangle. <br /><br />Your Welch window applied to TS is the same as ramp applied to dT/dt and so corresponds to the <b>OLS trend in dT/dt </b> ie the acceleration trend that you started out discussing, not the OLS trend of the TS. Again, this cannot be arbitrarily shifted up or down. <br /><br />Your integral maths was good but there is a logical error in how you interpret the terms. <br /><br /><br />Gregnoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-55354340703047127262015-03-04T17:40:54.062+11:002015-03-04T17:40:54.062+11:00For those here who seem to difficulty grasping the...For those here who seem to difficulty grasping the importance of negative lobes, this is not a pedantic detail. Here is the late 20th c. section of CMIP5 ensemble mean compared to its 15y running average.<br />https://climategrog.files.wordpress.com/2015/03/cmip5_180morm.png?w=800<br /><br />The El Chichon peak is neatly inverted and the Mt Agung one disappears !<br /><br />Around El Chichon - Mt Pinatubo period the variation in the mean is almost in anti-phase with the unfiltered model output. <br /><br />Now if one is to mangle the data in that way it may be that there is not "tracable imprint" of TCS in what is left. The conclusions of M&F2015 bases on 15y trends are ill-founded and worthless, Worse they are misleading. Gregnoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-30544336840305499162015-03-04T16:02:26.517+11:002015-03-04T16:02:26.517+11:00Thanks for your comments. Yes, WonkyPedia is a mov...Thanks for your comments. Yes, WonkyPedia is a moving target. I thought there was more on OLS there but it seems to have changed since last time I looked. <br /><br />There are other techniques but they usually require more specific knowledge about the error structure which is typically not available in the case of climate data.<br /><br />My point is it is best to remove the autocorrelation if possible and to a large extent taking derivative of temperature achieves this. One of the good points of M&F is that they are effectively using dT/dt. It's shame they corrupted the data with poor choice of filters to the point where they drew incorrect conclusions.<br /><br /><br />Gregnoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-65842521268021639952015-03-04T15:43:00.398+11:002015-03-04T15:43:00.398+11:00Thanks, What is the corner frequency you used the...Thanks, What is the corner frequency you used there? <br /><br />Eyeball check it looks OK except for first and last 15-20y. But to be more rigorous about this and how well it works theoretically, ie with any data sample, you need to ask questions about convergence not just state it's "never a problem".<br /><br />Specifically, all IIR filters have a spin up, a bit like climate models, they are iterative and need time to settle, or converge, to the correct answer. If you want to specify , say 1% accuracy, this can be very long on a fixed data set like climate where the filter period is a significant proportion of the sample length, eg 15y LP on 150y of data. <br /><br />It is analogous to having a FIR truncate the ends of data, except that the user of a IIR filter needs to determine the convergence limit and crop the data himself.<br />This makes it easy to overlook the issue.<br /><br />IIR filters are typically used in electronics and audio applications where the data is a continuous stream and this spin-up is just a turn on delay and it not a problem. It can not be ignored in TS analysis.<br /><br />There is also the question of frequency dependant phase distortion, because IIR is backwards looking, not symmetrical like most FIR low-pass filters. What the phase effect of back and forth is needs to be considered. In the region where both back and forward runs are converged, it will probably cancel out correctly, for the rest, not so.<br /><br />It also seems that back-and-forth approach is intended to address spin-up too, applying the converged filter to the early incorrectly filtered data on the return run. I'm not sure that this is valid and have not seen it discussed. <br /><br />The early data of the spin-up period is not just unfiltered, it wrong. Corrupted data. Passing it through a fully spun up filter will remove any unfiltered low freq but will not correct the data corruption. Similarly, running the correctly filtered end section through a non converged filter will corrupt what was good.<br /><br />So the result will have similarly distorted sections on each end. This is analogous to the FIR solution not being able to run up to the ends of the data, except that is pernicious.<br /><br />All these techniques like reflection about end points are speculative and assume telepathic knowledge of what the did before the system was measured and what it will do in the future.<br /><br />This is clearly unscientific speculation and IMO should be avoided in any serious work. <br /><br /><br /><br />Gregnoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-65972929658256733342015-03-04T11:45:36.045+11:002015-03-04T11:45:36.045+11:00Greg: since one of the conditions for OLS to be th...Greg: <i>since one of the conditions for OLS to be the statistically 'best estimator' is no auto-regression and temperature is highly autoregressive, that is another reason for not fitting trends to T(t).<br /></i><br /><br />Well, no. That's the reason you have to check for the effects of autocorrelation in the noise on your filter. E.g., "realistic Monte Carlo". I haven't found any cases where it makes much of a difference, so I haven't explored it further. The general theorem is that you still have a linear unbiased estimator, but it's no longer the most efficient, so it's LUE but not BLUE.<br /><br />If ou need to model the autocorrelation, generalized least squares methods exist which can handle the auto-correlation. Nick has some posts up on how to do this. The fastest numerical implementations use a modified Cholesky decomposition. I thought there was a Wiki article that discussed it, but I couldn't find it immediately. <br /><br />Anyway you can find quite a bit to chew on by searching for the terms:<br /><br />generalized least squares cholesky autocorrelation<br /><br />Carrickhttp://www.blogger.com/profile/03476050886656768837noreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-27744101295095163822015-03-04T11:24:41.080+11:002015-03-04T11:24:41.080+11:00Yes, I use an IIR implementation.
I do it "...Yes, I use an IIR implementation. <br /><br />I do it "forward and backwards" so really it's an eighth order acausal zero-phase implementation. I typically use reflection about the end points, and the low-pass, high-pass portions of the filter are set by specifiying the 3-db corner frequencies. Depending on the application, I will subtract the DC offset, then add it back after filtering (this is a flag).<br /><br />What I do with the forward-backwards + end-piont reflection is similar to what's done with MATALB's filtfilt. <br /><br />[However, I'm not as clever in my code about how many points to retain in the reflected region.]<br /><br />Here's an example of the output:<br /><br />https://dl.dropboxusercontent.com/u/4520911/Climate/Temperature/hadcrut4.cvt.smoothed.png<br /><br />The light blue (cyan) line is the 4th order Butterworth smoother output. Because of the way it is implemented and because I don't go beyond fourth order, convergence/stability is never a problem.Carrickhttp://www.blogger.com/profile/03476050886656768837noreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-47406574173346334812015-03-04T08:50:11.521+11:002015-03-04T08:50:11.521+11:00"Isn't this the same as saying that OLS t...<i>"Isn't this the same as saying that OLS trend is identical to mean of dT/dt ?"</i><br />No. As said in the recent post, trend is the Welch-weighted mean of dT/dt.Nick Stokeshttp://www.blogger.com/profile/06377413236983002873noreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-36615083766917060192015-03-04T05:48:54.006+11:002015-03-04T05:48:54.006+11:00"A useful property to remember about the ordi..."A useful property to remember about the ordinary mean is that it is the number which, when subtracted, minimises the sum of squares. There is a corresponding property for OLS trend. "<br /><br />Isn't this the same as saying that OLS trend is identical to mean of dT/dt ?<br />Greg Goodmannoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-12013902983307014632015-03-04T03:48:04.998+11:002015-03-04T03:48:04.998+11:00Pekka: "I'm not convinced that the temper...Pekka: "I'm not convinced that the temperature time series have properties that can be considered better in the frequency domain than in the time domain."<br /><br />since one of the conditions for OLS to be the statistically 'best estimator' is no auto-regression and temperature is highly autoregressive, that is another reason for not fitting trends to T(t).<br /><br />The white noise, if we are to make that assumption is in dT not in T(t)<br /><br /><br /><br />Gregnoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-28373113327906062742015-03-04T03:41:50.745+11:002015-03-04T03:41:50.745+11:00Thanks Carrick, how do you implement your Butterwo...Thanks Carrick, how do you implement your Butterworth ? IIR I would guess. How do you determine where it has converged ? How much of the TS do you loose at each end?Gregnoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-78696387330190633202015-03-04T02:41:41.871+11:002015-03-04T02:41:41.871+11:00Which approach is better (more pleasing) depends p...Which approach is better (more pleasing) depends partly on a persons experience as well as which is more clumsy for a given problem.<br /><br />I'm not sure here exactly which properties are better described with the time domain analysis. It is certainly much easier to read the transfer function (normally expressed as the complex ratio of the velocity spectrum of the output to the velocity spectrum of the input) to figure out what is happening on the shoulder (rolloff region) of the filter. <br /><br />The fact the phase is varying tells us that any wiggles you see are specious. They amount to a summation over varying amplitudes of the signal with varying phase shifts. In other words, just 'junk'.<br /><br />You might find it pleasing to leave these wiggles in your original series, but generally the trained eye is taught to ignore any of the variations that are more than 50% the roll-off frequency of the signal.<br /><br />So if you're interested in variations that are say 1/5 the frequency of the roll-off (say ENSO using a 13-month running-average filter), no problem.<br /><br />But it doesn't really inform you about the high-frequency variability of the signal other than in some very primitive manner.<br /><br />If I wanted to exhibit this in a time series in a more clean fashion, I'd use varying filter widths (e.g., 5-year, 2-year and 1-year) for example, but using a filter with a much higher roll-off (I use 4th order Butterworth typically).<br /><br />The running average shouldn't be eschewed in my view, but it is still a very poor filter.Carrickhttp://www.blogger.com/profile/03476050886656768837noreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-24482650211509534442015-03-04T02:32:33.428+11:002015-03-04T02:32:33.428+11:00Sorry… cross-eyed this morning when reading from m...Sorry… cross-eyed this morning when reading from my table of values: The factor 1.5 is for Hamming & Parzen (triangular). To more decimal places (just for completeness) you would use 1.33333, 1.66666 and 1.91667 ≈ 2 for Welch, Hann and Blackman.Carrickhttp://www.blogger.com/profile/03476050886656768837noreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-74799739678052563612015-03-04T02:21:46.401+11:002015-03-04T02:21:46.401+11:00Greg: Carrick, I'm not saying OLS regression o...Greg: <i>Carrick, I'm not saying OLS regression of a linear model should never be used in physics, it is a fundamental and valuable technique, though you don't see physicists calling it a "trend" , that has temporal connotations and is probably inherited from finance / econometrics. <br /></i><br /><br />I hope not. That's just crazy talk. Also, we don't use the word "trend" because we don't deal with temperature data. We could use the word "temperature velocity" here (first derivative of temperature amplitude with time)., but that's clumsy language, even if more precise. Generally people understand that "temperature trend" means the same as "temperature velocity".<br /><br /><i>There are very few situations in climatology where the necessary condition ( mathematical meaning ) that there be no systematic variation other than the supposed linear relationship, is met.</i><br /><br />If what you are interested in is a smoothed version of the first derivative of a measured quantity, using a sliding-window OLS filter is a perfectly reasonable method for arriving at this quantity. <br /><br />This filter can be improved by using tapered windows, but as Nick's analysis showed here, even the rectangular windowed OLS filter gives you a response which is formally equivalent to Welch-windowed running average of the derivative of the original function.<br /><br /><i>As for M&F, there is a significant component of the signal in the latter part of 20th.c. which will be inverted by a 15y RM or sliding-trend. </i><br /><br />I looked at this by comparing different window types and I didn't see this. I think part of the reason is the higher order fall off of the derivative filter (compared to a smoothed amplitude filter) and part is due to the 1/f-to-some-power spectrum exhibited by temperature fluctuations.<br /><br /> As I mentioned on another thread, if you want to compare the OLS filter using different choices for the taper, you do need to adjust the filter width so you get the same equivalent bandwidth for the filter. <br /><br />That is not hard to do with standard filters such as Welch, Hann and Blackman:<br /><br /> There are straightforward corrections that one makes here, which are you need to increase the filter length by by 1.3, 1.5 and 1.9 when you go from rectangular to Welch, Hann and Blackman in that order. (These are based on using a window centered well away from the DC bin, so probably these should be tweaked for a pure low-pass filter.)Carrickhttp://www.blogger.com/profile/03476050886656768837noreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-30187401075028287502015-03-04T01:39:45.754+11:002015-03-04T01:39:45.754+11:00The daily Feb 2015 NCEP anomalies moderated at the...The daily Feb 2015 NCEP anomalies moderated at the end of the month to levels that were high in 1998, so this will be interesting. JCHnoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-23451241042011937012015-03-03T23:40:06.038+11:002015-03-03T23:40:06.038+11:00Nick,
I'm not convinced that the temperature t...Nick,<br />I'm not convinced that the temperature time series have properties that can be considered better in the frequency domain than in the time domain. That's possible, but my preference is to make the method maximally transparent in the time domain even if that leads to somewhat worse properties in the frequency domain. <br /><br />We may also ask several different questions that are best served by different approaches to the analysis.<br /><br />Very simple and robust methods tend to be best, when very little can be assumed about the underlying signal. Frequency domain is useful, when assumptions are better justified for the spectral properties than for effects seen directly in the time domain.<br />Pekka Pirilähttp://www.blogger.com/profile/04747229036782463233noreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-82484487171849632442015-03-03T22:03:33.349+11:002015-03-03T22:03:33.349+11:00Pekka,
The paired differences that you describe ca...Pekka,<br />The paired differences that you describe can each be expressed as a cumulative sum of simple differences. So their weighted sums can be expressed as weighted sums of such differences. You're really asking that that sum should be orthogonal to cubic, quintic etc. There are plenty of dof to meet that and still have good smoothing.<br /><br />But I don't think it solves the main problem. In effect, in the frequency domain (FD) it makes the early part more exactly linear. But the main deviation from linearity comes when attenuation for HF starts. At that point, the response is neither a proper differentiator nor small. That is why there is an evident band pass character in the earlier time series. A more exactly linear rise would not help there.<br />Nick Stokeshttp://www.blogger.com/profile/06377413236983002873noreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-4545244902648244572015-03-03T20:24:51.207+11:002015-03-03T20:24:51.207+11:00Greg,
Differentiation doesn't involve any noti...Greg,<br />Differentiation doesn't involve any notion of error distribution. The error is secular, due to higher derivatives etc. Here, if anything, the model is differentiable function plus noise. You aim to differentiate the underlying and damp the noise.<br />Nick Stokeshttp://www.blogger.com/profile/06377413236983002873noreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-13351007285729699312015-03-03T20:08:54.176+11:002015-03-03T20:08:54.176+11:00To continue along the way I approach the issue.
A...To continue along the way I approach the issue.<br /><br />Assuming that the underlying function can be represented by a Taylor expansion around the point considered, we wish to determine the coefficient of the first order term. All even terms cancel out in every pairwise difference. Thus the higher odd terms are the problem. The third order term distorts the estimate the more the further we get from the point considered and higher odd order terms do the same even more strongly.<br /><br />In choosing the optimal weights of each pair we should take into account the relative strength of the noise and ow large we think the total influence of the higher odd order terms is - or more generally high large we expect the asymmetric deviation from the linear trend to be. Points where the expected deviations are much larger than noise should be excluded. Each pairwise estimate should have a weight that reflects the combined effect of noise and expected asymmetric deviation from linear trend (and takes into account also the denominator of the trend calculation).<br />Pekka Pirilähttp://www.blogger.com/profile/04747229036782463233noreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-90893396481787938512015-03-03T19:49:18.347+11:002015-03-03T19:49:18.347+11:00What M&F did is essentially estimating the cha...What M&F did is essentially estimating the changes over the periods concerned. 15 year OLS trend is not very different from what could be determined by calculating the averages of the first 5 and last 5 years and dividing the difference by the time from the first period to the second period. For 62 year trend we could pick similarly 10-15 years from both ends. These values are perfectly meaningful and easy to understand. OLS trends are close to the same but have somewhat smaller statistical uncertainty as they use the information more optimally (at least assuming white noise).<br /><br />All estimates of derivative at a point discussed in the above can be described as follows:<br />- pick individual pairs of values symmetrically from both sides of the point considered and calculate the derivative from those values<br />- form a weighted average of these estimates.<br /><br />For a given footprint OLS has the smallest statistical uncertainty of such estimates. Some other weights give smoother behavior and are less affected by the size of the footprint. If the underlying process is expected to satisfy some specific smoothness properties (or have some spectral properties) then some other choice than OLS is probably better. All other choices suffer from the larger footprint that prevents getting properly close to the end point of the full time series.<br />Pekka Pirilähttp://www.blogger.com/profile/04747229036782463233noreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-14298325571680517342015-03-03T19:40:54.351+11:002015-03-03T19:40:54.351+11:00It gives you an number. Now you need to find out w...It gives you an number. Now you need to find out whether it is a reliable estimation of the slope. <br /><br />http://en.wikipedia.org/wiki/Ordinary_least_squares#Assumptions<br />Non-autocorrelation: the errors are uncorrelated between observations: E[ εiεj | X ] = 0 for i ≠ j. <br /><br />In my "cosine" example the errors will be strongly autocorrelated. Similarly in the presence of any significant systematic variation other than the supposed linear relationship being sought. <br /><br />Yet again, the fundamental assumption is that deviations from the linear model are normally distributed. This can be allowed some flexibility but a whacking great cosine is going well beyond "flexible". <br /><br />Neither of the trends in my graph are legit and the difference is spurious. The cosine is not going anywhere , there is no "accelerated warming".<br /><br />Gregnoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-33888472387795877472015-03-03T16:55:19.549+11:002015-03-03T16:55:19.549+11:00Greg, from your attachment
"Fitting a linear ...Greg, from your attachment<br /><i>"Fitting a linear “trend” model to data from such a system, that is anything but linear in its behaviour, is invalid."</i><br />I don't think so. It gives you an estimate of the derivative at the centre of the range. Nothing wrong with that. It's only a problem if you think the derivative should be the same everywhere.<br />Nick Stokeshttp://www.blogger.com/profile/06377413236983002873noreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-66562378194757226892015-03-03T16:50:50.370+11:002015-03-03T16:50:50.370+11:00"Huh? That's a strange criterion for filt...<i>"Huh? That's a strange criterion for filter design. Surely it's the frequency response that is primary metric, not the implementation practicalities."</i><br /><br />Practicalities matter. Most climate series are relatively short. After smoothing they are one footprint shorter. Plus, as Pekka says, if you want to estimate a derivative, you want it to be as local as possible.<br />Nick Stokeshttp://www.blogger.com/profile/06377413236983002873noreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-87091241648974778932015-03-03T15:56:24.741+11:002015-03-03T15:56:24.741+11:00http://climategrog.wordpress.com/?attachment_id=20...http://climategrog.wordpress.com/?attachment_id=209<br /><br />Here is an illustration of why fitting trends in the presence of systematic variability is bunk ( as well as being bad maths ). Gregnoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-15312203620137759932015-03-03T15:49:16.726+11:002015-03-03T15:49:16.726+11:00Nick, I've been saying for years that all this...Nick, I've been saying for years that all this should be done in dT/dt , no problem with that, though they do not present any reason why they are smoothing dT instead of T(t). I think this is just typical trendology. <br /><br />If the requirement is for general purpose LP filtering of dT/dt, derivative of gaussian seems the most suitable choice to me. Though one of your higher order Welch's would probably be equally good. <br /><br />Gregnoreply@blogger.com