Comments on moyhu: Flutter in GHCN V3 adjusted temperatures.

One could. I will not do it because I have seen no...

2017-02-21T01:57:39.470+11:00

One could. I will not do it because I have seen nothing that would convince me that this is in any way a problem. But if someone has some precious lifetime to waste: be my guest.

It could be that the "flutter" is smaller for such benchmarks because they are made for Europe and the USA, where the signal to noise ratio is larger than for the middle of Australia or Africa.

The results on a benchmark will on average be the same whether you have one year more or one year less data, but individual stations may well sometimes be different. There is nothing special about the current length. (Large changes in the length and network configuration naturally do start to matter; as I wrote above, just 10 years of data is not well suited for homogenization.)

Victor, Do you think it would be a good test to c...

2017-02-20T22:01:22.222+11:00

Victor,

Do you think it would be a good test to check whether the same flutter properties are exhibited when homogenising synthetic benchmarking data? That would presumably be a good cross-check of the validity of the benchmark test setup.

That was a long list of questions and had to wait ...

2017-02-20T05:47:23.491+11:00

That was a long list of questions and had to wait for a quiet moment in the weekend.

You will in most cases not be able to detect all breaks, but station temperature data is expected to have one break every 15 to 20 years.

Just going to monthly data does not have benefits over annual data. Monthly data is also more noisy and you have all the problems I mentioned above. However, there are inhomogeneities that only have a small effect on the annual mean, while they have a clear effect on the annual cycle. You could improve detection of small inhomogeneities by including breaks in the seasonal cycle; people working manually typically do so.

You are right that errors accumulate over time and are largest in the early period. This is not only because the errors of the corrections accumulate, but also because the network was much less dense then, so the nearby stations are less nearby, which makes the difference time series more noisy. This makes detection harder and corrections more uncertain.
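As a rough illustration of that accumulation, here is a minimal sketch. It assumes, purely for illustration, that every break correction carries an independent error with the same standard deviation; the function name and the 0.1 degC figure are hypothetical and not taken from any homogenization package.

```python
import math


def cumulative_adjustment_sd(n_breaks, sd_per_break):
    """Standard deviation of the summed corrections applied to a segment
    that lies n_breaks breaks before the present.

    Assumption: the errors of the individual corrections are independent
    and have the same standard deviation, so their variances add.
    """
    return sd_per_break * math.sqrt(n_breaks)


# Hypothetical example: 0.1 degC uncertainty per correction.
for n in (1, 3, 6):
    print(n, "breaks ->", round(cumulative_adjustment_sd(n, 0.1), 3), "degC")
```

Under these assumptions the uncertainty grows with the square root of the number of breaks, so the earliest segments are the least certain. The larger noise in a sparse early network would add to this and is not modelled here.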

Every developer of a homogenisation method has naturally checked how well it works. I was the first author of a large and blind study comparing many homogenisation methods and for temperature it improves the trend estimates. NOAA's pairwise homogenisation method also participated and was one of the recommended methods. People have compared the US data before and after homogenisation with the US Climate Reference Network. After homogenisation it fits better.

NOAA made a similar blind test as mine for the US and could show it improves the trend estimates (but some of the bias remains). The method of Berkeley Earth was also tested on that same dataset and performed similarly well for the US. The International Surface Temperature Initiative is now working on making a global validation dataset.

Victor and Steve: Thanks for taking the time to r...

2017-02-14T19:50:53.771+11:00

Victor and Steve: Thanks for taking the time to reply. Victor: If the NOAA PHA only looks at annual averages, does that limit your statistical power to identify a breakpoint? I vaguely remember that some algorithms were finding as many as one breakpoint every one or two decades. In that case, you won't have very many data points defining a breakpoint surrounded by two stable relationships in a pair of station records. Getting the overall trend correct depends on getting the correction at each breakpoint right. For the 20th century, you could have a half-dozen or more breakpoints. If each adjustment came with a confidence interval, then the uncertainty in the overall change (and trend) is going to be really high.

Whenever I've looked at BEST aligning split records with the regional expectation, it seems to take only two or three breakpoints for the trend of the aligned record to appear to perfectly match the trend of the regional expectation. Or at least it looks that way in the final product, which I think is smoothed over 13 months. I recognize that the regional expectation is derived from kriging unadjusted individual records, not by averaging the aligned records. Nevertheless, it is distressing to see how easily the segments from a flawed record can be aligned to agree with a particular trend. And if the record one is aligning against is biased ... I'm not saying I believe this is what happens, but it is in the back of my mind.

Has anyone looked to see if the overall trend of stations varies with the number of corrected breakpoints in the record?

Thanks, Frank

Yes A while back I was looking at what our algori...

2017-02-14T06:45:08.750+11:00

Yes. A while back I was looking at what our algorithm did to CRN stations (a gold standard). In 5% or so of the cases we were adjusting them. It had to do with our recalculation of seasonal cycles for stations.

We haven't finished looking at it, priorities and all that.

V4 will not adjust arctic stations. In the pa...

2017-02-14T06:42:08.553+11:00

V4 will not adjust Arctic stations.

In the past (in Iceland, for example) it was found that certain stations had abrupt discontinuities that were related to the retreat of ice cover. (Ask Zeke, he went to Iceland to talk to them about one case.) Anyway, the algorithm saw a break and "fixed" it, but the change was actually real, with a real physical basis.

It is quite common for a station to have a differe...

2017-02-14T06:02:24.041+11:00

It is quite common for a station to have a different seasonal cycle from its neighbors. Not only in the mean, but also in how strong the correlations with other stations are, which produces a seasonal cycle in the noise of the difference time series. To remove these effects is difficult because they can also change at a break point.

NOAA's pairwise homogenization method only looks at the annual average temperature. National datasets, especially manually homogenized datasets, often also look at the size of the seasonal cycle, or at the series of the summer mean or the series of the winter mean. This avoids problems with the annual cycle and with the higher correlations in time of monthly differences.
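To make the point about the annual cycle concrete, here is a small sketch on synthetic data. The difference series, its 0.5 degC seasonal amplitude, and the 0.3 degC noise level are all invented for illustration; nothing here reproduces NOAA's pairwise algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
n_years = 30
months = np.arange(n_years * 12)

# Hypothetical difference series (candidate minus neighbour), in degC:
# a seasonal cycle plus white noise, with no break at all.
seasonal = 0.5 * np.sin(2 * np.pi * months / 12)
noise = rng.normal(0.0, 0.3, size=months.size)
monthly_diff = seasonal + noise

# Annual means: each calendar year averages over one full seasonal cycle,
# so the cycle cancels and only averaged noise remains.
annual_diff = monthly_diff.reshape(n_years, 12).mean(axis=1)

monthly_sd = monthly_diff.std()
annual_sd = annual_diff.std()
print("sd of monthly differences:", round(float(monthly_sd), 3))
print("sd of annual differences: ", round(float(annual_sd), 3))
```

In this toy setup the annual series is much quieter because the cycle averages out, at the price of having twelve times fewer points around any break.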

I had expected BEST to do the same as NOAA; their paper says they follow NOAA, but it is not clear to me whether they use monthly or annual data. They went out of their way not to hire anyone with relevant expertise to appease the mitigation skeptics. So maybe they used a sub-optimal method using monthly data. Will page Mosher on Twitter to ask.

Nick and Victor: When I look at BEST's plots ...

2017-02-13T20:46:39.336+11:00

Nick and Victor: When I look at BEST's plots of the difference between station data and the "regional expectation", there often seems to be a strong seasonal signal. Due to local environment, during summer a station may be warmer than average for the region and the opposite in winter. When a breakpoint detection algorithm is on the verge of reporting a shift to warmer readings, that shift is most likely to be detected in the summer. The following winter, there may be less confidence that a breakpoint has been detected. FWIW, this seems to be one mechanism that could cause "flutter" in the homogenized output from some stations.

Frank
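The mechanism described in this comment can be sketched with a deliberately crude toy detector. It is not the NOAA PHA or the BEST method: it assumes a noise-free difference series with a known break date, a pre-break mean of zero, and a fixed threshold, and all numbers (0.3 degC step, 1.0 degC seasonal amplitude, 0.35 degC threshold) are hypothetical.

```python
import math


def shift_estimate(n_months_after, step=0.3, amplitude=1.0):
    """Mean of the post-break difference series after n_months_after months.

    The series is a constant step plus a seasonal cycle that starts in the
    warm half of the year. The pre-break mean is assumed to be zero.
    """
    values = [
        step + amplitude * math.sin(2 * math.pi * m / 12)
        for m in range(n_months_after)
    ]
    return sum(values) / len(values)


def break_detected(n_months_after, threshold=0.35):
    """Yes/no decision: report a break if the estimated shift exceeds the threshold."""
    return shift_estimate(n_months_after) > threshold


# Re-run the decision as more months of data arrive.
decisions = {n: break_detected(n) for n in (6, 12, 18, 24)}
print(decisions)
```

In this toy the decision flips each time another half year of data is added, because the unfinished seasonal cycle pushes the estimated shift above or below the threshold. Whether the real algorithms behave this way is exactly the open question raised above.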

thanks much for your excellent response. i would n...

2017-02-12T09:36:13.842+11:00

Thanks much for your excellent response. I would never have thought there'd be a wiki entry for it, but there it is. There's something a bit bizarre about casting aspersions on an influence which is known and part of the data but maybe peripheral to the process being studied.

Nuisance parameter as defined in wikipedia https:/...

2017-02-12T01:22:30.199+11:00

Nuisance parameter as defined in Wikipedia
https://en.wikipedia.org/wiki/Nuisance_parameter
"any parameter which intrudes on the analysis of another may be considered a nuisance parameter."

Examples of nuisance parameters:
1. Periodic tide effects when trying to measure sea-level height increases
2. Daily and seasonal temperature excursions when trying to measure trends

ENSO is a nuisance parameter because it gets in the way of measuring global temperature trends. They compensate for the two examples above but not for ENSO, presumably because it is not as easy to filter and they don't know how much to compensate for it. I say just do the compensation anyway.

Olof, Yes. I looked at V4 unadjusted here (Google ...

2017-02-11T19:18:32.181+11:00

Olof,
Yes. I looked at V4 unadjusted here (Google map here). But I haven't really looked at the adjusted version. I'll start saving some files.

nuisance variable?

2017-02-11T15:13:34.848+11:00

nuisance variable?

Nick, I understand your analogy but think it stil...

2017-02-11T13:41:38.244+11:00

Nick, I understand your analogy but think it still doesn't justify the instability shown by Paul Matthews. You want to "model" turbulence for a stable calculation. So you smooth and time average it.

The adjustment methods seem like a sophisticated form of interpolation and averaging. It should be a smoothing operator, not one having high sensitivity to small additions of later data. I still think that's a reason for NOAA to really do a thorough audit of their method. The "turbulence" here is not in the modeled data but is introduced by the unstable adjustment algorithm.

The problem is that David Young's wind tunnels...

2017-02-11T10:53:00.607+11:00

The problem is that David Young's wind tunnels don't operate underwater.

Bob Koss tried to post a comment, but ran into tro...

2017-02-11T10:50:39.201+11:00

Bob Koss tried to post a comment, but ran into trouble. I have posted it as an appendix to the main post above, to preserve the format.

Victor, thanks for the additional info. I vaguely...

2017-02-11T10:48:38.113+11:00

Victor, thanks for the additional info. I vaguely remember seeing something about that comparison last time I visited the USCRN web site over a year ago. I've been meaning to go back to update data I downloaded for Texas area stations. I went to the link you provided, but it appears to be paywalled. However, I searched the title and found a publicly available PDF: here (in case anyone else is interested).

David, "As I said above, an unstable CFD code...

2017-02-11T10:39:34.809+11:00

David,
"As I said above, an unstable CFD code is perfectly useless"
Yes, but I don't believe this is an unstable code. It is an algorithm that generates a somewhat chaotic pattern. That is why the analogy with turbulence. There is a fine scale on which you see chaos, but on the scale you are interested in (spatial mean, of flow or temp) that washes out, and the result does not reflect the local instability.

Nice work Nick, I think you should redo this exerc...

2017-02-11T10:10:08.618+11:00

Nice work Nick,
I think you should redo this exercise with GHCNv4.
I'll bet that the relative frequency and magnitude of the flutter will be much smaller in v4.

V4 does a much more sensible adjustment in Alice Springs (if we can accept that it discards all data before 1941; I don't know why, but I believe that the station moved from the town to the airport then).
https://www1.ncdc.noaa.gov/pub/data/ghcn/v4/beta/products/StationPlots/AS/ASN00015590/

I have seen that GHCN v3 can do strange things with remote lonely stations, for instance those in the high Arctic. I believe that GHCN v4 will be a general remedy for these kinds of problems. If the lonely stations are supported by new neighbour stations, it will be easier for the PHA to "decide" if the temperature changes are real or not.

Victor, Its the same issue we studied a couple of...

2017-02-11T08:57:32.772+11:00

Victor, it's the same issue we studied a couple of years ago in AIAA Journal. We found that extremely small details caused dramatically different answers in our CFD codes for one problem. We were able to document that the problem itself was singular and that the codes were OK, but only with very careful analysis and actually seriously looking for negative results.

You need to look at Paul Matthews' information and then try to duplicate the anomalous behavior. Then one would want to change the algorithm to stabilize it.

Nick, I think the turbulence issue is different i...

2017-02-11T08:54:28.678+11:00

Nick, I think the turbulence issue is different in character than data adjustment algorithms. In steady state RANS you model the turbulence to make it a steady state BVP and in that context, you want stable numerical methods. So for example if I changed the grid a little, I want the answer to only change a little. It's muddier in time accurate simulations.

As I said above, an unstable CFD code is perfectly useless and people would jump to find and fix the problem by finding some way to "stabilize" the algorithm and/or understand if the problem is singular, etc.

Nick Stokes: "I think overall it would probab...

2017-02-11T08:50:13.525+11:00

Nick Stokes: "I think overall it would probably be better if NOAA didn't publish adjusted values at all, but that this was left as an intermediate stage in integration, which is where it belongs."

Agree on the one hand, homogenized data is not homogeneous station data. Homogenized data gives an improved estimate of the regional climate. The short-term variability is still the one of the station.

What I like about homogenized data is that it improves the transparency of the climate data processing. You can clearly see what this step in the processing does.

In addition people can quickly make an analysis of the specific question they are interested in without having to do the homogenization themselves every time. Weather services cannot pre-compute all numbers and graphs people may need.

David Young, I would not know what to study. What ...

2017-02-11T08:41:21.289+11:00

David Young, I would not know what to study. What would be your hypothesis? "Does a yes/no process lead to yes/no results?" Not sure if the answer to that is publishable. :-|

There are naturally many studies on the noise level and how that determines the probability of correctly finding a break and the false alarm rate. Or on how the signal to noise ratio determines how accurate the position of the break is. Or on how much homogenization improves the trend estimates, if I may plug my blind benchmarking study:
http://variable-variability.blogspot.com/2012/01/new-article-benchmarking-homogenization.html
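For readers unfamiliar with that kind of study, the relation between signal to noise ratio, detection probability, and false alarm rate can be sketched with a small Monte Carlo experiment. This is a simplified stand-in, not any published homogenization method: it assumes a known break position, white noise with known unit variance, and a plain two-sided z-test at the 5% level. The function name and defaults are hypothetical.

```python
import numpy as np


def detection_rate(snr, n_per_segment=10, n_trials=2000, seed=0):
    """Fraction of simulated difference series in which a step is flagged.

    snr is the step size divided by the noise standard deviation (set to 1).
    snr = 0 means there is no break, so the result is the false alarm rate.
    """
    rng = np.random.default_rng(seed)
    before = rng.normal(0.0, 1.0, size=(n_trials, n_per_segment))
    after = rng.normal(snr, 1.0, size=(n_trials, n_per_segment))
    # z-statistic for the difference of two segment means with known variance.
    z = (after.mean(axis=1) - before.mean(axis=1)) / np.sqrt(2.0 / n_per_segment)
    return float(np.mean(np.abs(z) > 1.96))


for snr in (0.0, 0.5, 1.0, 2.0):
    print("SNR", snr, "-> flagged fraction", detection_rate(snr))
```

With ten annual values on either side of the break, small steps are flagged only some of the time in this sketch, which is the yes/no behaviour near the detection limit discussed in this thread.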

David, I would see the instability as an analogue ...

2017-02-11T08:12:07.724+11:00

David,
I would see the instability as an analogue of turbulence. It is a confusing factor if you really want to find high resolution velocities. But you can still perfectly well work out the mean flow, and that determines what you often really want to know in the wind tunnel.

Good grief. It's getting hot DY. Have you not...

2017-02-11T08:07:32.327+11:00

Good grief.

It's getting hot DY. Have you noticed?

That last comment was directed to Victor. Has thi...

2017-02-11T07:32:13.730+11:00

That last comment was directed to Victor. Has this instability issue been examined in the literature? I really want to know.