Those, dear friends, are the clumsy fingerprints of someone messing with the data Egyptian style ... they are indisputable evidence that the "homogenized" data has been changed to fit someone’s preconceptions about whether the earth is warming.

This created quite a stir, and drew a response from no less than the Economist, among others. It also drew a response from me - in fact, it was the stimulus to start this blog. I showed (following Giorgio Gilestro) that the V2 adjustments could be quite large, but were fairly balanced overall in trend effect. Darwin was an outlier, and I showed one case in particular which went in the opposite direction, to a greater extent.
So now an adjusted version of V3 is out, and there was a similar discovery at WUWT. This time it was Iceland, particularly Reykjavik, and I wrote about that particular case here. There were similar thunderings about rewriting of history (and calls for legal sanctions etc). What these protesters are reluctant to acknowledge is that GHCN has always produced two files - one unadjusted, one adjusted. The unadjusted file is not altered, and has at least until recently been the prime reference source. The adjusted file is derived from it using what seems at the time to be the best available algorithm. That algorithm changes over time.
What is also not understood is that homogenization is not done to correct the record. It is done to reduce bias prior to calculation of an index. Many irregularities occur in a large temperature record - instruments change, stations move. Records are patchy. Some of these artefacts will cancel, but there is a possibility of bias. So an algorithm is used to try to identify and correct them. In the process, there will inevitably be false positives. The process is valuable as long as the introduced errors have smaller bias than the errors removed.
In this post, I want to update (for V3) the statistical analysis of the effect on trend. I'll also produce a "cherry picker's guide" so that the stations which have their trends changed markedly up or down can be readily identified. I'll do this using the Google Maps application that I developed earlier for GHCN stations.
Distribution of trend changes

For GHCN V2, we showed a histogram of trend changes that was fairly symmetric with a slight upward bias - the mean trend increase from adjustment was 0.0175°C/decade. So I'll start with the corresponding plots for V3. This time I've selected three subgroups - stations that have histories with actual reports for at least:
- 360 months (30 years)
- 540 months
- 720 months
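[Figures: histograms of trend change for the three subgroups. The captions, in the order shown, read: Mean 0.0271 °C/Decade, Mean 0.0227 °C/Decade, Mean 0.0196 °C/Decade.]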
The distribution is not quite as balanced as in V2, but the means are still small relative to the warming trend. The spread is larger, so that the Iceland adjustments were not too extreme, though they came just within the top decile.
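For anyone who wants to try this kind of calculation, here is a minimal Python sketch of the trend change for a single station and its mean over a group. It is not the code behind the histograms. The function names, the list-of-monthly-values layout with `None` for a missing month, and the assumption that the inputs have already had the seasonal cycle removed are all mine.

```python
def trend_per_decade(values):
    """Least-squares slope in °C/decade of a monthly series (None = missing)."""
    # Time axis in decades: 120 months per decade.
    pts = [(i / 120.0, v) for i, v in enumerate(values) if v is not None]
    if len(pts) < 2:
        return None
    mx = sum(x for x, _ in pts) / len(pts)
    my = sum(y for _, y in pts) / len(pts)
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    return sxy / sxx


def trend_change(unadj, adj, min_months=360):
    """Adjusted minus unadjusted trend, or None if the record is too short."""
    # Count months with an actual report in the unadjusted record.
    reported = sum(v is not None for v in unadj)
    if reported < min_months:
        return None
    t_unadj, t_adj = trend_per_decade(unadj), trend_per_decade(adj)
    if t_unadj is None or t_adj is None:
        return None
    return t_adj - t_unadj


def mean_trend_change(stations, min_months=360):
    """Mean trend change over (unadj, adj) pairs that pass the length test."""
    changes = [trend_change(u, a, min_months) for u, a in stations]
    changes = [c for c in changes if c is not None]
    return sum(changes) / len(changes) if changes else None
```

Calling `mean_trend_change` with `min_months` set to 360, 540 and 720 corresponds to the three subgroups. The means quoted above come from the real GHCN files, not from this sketch.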
The Google Maps app

I've included all the previous facilities, which are described in detail here. The Google Maps facilities are available, and you can select and color various groups of stations. The pattern is that you make choices, press the radio button to prime them, and then press one of the top color buttons - your chosen subgroup will be recolored according to the button you press. They combine with OR logic, so it is better at first to make sure only one radio button is pressed at a time (they toggle). I've marked with gold fringes the buttons relevant to this topic. You can choose a range of large positive or negative trends and color them. You'll probably find it clearer if you first set everything to invisible. Anyway, I'll describe some detailed cases later. The trends are for stations with at least 60 years of data; others will show -999 (for NA). Here's the app:
An extreme case of trend raised

To find some extreme cases (mainly to check my code) I did the following sequence:
- I selected All and clicked Invisible, to start with a blank plot.
- I unset All, and set the Trend Adjustment to 3°C/Century, and the inequality to >. Set the Trend radio button, then clicked Yellow.
- Then I changed the trend to -3°C/Century, and the inequality to <, then clicked Pink.
- I clicked on the stations to show the balloons with data. The US has 4 out of about 9 of these extremes, so I'll show it. The US generally has bigger adjustments, possibly due to USHCN. Corona, NM, was the whopper, at 8°C/Century (a world record, I think):
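The selection steps above amount to a simple threshold filter on the trend adjustment. Here is a rough Python sketch of that logic, not the app's actual code. The function name and the dictionary layout are my own assumptions, and only the Corona and St George figures are taken from this post.

```python
NA = -999  # the app's marker for stations without a trend


def select_extremes(trend_adj, threshold=3.0):
    """Return (raised, lowered) station names for a trend adjustment map.

    trend_adj maps station name -> trend adjustment in °C/century.
    """
    # Drop NA first, otherwise -999 would pass the "< -threshold" test.
    valid = {name: t for name, t in trend_adj.items() if t != NA}
    raised = sorted(n for n, t in valid.items() if t > threshold)
    lowered = sorted(n for n, t in valid.items() if t < -threshold)
    return raised, lowered


# Corona and St George use the values quoted in this post;
# the other two entries are made up for illustration.
example = {
    "Corona, NM": 8.0,
    "St George": -3.2,
    "Hypothetical A": 0.4,
    "Hypothetical B": NA,
}
raised, lowered = select_extremes(example)
print("yellow:", raised)
print("pink:", lowered)
```

The one thing to watch is the -999 marker: if it is not filtered out, every station without a trend would be colored as a large negative adjustment.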
So I plotted the adjustment differences (adj-unadj):
and indeed the adjustments, reaching 5°C, create a strong positive trend. Why?
Well, here are the plots before and after adjustment:
These are annual means after subtracting unadjusted monthly means (to minimise the effect of missing values). There does indeed seem to be a sudden drop of about 3°C at about 1953, which may well be due to a station change. The further drop at about 1925 seems a bit more doubtful, and the big changes in the 1970s have an unclear effect, but it's not surprising that the algorithm reacted. In any case, the adjusted version is indeed calmer, and has lost the rather extreme downward trend.
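The idea of subtracting unadjusted monthly means before taking annual averages can be sketched as follows. This is a minimal Python illustration, not the code that made the plots. The function names and the `{year: [12 monthly values]}` layout, with `None` for a missing month, are assumptions of mine.

```python
def monthly_means(unadj):
    """Mean of each calendar month over all years of the unadjusted record."""
    means = []
    for m in range(12):
        vals = [row[m] for row in unadj.values() if row[m] is not None]
        means.append(sum(vals) / len(vals) if vals else None)
    return means


def annual_anomalies(series, base):
    """Annual mean of (value - base month mean), skipping missing months."""
    out = {}
    for year, row in sorted(series.items()):
        anoms = [v - b for v, b in zip(row, base)
                 if v is not None and b is not None]
        out[year] = sum(anoms) / len(anoms) if anoms else None
    return out


def annual_difference(unadj, adj):
    """Annual adj - unadj, using the same unadjusted base for both series."""
    base = monthly_means(unadj)
    a_unadj = annual_anomalies(unadj, base)
    a_adj = annual_anomalies(adj, base)
    return {
        year: a_adj[year] - a_unadj[year]
        for year in a_unadj
        if a_adj.get(year) is not None and a_unadj[year] is not None
    }
```

Because the seasonal cycle is removed first, a year that is missing, say, its winter months does not come out spuriously warm. Using the unadjusted means as the base for both series keeps the two curves directly comparable.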
An extreme case of trend lowered

Though not by as much. The one pink station in the picture is St George, and the adjustment brought the trend down by 3.2°C/Century.
Here are the adjustments, reaching 4°C to make the trend more negative.
And here again are the plots before and after adjustment:
Here there was an uptrend before adjustment, which the adjustment turned into a slight downtrend. The unadjusted record is volatile, and although by eye I can't see obvious step changes, there is plenty for the algorithm to react to. Oddly, though, the homogenized result still seems to have substantial irregularities.
Population matters

These were extremes. I thought it was worth trying to see how such outliers could arise. But remember, the purpose is to remove bias prior to averaging. So the proper tests are statistical, not by studying extremes. The histogram above is a start.
The reason for the change in V3 is the availability of a new pairwise comparison algorithm from Menne and Williams. A lot of work has gone into homogenization, back to the early days of GHCN V1 (mid '90s). I'm not on top of it, and I think determined critics of the process have a lot of reading to do.
References

Added: I should give some more references here. There is a NOAA intro to GHCN V3. That has a download link. The GHCN data is updated several times a month as new data comes in. Each file is named according to date and other metadata.
The Intro links to their homogeneity adjustment page here.
As of Dec 15th, GISS is using the GHCN adjusted data. They seem to be currently adding their own adjustment (they used this previously with unadjusted data). I presume this will be rationalized.
The Menne & Williams paper on pairwise matching is linked above. The following paper would be of interest, but is paywalled:
Lawrimore, J. H., M. J. Menne, B. E. Gleason, C. N. Williams, D. B. Wuertz, R. S. Vose, and J. Rennie (2011), An overview of the Global Historical Climatology Network monthly mean temperature data set, version 3, J. Geophys. Res., 116, D19121.
The older overview of Peterson is still worth reading.