Tuesday, February 7, 2012

A study of GHCN V3 homogeneity adjustments.

From time to time, bloggers discover that GHCN produces an adjusted temperature file, and are shocked to find that in the process, temperatures are altered. A noted example occurred in late 2009, when Willis Eschenbach became excited about GHCN V2 adjustments to the temperature at Darwin. He intoned sternly:
Those, dear friends, are the clumsy fingerprints of someone messing with the data Egyptian style ... they are indisputable evidence that the "homogenized" data has been changed to fit someone’s preconceptions about whether the earth is warming.
This created quite a stir, and drew a response from no less than the Economist, among others. It also drew a response from me - in fact, it was the stimulus to start this blog. I showed (following Giorgio Gilestro) that the V2 adjustments could be quite large, but were fairly balanced overall in trend effect. Darwin was an outlier, and I showed one case in particular which went in the opposite direction, to a greater extent.

So now an adjusted version of V3 is out, and there was a similar discovery at WUWT. This time it was Iceland, particularly Reykjavik, and I wrote about that particular case here. There were similar thunderings about rewriting of history (and even calls for legal sanctions). What these protesters are reluctant to acknowledge is that GHCN has always produced two files - one unadjusted, one adjusted. The unadjusted file is not altered, and has at least until recently been the prime reference source. The adjusted file is derived from it using what seems at the time to be the best available algorithm. This changes.

What is also not understood is that homogenization is not done to correct the record. It is done to reduce bias prior to calculation of an index. Many irregularities occur in a large temperature record - instruments change, stations move, records are patchy. Some of these artefacts will cancel, but there is a possibility of bias. So an algorithm is used to try to identify and correct them. In the process, there will inevitably be false positives. The process is valuable as long as the introduced errors have smaller bias than the errors removed.
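The core idea can be shown with a toy sketch. This is not the GHCN/Menne-Williams pairwise algorithm - the series, segment lengths and detection rule here are made up for illustration - but it shows the principle: find a step change in a record and offset the earlier segment to remove the bias it would otherwise contribute.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic monthly anomaly series with an artificial +1.5 C offset
# in the first half (e.g. a station move halfway through).
n = 240
series = rng.normal(0, 0.3, n)
series[:n // 2] += 1.5  # earlier segment biased warm

def find_break(x, min_seg=24):
    """Crude breakpoint search: the split maximizing the difference
    of segment means (a toy stand-in for SNHT-style tests)."""
    best, best_gap = None, 0.0
    for k in range(min_seg, len(x) - min_seg):
        gap = abs(x[:k].mean() - x[k:].mean())
        if gap > best_gap:
            best, best_gap = k, gap
    return best

k = find_break(series)
adjusted = series.copy()
# Align the earlier segment's mean with the later one's.
adjusted[:k] -= series[:k].mean() - series[k:].mean()
```

A real algorithm must also decide whether a detected gap is significant, which is where the false positives discussed above come in.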

In this post, I want to update (for V3) the statistical analysis of the effect on trend. I'll also produce a "cherry pickers guide" so that the stations which have their trends changed markedly up or down can be readily identified. I'll do this using the Google Maps application that I developed earlier for GHCN stations.

Distribution of trend changes

For GHCN V2, we showed a histogram of trend changes that was fairly symmetric with a slight upward bias - the mean trend increase from adjustment was 0.0175°C/decade. So I'll start with the corresponding plots for V3. This time I've selected three subgroups - stations that have histories with actual reports for at least
  1. 360 months (30 years)
  2. 540 months
  3. 720 months

[Histogram 1 - stations with at least 360 months: Mean 0.0271 °C/Decade]

[Histogram 2 - stations with at least 540 months: Mean 0.0227 °C/Decade]

[Histogram 3 - stations with at least 720 months: Mean 0.0196 °C/Decade]

The distribution is not quite as balanced as in V2, but the means are still small relative to the warming trend. The spread is larger, so that the Iceland adjustments, though extreme, came just within the top decile.
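The quantity histogrammed above is just the difference of two OLS slopes per station. A minimal sketch of that calculation, using synthetic data (this is not GHCN's file format or my exact code - the series here is invented to show the mechanics):

```python
import numpy as np

def trend_per_decade(t_years, temps):
    """OLS slope in deg C per decade, ignoring missing (NaN) months."""
    mask = ~np.isnan(temps)
    slope = np.polyfit(t_years[mask], temps[mask], 1)[0]  # deg C / year
    return slope * 10.0

# Hypothetical 50-year monthly series for one station.
t = np.arange(600) / 12.0
rng = np.random.default_rng(1)
unadj = 0.005 * t + rng.normal(0, 0.5, 600)   # ~0.05 C/decade underlying
adj = unadj.copy()
adj[:300] -= 0.3                              # a step correction to early data

# Trend change from adjustment, as in the histograms.
delta = trend_per_decade(t, adj) - trend_per_decade(t, unadj)
```

Since the OLS slope is linear in the data, the noise cancels exactly in the difference: cooling the first half by 0.3°C raises the trend by about 0.09°C/decade here, regardless of the noise realization.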

The Google maps app

I've included all the previous facilities, which are described in detail here. The Google Maps facilities are available, and you can select and color various groups of stations. The pattern is that you make choices, press the radio button to prime them, and then press one of the top color buttons - your chosen subgroup will be recolored according to the button you press. The choices combine with OR logic, so at first it is better to make sure only one radio button is pressed at a time (they toggle). I've marked with gold fringes the buttons relevant to this topic. You can choose a range of large positive or negative trends and color them. You'll probably find it clearer if you first set everything to invisible. Anyway, I'll describe some detailed cases later. The trends are for stations with at least 60 years of data; others will show -999 (for NA). Here's the app:

An extreme case of trend raised

To find some extreme cases (mainly to check my code), I did the following sequence:
  • I selected All and clicked Invisible, to start with a blank plot.
  • I unset All, and set the Trend Adjustment to 3°C/Century, and the inequality to >. Set the Trend radio button, then clicked Yellow.
  • Then I changed the trend to -3°C/Century, and the inequality to <, then clicked Pink.
  • I clicked on the stations to show the balloons with data. The US has 4 out of about 9 of these extremes, so I'll show it. The US generally has bigger adjustments, possibly due to USHCN. Corona, NM, was the whopper, at 8°C/Century (a world record, I think):

So I plotted the adjustment differences (adj-unadj):

and indeed the adjustments, reaching 5°C, create a strong positive trend. Why?

Well, here are the plots before and after adjustment:

These are annual means after subtracting unadjusted monthly means (to minimise the effect of missing values). There does indeed seem to be a sudden drop of about 3°C around 1953, which may well be due to a station change. The further drop around 1925 seems a bit more doubtful, and the big changes in the '70s have an unclear effect, but it's not surprising that the algorithm reacted. In any case, the adjusted version is indeed calmer, and has lost the rather extreme downward trend.
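The anomaly calculation behind these plots can be sketched as follows. The 12-column layout and the toy numbers are assumptions for illustration, not GHCN's actual format; the point is that subtracting each calendar month's long-term mean first keeps a missing winter month, say, from dragging the annual mean up.

```python
import numpy as np

def annual_anomaly_means(monthly):
    """monthly: array of shape (n_years, 12), NaN for missing values.

    Returns one annual mean per year, computed from monthly anomalies
    so that missing months don't bias the result seasonally."""
    clim = np.nanmean(monthly, axis=0)   # per-calendar-month climatology
    anom = monthly - clim                # monthly anomalies
    return np.nanmean(anom, axis=1)      # annual means of anomalies

# Toy check: two identical years of a seasonal cycle, with one
# January missing. A raw annual mean of year 1 would be biased warm;
# the anomaly-based mean stays at zero.
cycle = np.array([0., 0., 5., 10., 15., 20., 25., 20., 15., 10., 5., 0.])
rec = np.tile(cycle, (2, 1))
rec[0, 0] = np.nan                       # missing January in year 1
anoms = annual_anomaly_means(rec)
```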

An extreme case of trend lowered

Though not as much. The one pink station in the picture is St George, and the adjustment brought the trend down by 3.2°C/Century.

Here are the adjustments, reaching 4°C, which make the trend more negative.

And here again are the plots before and after adjustment:

Again, there was an uptrend before adjustment, which the adjustment turned into a slight downtrend. The unadjusted record is volatile, and although by eye I can't see obvious step changes, there is plenty for the algorithm to react to. Oddly, though, the homogenized result still seems to have substantial irregularities.

Population matters

These were extremes. I thought it was worth trying to see how such outliers could arise. But remember, the purpose is to remove bias prior to averaging. So the proper tests are statistical, not by studying extremes. The histogram above is a start.

The reason for the change in V3 is the availability of a new pairwise comparison algorithm from Menne and Williams. A lot of work has gone into homogenization, back to the early days of GHCN V1 (mid '90s). I'm not on top of it, and I think determined critics of the process have a lot of reading to do.


Added: I should give some more references here. There is a NOAA intro to GHCN V3. That has a download link. The GHCN data is updated several times a month as new data comes in. Each file is named according to date and other metadata.

The Intro links to their homogeneity adjustment page here.

As of Dec 15th, GISS is using the GHCN adjusted data. They seem currently to be adding their own adjustment as well (previously they applied this to unadjusted data). I presume this will be rationalized.

The Menne & Williams paper on pairwise matching is linked above. The following paper would be of interest, but is paywalled:
Lawrimore, J. H., M. J. Menne, B. E. Gleason, C. N. Williams, D. B. Wuertz, R. S. Vose, and J. Rennie (2011), An overview of the Global Historical Climatology Network monthly mean temperature data set, version 3, J. Geophys. Res., 116, D19121.
The older overview of Peterson is still worth reading.


  1. It's worth noting that the breakpoints are detected by comparison to nearby stations. Those stations that show irregularities post-homogenization are likely cases where the irregularities show up in nearby stations as well (and may actually be a signal). Similarly, inhomogeneities in stations with no nearby counterparts will generally not be detected.

  2. Nick, good to hear you have done this.
    I started writing a code to look through all the stations systematically but it's quite fiddly and I haven't had time.
    I disagree with your final point. No reading is required to see that the Iceland adjustments are wrong.

    I think Zeke needs to read the previous comments about the Iceland adjustments.
    There was a sharp cooling in 1965 in all the Iceland raw data sets.
    This was deleted in almost all cases by the erroneous GHCN adjustment algorithm.


  3. Zeke,
    Yes, and that would affect the interpretations here. But I think you do have to have a station irregularity first - then it is checked with neighbors.

    One of the objections raised about Iceland was that in about 1965 several of them were corrected in a similar way. It wasn't easy to see that as a neighbor-forced thing. I think that is what Paul is referring to.

    Paul, I don't know if R code helps but I'll try to scrub mine up and get it online.

  4. Paul,
    My point is, though, that a statistical evaluation is what is needed. There is a cost in noise to correcting apparent bias, as here. This may well be a false positive. The objective is to ensure that the noise introduced is less harmful (biased) than the noise removed.

  5. Nick,

    I meant that neighbors are used to detect irregularities in the first place (and later to correct for them). Metadata is also used if available, though it simply reduces the threshold for breakpoint detection rather than forcing a breakpoint, given that metadata can be wrong.


    I was simply pointing out how the Menne and Williams algorithm detects breakpoints. I'll read over your prior Iceland comments, but if multiple stations in the region show the same behavior at the same time it is unlikely that they will be corrected.

  6. Zeke, this is exactly what the fuss is about! If multiple sites in the same area show the same behaviour, then clearly they shouldn't be "corrected". But this is exactly what happens in the GHCN adjustments for Iceland. As well as the 1965 issue, there is the consistent warm period around 1940, where temperature was similar to today, apparent in all the raw data and in the literature. But the GHCN adjustments put in a cooling here. See the sequence of posts at Paul Homewood's blog.
    There is something wrong - either the adjustment algorithm is 'over-zealous' or there is a coding error.

    I have the Lawrimore et al paper if anyone wants a copy.


  7. A noted example occurred in late 2009, when Willis Eschenbach became excited about GHCN V2 adjustments to the temperature at Darwin.

    Just wait until he discovers the upward trend adjustment at WILLIS ISLAND (off the northeast coast of Australia).


  8. Nick,
    GHCN have changed their version!
    A comment in the Changelog file says



    GHCNM v3.1.1 is released with changes in scripts due to asynchronous execution
    problems. Also added ability to add new stations automatically when their
    period of record increased enough to process and improved testing when not
    enough neighbors to estimate adjustments.


    I'm not sure what 'asynchronous execution problems' means.
    Anyway the adjustments are now significantly different, but no better.
    The erroneous 1965 adjustment now appears in all 8 Iceland stations (before it was only 7/8)
    and the fabricated warming in Reykjavik is worse.
    But some stations eg Stykkisholmur are better.

    The warming adjustment at Corona NM is now "only" about 4 degrees rather than 5.
    Can you download the new dataset and run your code again to see where the biggest adjustments are now?


  9. Thanks for the note, Paul. Yes, I'll re-run.

  10. . . . Mean 0.0196 °C/Decade . . .
    That's a tad high, if you've only got +0.7°C over the last century or so. I think they're over-adjusting there.

  11. Anon,
    Yes, the adjustment does have some upward effect, and the updated algorithm (next post) raises it a bit more. But it's nothing like the impression you get from these extreme examples.

    The thing is, it may be right.