Saturday, May 2, 2015

An early GHCN V1 found - hardly different from latest V3 unadjusted

Clive Best has come up with an early version of GHCN V1, which he says is from around 1990; GHCN V1 was released in 1992, and I think this must be from a late stage of preparation. Anyway, he has kindly made it available. I have been contending that GHCN unadjusted is essentially unchanged, except by addition, since its inception, despite constant loud claims that NOAA is constantly altering history.

I thought I would compare with the latest GHCN V3 unadjusted, which has data up to March 2015. This turns out to be not so simple, as the numbering schemes for stations have changed slightly, as have the recorded station names. Stations have a WMO number, but several can share the same number in both sets. They have a modification number, but these don't match.

So for this post, I settled on just one country, Iceland. GHCN is supposed to have done terrible things to this record. A few weeks ago, we had Chris Booker in the Telegraph:
"When future generations look back on the global-warming scare of the past 30 years, nothing will shock them more than the extent to which the official temperature records - on which the entire panic ultimately rested - were systematically "adjusted" to show the Earth as having warmed much more than the actual data justified.
Again, in nearly every case, the same one-way adjustments have been made, to show warming up to 1 degree C or more higher than was indicated by the data that was actually recorded. This has surprised no one more than Traust Jonsson, who was long in charge of climate research for the Iceland met office (and with whom Homewood has been in touch). Jonsson was amazed to see how the new version completely "disappears" Iceland's "sea ice years" around 1970, when a period of extreme cooling almost devastated his country's economy."

This was reported by NewsMax under the headline:
"Temperature Data Being Faked to Show Global Warming"

The source, of course, was Paul Homewood. Euan Mearns chimed in with:
"In this post I examine the records of eight climate stations on Iceland and find the following:
There is wholesale over writing and adjustment of raw temperature records, especially pre-1970 with an overwhelming tendency to cool the past that makes the present appear to be anomalously warm."

So what did I find? Nothing that could be considered 'systematically "adjusted"'. There were differences, mainly with a clear reason. I'll list the eight stations below the jump.

Clive's V1 is accessible here. V3 in CSV format is here (10Mb zipfile). I have put the extracts relevant to Iceland on a small zipfile here.

A point of comparison is with the IMO records here. These, however, are not entirely unadjusted. They can be useful for resolving differences.
    These two of the eight are very simple. Exactly the same in V1 and V3.
    This is a long record, 1620 months to 1990. Of these nine have been changed. Eight are sign changes, one is a minor change. In all cases, V3 and the IMO record agree. Clearly V3 corrects errors in the V1 draft.
    This is also a long record. There are a lot of apparent changes. However, the real situation is revealed by a table of their magnitudes, in the top row, vs number of occurrences in the bottom:


    Almost all changes (211) are by ±0.1. That is, a rounding difference, presumably from monthly averaging.
    Again, a fair number of rounding differences, but at most four larger (>0.1) monthly changes in a long record.
    This actually had quite a lot of small changes - 159 in a 461 month record. But I found what is happening by looking in V2, which records duplicates. These are different records for the same location, but may be from different equipment or measuring points. The V3 version and V1 version are both recorded as duplicates in V2. In V3, they chose whichever seemed best.
    At this stage, I have run out of investigative zeal. Here's the frequency table:

    0     0.1   0.2   0.3   0.4   >0.4
    1050  37    28    20    11    4

    There are 100 discrepancies in 1150 months, small, but several bigger than rounding. No obvious reason.
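The magnitude tables above are easy to reproduce. What follows is a minimal sketch, not the script used for this post: it assumes each station record has already been parsed into a dictionary keyed by (year, month) with temperatures in degrees C, and both that layout and the function name `tabulate_differences` are hypothetical.

```python
from collections import Counter


def tabulate_differences(v1, v3):
    """Count |v3 - v1| by magnitude for months present in both records.

    v1, v3: dicts mapping (year, month) -> temperature in deg C
    (an assumed layout, not the GHCN file format itself).
    Returns (counts, sign_flips). counts maps a label such as "0", "0.1"
    or ">0.4" to a number of months; sign_flips counts months where the
    V3 value is exactly the negative of a non-zero V1 value.
    """
    counts = Counter()
    sign_flips = 0
    for key in sorted(v1.keys() & v3.keys()):
        # Work in integer tenths of a degree to avoid floating-point noise.
        t1 = round(v1[key] * 10)
        t3 = round(v3[key] * 10)
        diff = abs(t3 - t1)
        if diff == 0:
            label = "0"
        elif diff > 4:
            label = ">0.4"
        else:
            label = f"0.{diff}"
        counts[label] += 1
        if t1 != 0 and t3 == -t1:
            sign_flips += 1
    return counts, sign_flips


if __name__ == "__main__":
    # Made-up values purely to show the call; not Iceland data.
    old = {(1950, 1): -1.2, (1950, 2): 0.5, (1950, 3): 3.4}
    new = {(1950, 1): -1.2, (1950, 2): 0.6, (1950, 3): -3.4}
    print(tabulate_differences(old, new))
```

Counting sign flips separately is useful because a flipped sign shows up as a large difference in the table even though it is a plain transcription error.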

It is clear that there is no "wholesale over writing and adjustment" in the GHCN V3 unadjusted file for Iceland.


  1. So the big trick, as Steve Goddard has inadvertently taught us, is to look only at stations that are present in both data sets.

    Any chance you can compute V3 trends using just the V1 data set then compare to trends from the V1 data set?

    1. Carrick,
      I could do all V3 vs all V1. But then any trend difference is likely to be due to the different stations included. I could try to do the set that is common to both, but that comes back to the variant numbering issue. It's hard to align the whole set. Even with Iceland there was the issue of the two datasets for Keflavik. The inclusion of USHCN in V3 is another related complication.

    2. Thanks Nick.

      Sounds like a real research project then. Doable, but challenging.

      Depending on what data is available, probably a correlational method (on the raw data) could be used to match up stations.
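A correlational match of the kind suggested above might be sketched as follows. This is only an illustration under stated assumptions: each station is taken to be a dictionary keyed by (year, month), the names `pearson` and `best_match` are hypothetical, and the 24-month minimum overlap is an arbitrary choice. Because the seasonal cycle dominates raw monthly temperatures, nearby stations will all correlate highly, so in practice one might prefer to correlate anomalies or to compare mean absolute differences instead.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (nan if undefined)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def best_match(series, candidates, min_overlap=24):
    """Return (station_id, r) for the candidate best correlated with series.

    series: dict (year, month) -> temperature for one station.
    candidates: dict station_id -> dict of the same shape.
    Candidates sharing fewer than min_overlap months are skipped.
    """
    best_id, best_r = None, -2.0
    for station_id, cand in candidates.items():
        common = sorted(series.keys() & cand.keys())
        if len(common) < min_overlap:
            continue
        r = pearson([series[k] for k in common], [cand[k] for k in common])
        # A nan correlation never compares greater, so it is ignored.
        if r > best_r:
            best_id, best_r = station_id, r
    return best_id, best_r
```

A station that returns no match (`None`) would then be treated as present in only one of the two data sets.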

  2. Nick,

    It's interesting that you are blogging more or less simultaneously on the "scandal" of GHCN data adjustment and the "first rate science" of Christy and Spencer's satellite data adjustment. While the ability of humans to hold two diametrically opposed beliefs in their minds at the same time with no awareness of the inconsistency is well known, I do wonder how they can continue to do so when the "opposition" is so obvious. In particular I notice that Homewood has just commented at WUWT claiming that the latest UAH adjustment, by bringing UAH more into line with RSS, actually strengthens the view that the satellite-based temperature series are to be preferred to the surface-data-based series.


  3. JCH: given that the satellite datasets are much more sensitive to ENSO (presumably because the lower troposphere is more sensitive than the surface), and given that the 2009-2010 winter had the highest MEI index since 1998, I would argue that the discrepancy between RSS and HadCRUT over that time period doesn't indicate that the records are of different quality. They might be: it is entirely possible that the satellites are underestimating or the surface record is overestimating warming - I'd bet on the satellites being more wrong - but the 2010-2015 record isn't sufficient to provide any evidence for that.

    Also - did you mean to plot sea surface temps for HadCRUT, and land temps for RSS? I would think you'd want apples to apples, and look at global for both.


    1. Apples to apples probably isn't possible here:

      Satellite based measurements average over an extended profile above the surface (to the degree they are accurate).

      Surface measurements combine 2-m above surface land measurements with sea surface temperature.

      Beyond that the interval JCH has chosen is too short to be meaningful.

      So the short answer is: 100% chance of happening (it did happen), much lower percent chance there is anything diagnostic in this particular comparison.

  4. Been there, done that:


    1. Thanks, Zeke. Looks like I wasn't paying attention at the time. You've covered it very well; Fig 4 seems to do the alternative recon that Carrick was requesting.

    2. It's not ideal, as the lat/lon matching is somewhat imperfect. Also GHCN v3 adjusted has changed significantly between 3.0 and 3.2.2, with the fix in the homogenization after Rothenberg's work on the algorithm, though it won't impact the raw v1 vs. v3 comparison.

    3. Actually I'd like to look at how the adjustments have changed.

    4. The GHCN tech reports would be a good place to start:

      Not sure if there are old versions prior to GHCN v3.2 publicly available anywhere though...

    5. Carrick,
      I'm sure there were no adjustments being made in 1992. Homogenisation requires masses of organised, digitised data. That's what these people were creating. An adjusted file appeared some time during the currency of V2.

    6. I should add that Jones was homogenising earlier with his own resources. And the Met offices did much of the initial digitising. But GHCN V1 was an archiving project.

    7. Thanks Nick, I forget how far we've come.

  5. There are some errors in Clive Best's csv data file (posted this there already). I'm correcting them now, and will post shortly when I have completed the corrections, and make the corrected file available. The errors seem to be confined (so far) to stations with data earlier than 1800, where the station id and year have not been separated, giving a "new" station id and data for January to November only. The earliest data appears to come from 1701.

    I have another v1 data set dating from 1994, but also with data just to 1990. When I've completed corrections I'll compare the two. I suspect that they may be the same data, but with slightly more metadata.

    Carrick and Zeke: I have some old versions prior to GHCN v3.2 archived, from v3.0.0, v3.1.0, v3.2.0, v3.2.1 and v3.2.2, qcu and qca, and a nearly complete daily archive from April 10 2014 on, taken at midnight GMT (incomplete on a few days when I had no internet access). Contact me if you want some old versions.

    Have not looked at this for v3.2.2, but the adjustment for at least one earlier version showed quite substantial jumps in monthly values from day to day for at least one Irish station I was looking at. The station I was looking at stops some years ago in GHCN, and I looked at the adjusting stations used by Gistemp to see whether any of those showed similar jumps from day to day, finding if I remember only smaller changes (I know GHCN adjusts differently from Gistemp - I chose to look at the Gistemp adjusters for convenience)


    1. I have put a copy of v2 adjusted from Dec 2009 here.

  6. This is identical to the version I downloaded from, dated 28 July 1992 (1994 above was my memory at fault)

    The corrected data, both as csv and txt (extension .doc, added to enable WordPress upload, may be deleted - these are not Microsoft Word documents):

    I'll post additional metadata later today.

    1. I think this is the version that Zeke and Mosher analysed.

  7. The additional metadata (again extension .doc added which may be deleted):

    Station names in this file differ from those in Clive Best's csv file in that some contain commas, and so are unsuitable for reading as a simple csv file. (Note that one station in Clive Best's csv file, CENTRO MET.ANTARTICO"VICE, contains a double quote mark which may cause a problem when the csv file is read). The latitude and longitude coordinates for each station are identical in the two versions, as are the start-years and end-years.

    Four additional values are added for each station. The elevation follows the longitude. Two additional values from the original inventory follow the end-year. These are as described in the readme file:

    MISSING is the percent of the record with missing data.

    DISC is a code which can be used to identify a time series which
    contains a "gross" discontinuity (i.e., one which was readily
    identified when the time series was plotted and analyzed
    visually). If DISC is 1, then the station has a major
    discontinuity. If DISC is 0, then the station has no major
    discontinuities. However, it could still contain more subtle

    Finally, I have added a nightlight luminance for each station, for anyone who may wish to try adjusting the data following Gistemp procedures. These luminance values are taken from the F16_2006 version rather than the deprecated earlier version still used by GISS. (If there is a demand for this, I can generate luminance values using the deprecated version and add these to the file). Generally, with a relatively small number of exceptions, urban/rural classification is the same for both F16_2006 and deprecated versions. Experience with GHCN v3 indicates that it is correction of location coordinates which leads to more frequent classification changes. With GHCN v3 approximately 20% of stations outside the US, Canada and Mexico which are also WMO stations show changed urban/rural classification when the WMO coordinates are substituted for those in the GHCN inventory file. The coordinates used to determine luminance correspond to the latitude and longitude coordinates given in the inventory file, and as these coordinates have not been corrected the luminance values may in some cases correspond to a location sufficiently distant from the station to give a misleading urban/rural classification. 2034058101 KUWAIT INTL AIRP is a good example of erroneous coordinates, located at sea rather than at the airport. I have not corrected any coordinates in the v1 inventory file, and do not at present plan to do so. (I am gathering corrections for the v3 inventory coordinates).

  8. WUWT has a post up by David Archibald, which is based on a guest post by some guy named Mike Brakey at Pierre Gosselin's site. It's astoundingly bad.

    Brakey claims that NOAA is tampering with the historical temperature data for Maine, USA. He shows what he claims to be 2013 and 2015 versions of the annual mean temperature data. But something is clearly wrong, since both versions include a data point for 2014. That would have been a neat trick by NOAA in 2013. His 2015 data don't seem to match what I get when I look up average annual temperature for Maine at NOAA here.

    But worst of all, his table of old vs new versions of temperature data lists the change in degrees F, and also AS A PERCENTAGE. For example, for 1895 he claims that the old vs new versions of NOAA's data differ by 2.2 F ... and shows this in bold red text as "5.1%" (because 2.2 is 5.1% of the annual mean of 43.3 F).
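The trouble with such a percentage is that it depends entirely on the temperature scale, since Fahrenheit and Celsius both have arbitrary zero points. A quick check makes this plain; it is a minimal sketch using only the 1895 figures quoted above (a 2.2 F change on a 43.3 F mean), and the variable names are illustrative.

```python
def f_to_c(temp_f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


mean_f, change_f = 43.3, 2.2  # figures quoted for 1895

# A temperature *difference* converts without the 32-degree offset.
change_c = change_f * 5.0 / 9.0
mean_c = f_to_c(mean_f)
mean_k = mean_c + 273.15

pct_f = 100 * change_f / mean_f   # relative to the Fahrenheit mean
pct_c = 100 * change_c / mean_c   # same change, relative to the Celsius mean
pct_k = 100 * change_c / mean_k   # same change, relative to the Kelvin mean

print(f"Fahrenheit: {pct_f:.1f}%  Celsius: {pct_c:.1f}%  Kelvin: {pct_k:.2f}%")
```

The same physical change comes out at about 5.1% in Fahrenheit, about 19.5% in Celsius and about 0.44% in Kelvin.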

    Gosselin describes Mr Brakey as an "engineering physicist and energy expert". The mind reels.

    Brakey seems to be based in the city of Lewiston, because he includes a column with alleged temperature data for Lewiston alongside the statewide data. They don't seem to match actual data from Lewiston, and somehow he fails to mention that the GHCN adjustments to Lewiston greatly *reduce* the trend (I learned this from your GHCN viewer...)

    All in all, it's quite a piece of work.

    1. Unfortunately Anthony's slide from any glimmer of respectability continues. That particular post was excruciatingly bad.

    2. There are posts where, while I may disagree with Anthony or his guest writers, I can at least understand why they were posted. But then there are these where I just can't fathom it. Archibald and Monckton are two obvious and frequently repeated examples. Is there no one in the management at WUWT who can see how silly Archibald's posts are?

      On a different note ... Nick, have you seen this?

      Seems nifty, though I've only just begun to try it out.

    3. Ned,
      I had a bit to say myself on that WUWT thread.

      I hadn't seen the UQ tool. Its functionality is somewhat similar to my Google Maps tools. But Google Maps is a better base, especially with popup info etc. I also did similar things with KMZ files in Google Earth. Searching Moyhu for kmz (or google earth/maps) shows up most of it.