Friday, September 17, 2010

Beta version of GHCN v3 is out.

h/t CCE (at Whiteboard).
I've only just read CCE's comment, and I've downloaded v3 from here. It's late where I am, so this is very much a first impression.

The README file is helpful. The inventory file has a new layout, so TempLS will need some changes to read it. It seems to have basically the same data, though, for about the same 7280 stations as v2.

The data file also seems to have much the same actual data. But interspersed with the values are a number of codes. There is a measurement code and a quality code, though neither is used yet except for USHCN. Then there is a source code (saying where the data comes from).
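To make the new layout concrete, here is a rough Python sketch of reading one line of the v3 data file. It is not TempLS code. The column positions are my assumption about the fixed-width layout (11-character station ID, 4-digit year, 4-character element name, then twelve groups of a 5-character value followed by the three one-character codes), as are the -9999 missing marker and the hundredths-of-a-degree units; check them against the README before relying on this. The names parse_v3_line and MonthValue are made up for the example.

```python
from typing import NamedTuple, Optional

MISSING = -9999  # assumed missing-value marker; check the README


class MonthValue(NamedTuple):
    temp_c: Optional[float]  # None when the value is missing
    dmflag: str              # measurement code
    qcflag: str              # quality code
    dsflag: str              # source code


def parse_v3_line(line):
    """Split one fixed-width v3 data line (assumed layout) into fields."""
    station = line[0:11]
    year = int(line[11:15])
    element = line[15:19]
    months = []
    for m in range(12):
        start = 19 + 8 * m               # each month takes 5 + 3 characters
        raw = int(line[start:start + 5])
        # Pad in case trailing blanks were stripped from the line.
        flags = line[start + 5:start + 8].ljust(3)
        # Assumed units: hundredths of a degree C.
        temp = None if raw == MISSING else raw / 100.0
        months.append(MonthValue(temp, flags[0], flags[1], flags[2]))
    return station, year, element, months
```

Keeping the three codes alongside each value, rather than discarding them, makes it easy to filter on the quality code later once it is in wider use.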

So there's work to do to get TempLS to read it. It's not clear that this beta version has data changes that will affect the results. But we'll see.
Update below the jump:

Update: I found this useful set of slides from a talk by Dr Karl in May 2010. It has several plots based on V3. I had been planning to do a v2/v3 comparison, but there's one there:
Slide 21 - units C, blue v2, red v3

There's also a more thorough examination of v3 from Zeke, referenced in his comment. He has also done the v2/v3 comparison, as well as a station count (V3 has more readings, but no new stations), and a look at adjustments.

Mosh says there that the main difference in the dataset is that they have eliminated many (all?) duplicates. Indeed, there are 443933 lines in the file, vs 597182 in v2.mean, which is about 26% fewer.
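One way to check how much of that line-count gap is down to duplicates would be to count the distinct station/year pairs in v2.mean while ignoring the duplicate number. The Python sketch below assumes each v2.mean line starts with an 11-character station ID, then a 1-character duplicate number, then a 4-digit year; that layout is my assumption here, and duplicate_summary is a made-up name. I haven't run it on the real files, so no result is claimed.

```python
def duplicate_summary(lines, id_width=11, dup_width=1, year_width=4):
    """Return (line count, distinct station/year count) for v2.mean-style lines.

    The field widths are assumptions about the layout, passed as
    parameters so they are easy to change.
    """
    keys = set()
    total = 0
    year_start = id_width + dup_width
    for line in lines:
        if not line.strip():
            continue                      # skip blank lines
        total += 1
        station = line[:id_width]         # ID without the duplicate number
        year = line[year_start:year_start + year_width]
        keys.add((station, year))
    return total, len(keys)
```

With the real file this could be called as duplicate_summary(open("v2.mean")). If the second number comes out close to the v3 line count, that would support the idea that merging duplicates accounts for most of the drop.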


