Friday, October 23, 2015

Looking at GHCN V4 beta

The long-awaited GHCN V4 is out in beta, here (dir). The readme file is here. It is a greatly enlarged dataset, which transfers a lot of data from the large daily archive to monthly. There are 26129 stations in the inventory, instead of 7280. In September, 11741 reported, whereas usually only about 1800 report in GHCN V3. But it is not so clear that the extra numbers add a great deal. In V3, the stations were reasonably evenly distributed, except for a lot in the US, and some bare patches. In V4, there seem to be more regions with unnecessarily dense coverage, and the bare patches, at least in stations currently reporting, are not much improved.

I'll be interested in practicalities such as how promptly the data will appear in each month. GHCN V3 was prone to aberrations in reporting; that may get worse. We'll see. Anyway, I've run it through TempLS grid. The initial run of the mesh version will take several hours, so I'm holding off until I get some decisions right. It's possible I should opt for a subset of stations, and I'll probably want to modify the policy on SST stations. Currently I reduce from a 2°x2° grid to a 4°x4°, mainly because otherwise SST would be over-represented relative to land. This is partly to reduce effort, but also to reduce the tendency for SST values to drift into undercovered land regions. Now the original SST grid is comparable to land, so the case for saving effort is weaker. The encroachment issue may remain.
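As a rough illustration of the 2°x2° to 4°x4° reduction, here is a minimal Python sketch. It assumes the reduction is done by simply keeping every second cell in each direction; the function name `coarsen_sst` and the dummy grid are hypothetical, and this is not the actual TempLS code.

```python
import numpy as np

def coarsen_sst(lat, lon, sst, step=2):
    """Keep every `step`-th cell of a regular lat/lon grid in each direction.

    This is plain subsampling (an assumption for illustration), not averaging.
    """
    return lat[::step], lon[::step], sst[::step, ::step]

# Hypothetical 2-degree grid of cell centres with dummy SST values
lat = np.arange(-89.0, 90.0, 2.0)    # 90 rows
lon = np.arange(-179.0, 180.0, 2.0)  # 180 columns
sst = np.zeros((lat.size, lon.size))

lat4, lon4, sst4 = coarsen_sst(lat, lon, sst)
print(sst.shape, "->", sst4.shape)   # cell count drops to a quarter
```

Subsampling cuts the number of SST "stations" by a factor of four. Averaging each 2x2 block would be an alternative with the same reduction in count.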

The other issue is whether the land data should be pruned in some way. I think it probably should.

Anyway, I'll show below the WebGL plot, with land stations marked, for September 2015, done by TempLS grid. I think it is the best way to see the distribution. You can zoom with the right button (N-S motion), and Shift-Click to show details of the nearest station. The gadget is similar to the maintained GHCN V3 monthly page, which you can use for comparison. You can move the earth by dragging, or by clicking on the top right map. Incidentally, the gadget download is now about 4Mb, and may take a few seconds. I'll work on this.

The plot shows the shaded anomalies for September. Incidentally, the average was 0.788°C, vs 0.761°C for V3. Generally, the differences in TempLS grid are very small. August was almost identical. You can see that some regions have very dense cover, for example, Germany, Japan, Australia and the US. Regions that were sparsely covered in V3, such as the swathe through Namibia, Zaire etc, are still sparse. I don't see much improvement in the Antarctic, and the Arctic has some extra stations, but still not good coverage.

Here is a list of the top 20 countries reporting in September. You can see the excessive numbers in the US, Australia and Canada, and proportionately, in Germany and Japan. Almost half are in the US. I think I will have to thin these out. The inventory, however, is now just bare data of lat, lon, alt and name. So it isn't easy to pick out rurals, for example.

5212 United States
711 Canada
577 Australia
495 Russia
213 China
152 Japan
152 Germany
82 Spain
79 India
76 Brazil
73 Indonesia
69 Argentina
62 Mexico
61 Kazakhstan
50 Turkey
49 Ukraine
49 Algeria
47 France
44 Thailand
41 Mongolia
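One possible way to thin the dense regions, sketched in Python under stated assumptions: bin stations into equal-angle lat/lon cells and keep at most a fixed number per cell. The function `thin_stations`, the cell size and the sample stations are all hypothetical; this is not a decision about how TempLS will actually prune.

```python
import math
from collections import defaultdict

def thin_stations(stations, cell_deg=2.0, max_per_cell=1):
    """Keep at most `max_per_cell` stations in each cell_deg x cell_deg cell.

    `stations` is an iterable of (name, lat, lon) tuples. Stations are kept
    in input order, so sorting the input first (for example by record length)
    controls which ones survive.
    """
    counts = defaultdict(int)
    kept = []
    for name, lat, lon in stations:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        if counts[cell] < max_per_cell:
            counts[cell] += 1
            kept.append((name, lat, lon))
    return kept

# Made-up stations: A and B fall in the same 2-degree cell
stations = [
    ("A", 40.1, -75.2),
    ("B", 40.9, -75.9),
    ("C", 51.5, 10.2),
    ("D", -23.5, 133.9),
]
thinned = thin_stations(stations)
print([name for name, _, _ in thinned])
```

Equal-angle cells shrink toward the poles, so this thins high latitudes less aggressively than an equal-area scheme would. It needs only lat and lon, which is about all the V4 inventory currently offers.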


  1. I was aware of the large gaps in north and central Africa, but it's striking to see the missing coverage in the Brazilian Amazon and in much of the Canadian north. While I can understand the situation for Angola, Zambia, the DRC, the Central African Republic, Libya, Somalia and Yemen, both Canada and Brazil are developed wealthy countries (with agriculture, forestry and natural resources constituting a significant part of their GDPs) easily able to afford a dozen or two automated meteorological stations to fill in the gaps. Apparently this is not a priority.

    1. Well, Canada had 711 stations overall reporting in September. It's big. And remember that GHCN V4 probably requires a reasonably long record for inclusion.

  2. GHCNv4 is, effectively, just GHCN-Daily with monthly means. That said, it should help with things like pairwise homogeneity that benefit from more stations in the network, but I suspect the overall trends and spatial coverage won't change that much (with the potential exception of Antarctica, where GHCN-M was notably weak).

    1. Zeke, ISTI, shouldn't that be more than just GHCN daily? An international joint effort to bring forward the best data available, or so..
      I am slightly disappointed with GHCN v4 in my country, however. There is still no coverage in Northern Sweden for instance, and I know that there are 8 stations in the area with almost unbroken records going back to 1860. GHCN v4 does not have the best we have, but I guess that it doesn't matter from a global perspective..

  3. The map doesn't work on my old laptop but I get a still view centered over Africa on my ipad.
    Anyway, GHCN v4 is from 19 Oct and includes several countries in Africa, South America and Europe, that are not yet in v3, which hasn't updated since Oct 11.
    Thus, GISS, NOAA and TempLS grid may increase by 0.02 when V3 updates to the country coverage of v4.

    All new stations in sparsely covered regions are welcome, especially in the rapidly warming Arctic, but the redundancy in the USA, for instance, doesn't make it easier to calculate global averages..
    More stations in the Arctic may protect against "unfair" GHCN down adjustments, by supporting each other, proving to the pairwise comparison algorithm that the warming is real.
    For instance, Svalbard Airport is very lonely in v3, and gets cooled by GHCN adjustments. I know that there are at least three other stations in Svalbard with records long enough to have 30 year normals, and they show large warming as well.
    Nick, can you see if there are any new v4 stations in Svalbard, and if they have saved Svalbard Airport from cooling adjustments?

    1. Olof,
      Three stations are in Svalbard in September: NY_ALESUND, BARENCBURG and SVALBARD AIRPORT. I haven't looked at the adjusted version yet.

      Sorry about the viewing problems. I did mess it up about two hours ago, now fixed.

  4. I've just done a quick calculation using the SkS temperature calculator in GISTEMP mode with GHCNv4 and HadSST3. The qcu and qcf files give me almost identical trends, on either 60-90N or 75-90N, for the period post 1998.

    So I think that v4 no longer includes the large downward adjustments in the Arctic. I need to calculate the station trend difference maps to be sure though.

    I agree that for most of the planet the main benefit of the extra stations will be for homogenization rather than temperature calculation.

    1. Kevin,
      Interesting. I'll check out qcf tomorrow.

    2. OK, I'm guessing an increase in the 1998-2014 trend of 0.015C/decade versus an out-of-date GHCNv3.2 from Feb of this year. Of that, only 0.005C/decade comes from the Arctic. That's a bit less than I expected.

    3. Arctic adjustments since 1998 look much more sensible:
      Link to map

    4. Yes, that looks sensible, with no major adjustments north of 70°N. That's very much in contrast to the GHCN v3 adjustments as shown by figure U6 in your paper

      Hence, GHCN V4 seems to have solved the Arctic cooling bias..
      What about the other areas cooled by GHCN v3, eastern Sahara and Amundsen-Scott? (GISS has elegantly avoided the latter by picking unadjusted data directly from SCAR)

    5. Nice work, Kevin. It looks like the homogenisation is working properly. Mostly the neutral green colour plus some warm and cold dots here and there, but seemingly in a balanced way that shouldn't affect global means. A possible exception is central Mexico with lots of cold blue..

      I noticed that BEST has woken from its slumber and produced data through September. It gives a similar picture to the other globally infilled datasets: it was slightly warmer back in February-March than in recent months.
      However, all gridded datasets are breaking records in September so far, but I don't think HadCRUT4 will do so, since Jan 2007 is a tough one to beat...
      On the other hand, October will likely wipe out all old monthly anomaly records. Reanalysis data is running really hot, and will likely finish about 0.2 C higher than September..

  5. Thanks, Nick Stokes, for your good work.
    Your data visualizations are excellent already, but please keep improving them.
    I have placed links to the Moyhu website in my climate and meteorology pages.

    1. Thanks Andres,
      I've written here about how WebGL can be used graphically. I hope it will catch on.