Thursday, January 28, 2010

GHCN Stations warming?

Update Fri Jan 29 2.56 pm (East Aust Time) Big Oops. Programming error.

It is surprising that such a simple program could have a programming error. But there was a memory overflow, and it had a big effect on the results.

With that corrected, there is indeed a rise in station mean temps. The new graph is here. I should emphasise that the plot is of "naive" means - just averaging all readings for each year, including duplicates.

The new plot is like that on p 14 of the d'Aleo/Watts report, and different from that on p 11.

The revised R code I used to calculate and plot this example is here.

Earlier post:

The new report of d'Aleo and Watts, trumpeting calculations of E.M.Smith, makes much of a supposed shift of GHCN stations to warmer areas as an alleged source of warming. Indeed, it is full of accusations that this is done with fraudulent intent.

Of course, anomaly calculations wouldn't show warming for that reason. But is the station set actually warming?
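To make the anomaly point concrete, here is a toy sketch in Python (not the R I used, and not real GHCN data). Two made-up stations, "cold" and "warm", have perfectly flat temperatures, and the cold one stops reporting after 1959. The station names, values and function names are all invented for illustration.

```python
def mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)

def absolute_mean_by_year(series):
    """Average whatever stations report in each year, using raw values."""
    years = sorted({y for s in series.values() for y in s})
    return {y: mean(s[y] for s in series.values() if y in s) for y in years}

def anomaly_mean_by_year(series, base_years):
    """Subtract each station's own base-period mean, then average."""
    anomalies = {}
    for name, s in series.items():
        base = mean(s[y] for y in base_years if y in s)
        anomalies[name] = {y: t - base for y, t in s.items()}
    return absolute_mean_by_year(anomalies)

# Hypothetical stations with no trend at all.
cold = {y: 0.0 for y in range(1950, 1960)}   # stops reporting after 1959
warm = {y: 20.0 for y in range(1950, 1970)}
series = {"cold": cold, "warm": warm}

absolute = absolute_mean_by_year(series)
anomaly = anomaly_mean_by_year(series, range(1950, 1960))

print(absolute[1950], absolute[1965])  # 10.0 20.0: a spurious 10 degree jump
print(anomaly[1950], anomaly[1965])    # 0.0 0.0: no trend, as it should be
```

Losing the cold station shifts the average of absolute temperatures, but each station's anomaly is measured against its own baseline, so the anomaly average does not move.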

I did a simple calculation. Just the average temperature of all stations in the GHCN set v2.mean, for any year. You might expect a small rise reflecting global warming. But if there is nett movement of stations to warmer climes, that should show as a bigger effect.

Here's the result, plotted from 1950. The trend is actually down.

The R code I used to calculate and plot this example is here.
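For anyone who wants the gist without R, here is a minimal Python sketch of the same naive calculation. It is not a translation of my script. It assumes the usual v2.mean fixed-width layout (a 12-character station/duplicate ID, a 4-digit year, then twelve 5-character monthly values in tenths of a degree C, with -9999 for missing); check that against the GHCN readme before relying on it. The function names are my own.

```python
from collections import defaultdict

MISSING = -9999  # assumed missing-value flag in v2.mean

def parse_line(line):
    """Split one fixed-width record into (station_id, year, monthly temps in deg C)."""
    station_id = line[:12]
    year = int(line[12:16])
    raw = [int(line[16 + 5 * m: 21 + 5 * m]) for m in range(12)]
    temps = [None if v == MISSING else v / 10.0 for v in raw]
    return station_id, year, temps

def naive_yearly_means(lines):
    """Average every non-missing monthly reading in each year, duplicates included."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for line in lines:
        _, year, temps = parse_line(line)
        for t in temps:
            if t is not None:
                sums[year] += t
                counts[year] += 1
    return {y: sums[y] / counts[y] for y in sorted(counts)}
```

With the data file in the working directory, something like `with open("v2.mean") as f: means = naive_yearly_means(f)` would give a year-to-mean dictionary ready for plotting.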

Update 29 Jan. I investigated the downspike about 2006. It's caused by some stations returning a lot of missing months, which yields erratic results. So I put a screen in the program requiring at least 9 months of data before a station could contribute to the year's average. It didn't change the overall picture, but did eliminate the spike at 2006.
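The screen can be sketched like this in Python. Again this is an illustration under assumptions, not my R code: each record is taken to be a (year, twelve monthly values) pair in tenths of a degree with -9999 for missing, and the sketch averages per-record annual means, which is one plausible reading of "contribute to the year's average". The function name and the `min_months` parameter are invented.

```python
MISSING = -9999   # assumed missing-value flag
MIN_MONTHS = 9    # records with fewer valid months are skipped

def screened_yearly_means(records, min_months=MIN_MONTHS):
    """records: iterable of (year, [12 monthly values in tenths of deg C]).

    Returns {year: mean of the annual means of records that pass the screen}.
    """
    totals = {}
    for year, months in records:
        valid = [v / 10.0 for v in months if v != MISSING]
        if len(valid) < min_months:
            continue  # too many missing months; would give an erratic mean
        s, n = totals.get(year, (0.0, 0))
        totals[year] = (s + sum(valid) / len(valid), n + 1)
    return {y: s / n for y, (s, n) in sorted(totals.items())}
```

A record with, say, only three winter months would otherwise enter the year's average as a very cold "annual" value, which is the kind of thing that produced the 2006 spike.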

Following a suggestion of Carrot Eater, I checked the v2.mean adjusted file. The results, not very different, are here.


  1. Even when they consider the totally wrong question, they still get the wrong answer.

    This analysis is so meaningless, it's depressing you felt it necessary.

  2. Nick,

    You might find these interesting as well:

    Source code is at

  3. Zeke, with that number of stations and without spatial averaging, I'm a little surprised that second plot came out looking so good. I'm feeling too lazy to look myself; I'm wondering what the distribution is between NH and SH for the high latitude discontinuous subset. Polar amplification is more obvious in the NH, after all.

  4. Carrot Eater: here you go. The first number is # northern stations, the second is # southern, all in the discontinuous set (e.g. cover 1960-70 but stop reporting between 1970 and 2000) and >= 60 degrees latitude.

    Year NH SH
    1960 16 12
    1961 16 11
    1962 16 11
    1963 16 12
    1964 16 12
    1965 16 12
    1966 16 12
    1967 16 12
    1968 16 12
    1969 16 12
    1970 16 12
    1971 13 12
    1972 13 12
    1973 12 11
    1974 11 11
    1975 11 11
    1976 11 10
    1977 9 9
    1978 9 9
    1979 10 9
    1980 9 7
    1981 7 5
    1982 6 5
    1983 6 5
    1984 7 5
    1985 7 3
    1986 5 2
    1987 3 1
    1988 4 2
    1989 5 3
    1990 4 3
    1991 4 3
    1992 3 2
    1993 5 1

  5. Thank you zeke. I guess that's about what I expected. If the balance between NH and SH swung more rapidly over time, I'd expect some minor spurious trends in your simple mean due to the NH warming more strongly. But the hemisphere distribution looks stable enough to avoid that.

    I'm surprised your method works as well as it does. I'm trying to think of an easier alternate way, short of calling up the individual countries and asking them to send in data for the stations they don't send monthly CLIMAT reports for.

    Nick, congratulations on setting up a blog. I look forward to more.

  6. Zeke,
    Thanks for those plots. It's odd how it seems that, as CE says, the "selecting warm stations" meme is not only unverifiable, but often seems the opposite of the truth.

  7. I see the Watts/d'Aleo/Smith publication actually does produce the meaningless graph above, though they get a very different-looking result. They don't number their figures, but it's on page 11. Any idea what they did differently?

    Interestingly, they cite a rather dated paper from Willmott (1991) on possible artifacts in spatial averages. This also works with absolute temperatures, instead of anomalies. I'll have to take a closer look to see what those authors did.

  8. I'm glad to find this blog. I think a blog that is mainly about the Watts Surface Stations project is very useful, especially as Watts and his moderators say there is a lot more coming.

    Now, WUWT has made me all excited before about stuff finally driving that last nail in AGW's coffin home, but up till now every attempt has disappointed me greatly (I of course hope AGW really is a scam and would welcome any conclusive evidence, which the molehills camouflaged as mountains obviously are not).

    Would it be a good idea to give the blog a more prominent name, such as StationAudit or you name it? It'd be good if something is already there when the Pielke/Aleo/Watts paper comes out that can evaluate its strengths and weaknesses. People like me would immediately know where to look (instead of waiting for all the scattered blogs and websites to respond) for a review.

    OTOH, it will be a lot of work. I don't want to be one of those people who asks other people to carry out his ideas. If I had the brains and the expertise for it I would do it myself. :-)

  9. Carrot Eater,
    Thanks, I missed that. I don't know for sure what the cause of the difference is. The source seems to be Ross McKitrick, as quoted, and he gives an Excel spreadsheet, which I'm looking into. He doesn't use v2.mean, but some data scraped by Joe d'Aleo from the NOAA site. It may be adjusted data. I'm not sure if it's available to others. He then weights categories (urban, suburban and rural) in an unexplained way.

  10. Well for the fun of it, you could repeat the above using v2.adj, and see how that looks. I'm not sure what meaning it would have, but perhaps it would have some resemblance to their plot.

    I'm puzzled that they were able to compile page after page of simple means of absolute temperatures, and not stop to wonder why nobody averages together absolute temperatures.

  11. Carrot Eater,
    I've looked at Ross' Excel file, and tried to track down the source. For some reason, the Excel file starts from data averaged over the three categories, which is why he then creates a weighted sum in putting them back together. There's no actual station data in the file - just averages over these three categories. This makes me think that maybe those averages were what he had got from Joe - otherwise I don't see why the roundabout process of recombining to get the all-stations average.

    I tried to track the NOAA reference given, but it starts at this general level: When I go further, looking for data that could be found as described, the only plausible page seems to be, which tells me:
    Global Climate at a Glance
    is undergoing maintenance to improve its function and update its data feed. A timeline for its return will be posted on this page.
    Last Updated Monday, 26-Oct-2009

  12. Carrot Eater,
    I had the data handy, so I did that. The trend for the adjusted data is still down. The homogeneity adjustment does seem to level things a bit, and there is a more prominent uptick at about 1990, but it's temporary. The plot is here.

  13. Thank you Nick. And I apologise for my laziness in not doing these things myself, as this one was especially straightforward.

    Good idea to reject years with too many -9999s.

    It is a little bit weird that they did this analysis in such a convoluted way.

    But I don't know if it's important to really track down how they did it, since it's not a meaningful graph anyway.

    The main question is why they persist in averaging together absolute temperatures, instead of anomalies?

  14. Nick, thanks!

    (1) For doing the analyses
    (2) For putting up the R
    (3) For linking this post to the Air Vent.

    I have no idea, yet, as to whether there's merit on this issue. But I'm optimistic that I'll learn more by reading what you post and what Jeff Id posts.

  15. Do you know what causes the spike at 1990?

  16. Neven,
    Thanks for your comments. On blog names, my choice was really to find something that was short and easily searched. I was surprised how many blog names on blogspot were already taken. But is it really so that people find blogs by searching likely names?

    One of my motivations is indeed to have a resource for responding in some technical detail to stories that arise on WUWT etc. I'd be very glad of any help there.

  17. OK, I had a look. At this time,

    just leads me (eventually) to the main data page of the GHCN,

    It looks like the graph was made in 2000. As of now, I don't think there are separate files for urban/suburban/rural; you have to cross-reference v2.mean against v2.temperature.inv to find that. Maybe things were different back then.

    However it was made, it's a meaningless graph.

  18. Amac,
    Thanks for your comment. Yes, I'm sure the discussion will continue.

  19. jae,
    One could put that as what caused the dip in the early 80's. But yes, it could be a blip associated with the changeover from the historical data to the recurrent reporting. But it isn't maintained.

  20. Carrot Eater,
    Yes, according to the metadata, Ross M's file with that plot was created on 03/23/2002.

  21. Carrot Eater,
    A curiosity here is that there are plots of apparently the same numbers on p 11 and p 14 of the report. But both the temperatures and the numbers of stations are different (and the temperatures different to mine). The first plot has about twice as many stations.

  22. Nice job Nick. That's the second time recently you've been up front with a mistake. It sucks, having made plenty of my own. I'm impressed.

    Jeff Id

  23. Thanks, Jeff. But I hope there won't be too many more. And I do now see the difficulties of trying to program in blog time.

  24. Ah, so you made the wrong plot incorrectly as well. Oh well.

    In any case, I really don't care why your plot is different from p11, or why that is different from p14.

    Basically, if one guy divides 5/0 and gets 45, and another guy divides 5/0 and gets 98, it isn't interesting to me how they got those numbers; it's simply enough to know that they shouldn't have divided by zero in the first place.

    Most of what Watts/d'Aleo/Smith are presenting are just elementary misunderstandings of anomalies, baselines, trends and the history of the GHCN. It's better to focus on those directly.

  25. I'm joining the GHCN party late (in fact I haven't quite arrived). Hope there is still food left when I get there.

  26. Don't worry, I think GHCN will come and go, and your maps will be very useful. For example, stations currently reporting, and, say, those that have reported since 1997.

  27. Here is what you asked for. A map of all stations that have reported since 1997 (although I did it from Jan 1998 onwards) and since 2009, as well as some in-between years:

  28. blob,
    congrats on the blog. i don't have a profile in anything fancy. is there a way i can put a comment there?

  29. TB,
    Thanks, those are good maps. The post-2008 one is especially helpful. It's clear that the big reduction was in the US, which goes from highly oversampled to just OK. But this may be a USHCN v2 issue.

    Placing the dots on a map with country boundaries might be interesting. There are clearly a lot of countries in Africa that don't report at all (as well as Bolivia).

  30. Well, I'll just say my comment here then, for blob.

    I think your map is fantastic. That is value added, beyond most other efforts I've seen, including my own. I agree a map with national borders would be good.

    Bolivia is small enough that I don't care much. It's more the empty spot in (Angola/DR Congo?) that is of concern to me. Though I wouldn't expect much data to come from DRCongo, given the state of the place.

    Could you do a version of the map that gives an idea of stations that give reasonably high frequency of reports? Meaning, can you show a map for stations that have given at least 7 months in each of the last 30 years, or something like that?

  31. I changed the comment rules to allow anonymous comments. I ran the query you suggested and it shows a gap in Africa as you suspected. Canada is showing a gap too. I think a better kind of graphic is needed to visualize changes over time. A map can't really show which months and years the stations are missing data.