Monday, August 30, 2010

TempLS V2 Math basis

The Math basis of TempLS Ver 2

For Version 1 of TempLS I posted a PDF guide, TempLS.pdf, to the mathematical basis. It's a useful starting point here.

But this time I'm going to try to do it in HTML below. Colors help.

For more Version 2 information:

  • Index of posts by category
  • TempLS Version 2 Release
  • Ver 2 - Regional spatial variation.
  • Spatial Temperature distributions in TempLS V2
  • Plotting spatial trends in TempLS

  • Thursday, August 26, 2010

    TempLS Version 2 Release

    Here is the long-promised Version 2. I posted some trailers months ago.
    There are lots of new capabilities, and it's taken a while to get them into a framework suitable for general use. But I think it's there now.

    The headline feature is the new spatial capability. This is a logical extension of the least squares formulation. But the new organisation scheme may also be attractive. Instead of tinkering with the R script, as formerly, you now just create a user file for each job. The file should be fairly brief, giving just the changes to a default user file. I'll list examples.
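The "defaults plus a brief job file" scheme can be pictured with a minimal sketch. TempLS itself is an R script, and this Python fragment is not its code; the setting names (`title`, `start_year`, `end_year`, `min_altitude_m`) are hypothetical and chosen only to show the override pattern.

```python
# Hypothetical default settings; these names are illustrative, not TempLS's own.
DEFAULTS = {
    "title": "Global",
    "start_year": 1900,
    "end_year": 2010,
    "min_altitude_m": None,
}

def load_job(overrides):
    """Return the full settings for a job: defaults with the user's changes applied."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        # Catch misspelt setting names instead of silently ignoring them.
        raise KeyError(f"unknown settings: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}

# A "user file" for one job only needs to list what differs from the defaults.
job = load_job({"title": "Nepal", "start_year": 1979})
print(job)
```

The point of the pattern is that each job stays short and the main script never needs editing.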

    Code, plus a user guide, is in a file on the doc repository. The files are:
    • TempLSv2.r - the main algorithm file
    • TempLSfnsv2.r - auxiliary functions
    • TempLSuserv2.r - the default user file (acts as a template too)
    • Some sample user files with output directories
    • TempLSprepv2.r - the data preprocessor
    • TempLSprepinputv2.r - a user file for the data preprocessor
    • TempLSv2.pdf - a user guide.
    I've put in bold the files that are called by name to run TempLS.

    Tuesday, August 17, 2010

    Saturday, August 14, 2010

    Hottest year? 2010?

    A frivolous topic, to be sure, but it will probably be discussed in coming months. Will 2010 set any kind of hottest year record? It doesn't look particularly likely, but I'm sure people will speculate. So I've produced a tracking plot. It shows the cumulative sum of anomalies for each index, for 2010, and the hottest year of each index to date. For all but GISS that's 1998; for GISS it is 2005.
    Update - thanks to a commenter who noted that for NOAA also, 2005 was the hottest year. The plot is amended, with 1998 lightly dotted for GISS and NOAA.
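The tracking idea can be sketched in a few lines, assuming each index is available as a list of monthly anomalies starting in January. The function names and the numbers are invented for illustration; they are not real index values and not the code behind the plot.

```python
from itertools import accumulate

def cumulative_anomaly(monthly_anomalies):
    """Running sum of monthly anomalies (deg C), January first."""
    return list(accumulate(monthly_anomalies))

def lead_over_record(current, record):
    """Cumulative sum of the current year minus that of the record year,
    for the months reported so far in the current year."""
    n = len(current)
    cur = cumulative_anomaly(current)
    rec = cumulative_anomaly(record[:n])
    return [round(c - r, 3) for c, r in zip(cur, rec)]

# Invented values for illustration only, not real index data.
record_year = [0.4, 0.6, 0.5, 0.5, 0.6, 0.6, 0.6, 0.6, 0.4, 0.4, 0.3, 0.5]
current_year = [0.5, 0.4, 0.6]  # only three months reported so far

print(lead_over_record(current_year, record_year))
```

Because the December value of the cumulative sum is twelve times the annual mean, whichever curve is higher at the end of the year marks the hotter year.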

    Thursday, August 12, 2010

    Warming trends in the Himalaya

    Willis Eschenbach at WUWT once again found something that he couldn't believe. The Himalayas are warming rapidly. There's an IPCC claim that warming in lower areas of Nepal may be about 0.4 C/decade, and in higher areas 0.9 C/decade.

    Willis' post got sidetracked into fussing about GISS adjustments to 20 years of data at Katmandu. But a comment on the thread was right to the point. The IPCC was relying on a paper by Shrestha et al, which had examined records from 49 stations in Nepal to reach that result.

    Zeke Hausfather at the Blackboard looked at GSOD data. This is a more plentiful data set derived from SYNOP forms. It is unadjusted, but does not go as far back in time. He found there was also a strong warming trend. His analysis was based mainly on Katmandu. He also pointed out that the IPCC did not use GISS adjusted data.

    So I thought I could use TempLS and the GSOD database to look at Nepal and even beyond in the Himalaya. TempLS also allows us to look at different altitude ranges, subject to station availability. Willis found only one GHCN station in Nepal - GSOD has 12, but some with short histories. Anyway, here's the analysis. Most of the trends are since 1979; data histories get sparse before then.

    Update: Zeke pointed out a bug. My first diagnosis (below) was wrong. The problem was not in the analysis program but in the GSOD datafile, as modified by my preprocessor. The GSOD database is not as tidy as GHCN, and some data refer to stations not in the inventory. In the process of fixing that, a block of stations (including Nepal) got displaced by 1. This doesn't affect the analysis, but does affect the selection - you don't get the right stations. For countries, the effect is small because they are consecutive, so only one station is wrongly chosen. But for an altitude spec, it's more serious.
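A toy example shows why a displacement by one hurts an altitude selection more than a country selection. The inventory below is entirely invented, and the direction of the shift (data rows one place later than their inventory rows) is an assumption made for the sketch.

```python
# Hypothetical inventory rows: (station_id, country, altitude_m). All values invented.
inventory = [
    ("A1", "India", 200),
    ("N1", "Nepal", 1300),
    ("N2", "Nepal", 90),
    ("N3", "Nepal", 2700),
    ("C1", "China", 50),
    ("C2", "China", 3600),
    ("C3", "China", 40),
]
# One data record per station, nominally in inventory order.
data_ids = [station_id for station_id, _, _ in inventory]

def select(predicate):
    """Inventory row numbers of the stations that satisfy the predicate."""
    return [i for i, row in enumerate(inventory) if predicate(row)]

def fetched(rows, shift):
    """Station ids whose data is actually read when the data block is displaced by `shift`."""
    return [data_ids[i + shift] for i in rows if 0 <= i + shift < len(data_ids)]

def n_correct(rows, shift):
    """How many of the wanted stations are still among those actually read."""
    wanted = {data_ids[i] for i in rows}
    return len(wanted & set(fetched(rows, shift)))

nepal = select(lambda row: row[1] == "Nepal")   # consecutive rows
high = select(lambda row: row[2] > 1000)        # scattered rows

print("country:", n_correct(nepal, 1), "of", len(nepal))
print("altitude:", n_correct(high, 1), "of", len(high))
```

With consecutive rows only the station at one end of the block is swapped, whereas scattered rows can all land on the wrong neighbours.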

    The Nepal trends are little changed, and Himalaya increase a bit. The two Hi sets just have too many gaps for an analysis to work, so I've dropped them.

    Tuesday, August 10, 2010

    Underestimate of variability in McKitrick et al

    A new paper by McKitrick, McIntyre and Herman is being discussed. David Stockwell and Jeff Id have threads, and there are now threads at Climate Audit and at James Annan's blog.

    An earlier much-discussed paper by Santer et al comparing models with tropical tropospheric temp observations contended that there was no significant difference between model outputs and observation. MMH say that this is an artefact of Santer using a 1979-2000 period, and if you look at the data now available, the differences are highly significant.

    In discussion at the Air Vent, I've been contending that MMH underestimate variability in their significance test. They take account of the internal variability of both models and observations, so that each model and obs set has associated noise. But they do not allow for variance between models. I said that this restricts their conclusion to the particular set of model runs that they examined, and this extra variability would have to be taken into account to make statements about models in general.

    However, it's clear to me now that this problem extends even to the analysis of the sample that they looked at. They list, in Table 1, the data series and their trends with standard error. The first 23 are models. In Figs 2 and 3 they show the model mean with error bars. I looked at the mid-trop (MT) set; in Table 2 they give the mean as 0.253, sd 0.012, and indeed, with error bars 0.024 that is what Fig 3 seems to show.

    So I plotted Table 1 as a histogram. Here's how it showed, with the mean and error bars from Table 2 MMH marked in red.

    The key thing to note is what James Annan also noted. The models are far more scattered than the supposed distribution indicates. The models themselves are significantly different from the model mean.

    Update: Of course, the error bars are for the mean, not the distribution. But the bars seem very tight. A simple se of the mean of the trends would be about 0.022. And that does not allow for the uncertainty of the trends themselves.

    Update 2: As Deep Climate points out below, that last update figure is wrong. A corrected figure is fairly close to what is in MMH's table.
    However, Steve McIntyre says that the right figure to use is the within-groups variance - some average of the se's of the trends of each model. That does seem to be the basis for their figure. I think both should be used, which would increase the bound by a factor of about sqrt(2).
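The sqrt(2) remark can be made concrete with a small sketch. It assumes the two sources of uncertainty are independent, so their variances add, and it uses simple textbook formulas for each; this is not the panel regression method of MMH, and the function names are hypothetical.

```python
import math
import statistics

def se_between(trends):
    """Standard error of the mean taken from the scatter of the trends themselves."""
    return statistics.stdev(trends) / math.sqrt(len(trends))

def se_within(std_errors):
    """Standard error of the mean propagated from each trend's own standard error,
    assuming the individual errors are independent."""
    n = len(std_errors)
    return math.sqrt(sum(s * s for s in std_errors)) / n

def se_combined(trends, std_errors):
    """Both sources together: independent variances add, so the errors add in quadrature."""
    return math.hypot(se_between(trends), se_within(std_errors))
```

If the two contributions happen to be about equal, say each is s, the combined value is sqrt(s² + s²) = s·sqrt(2), which is where the factor comes from.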

    Wednesday, August 4, 2010

    Reduction of station numbers in GHCN

    As I noted in the previous post, a new post has appeared at WUWT which talks a lot about the reduction in station numbers in GHCN that occurred between about 1990 and the present. This post is based on a paper by Ross McKitrick.

    The stream of articles that advance various theories about this reduction don't take proper account of the way GHCN was actually compiled. It was initially a historical process, where in the early 90's with grant funding people gathered together batches of historic records, recently digitised, into a database. After V2 came out, in 1997, at some stage NOAA undertook the task of continuing monthly updates from CLIMAT forms. This made it, for the first time, a recurrent process.

    Update: Carrot Eater, in comments, has pointed to a very useful and relevant paper by Peterson, Daan and Phil Jones. As he says, the process wasn't quite as I've surmised. I should also have included a reference to Peterson's overview paper.

    Tuesday, August 3, 2010

    GE visualisation of changes to GHCN stations 1990-2007

    At WUWT there is a post about Ross McKitrick's discussion of supposed defects in GHCN, focussing heavily on changes in the stations in the dataset between about 1990 and 2005+. So I've made some KMZ files so interested people can see in detail what those changes were.

    This post follows three recent posts here about KMZ files for GHCN type datasets.
    Briefly, a KMZ file is a compressed file of data which you can read into Google Earth. You can just click on the filename in a file browser, or use the GE open facility (or Ctrl-O). When you open it, you will see a subset of the GHCN stations marked with placers (pushpins). These show (when you get close) the station names, and indicate other properties thus:
    • Color - rural stations are green, urban yellow. Orange is a small town.
    • Size - full-size pins have >50 yrs of data, 70%-size pins have >20 yrs, and 40%-size pins have less.
    • Balloon - clicking on a station gives a balloon with several data items, including years of reporting.
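The color and size rules above amount to a small lookup, sketched here. The one-letter codes (`R`, `S`, `U` for rural, small town, urban) and the reading of "70%" and "40%" as icon scale factors are assumptions for illustration, and `pin_style` is a hypothetical name, not a function from the code that built the files.

```python
# Assumed one-letter settlement codes: R = rural, S = small town, U = urban.
PIN_COLORS = {"R": "green", "S": "orange", "U": "yellow"}

def pin_style(urban_code, years_of_data):
    """Return (color, scale) for a station pushpin following the rules in the list."""
    if years_of_data > 50:
        scale = 1.0   # full-size pin
    elif years_of_data > 20:
        scale = 0.7   # 70% pin
    else:
        scale = 0.4   # 40% pin
    return PIN_COLORS[urban_code], scale

print(pin_style("R", 63))
print(pin_style("U", 12))
```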

    The files

    You can find the files on the data repository. They are in a zip file which you can download (scroll down). The individual files are:
    • GHCN1900end.kmz, which has stations that dropped out of the database between 1991 and 2000
    • GHCN2000end.kmz, which has stations that dropped out of the database between 2001 and 2007
    • GHCN1900st.kmz, which has stations that were added between 1991 and 2000
    • GHCN2000st.kmz, which has stations that were added between 2001 and 2007
    There weren't many stations added, so you might want to skip the last two files.
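As a rough sketch of how stations could be sorted into these four files, suppose each station has a first and a last year of reporting. Treating "dropped out" as the last reporting year and "added" as the first reporting year is an assumption, and `kmz_files_for` is a hypothetical helper; only the file names come from the list above.

```python
def kmz_files_for(first_year, last_year):
    """Names of the KMZ files a station would appear in, given its reporting span.

    Assumes 'dropped out' means last_year falls in the period and
    'added' means first_year falls in the period.
    """
    files = []
    if 1991 <= last_year <= 2000:
        files.append("GHCN1900end.kmz")
    if 2001 <= last_year <= 2007:
        files.append("GHCN2000end.kmz")
    if 1991 <= first_year <= 2000:
        files.append("GHCN1900st.kmz")
    if 2001 <= first_year <= 2007:
        files.append("GHCN2000st.kmz")
    return files

print(kmz_files_for(1951, 1993))
print(kmz_files_for(1995, 2004))
```

Under this reading a station that started late and ended early can appear in both an "st" file and an "end" file.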

    Below the jump, I'll add some still pictures from GE.