Sunday, December 11, 2016

Storing data for the winter.

I've been squirreling data away, and posting here, even before the current threats to US science. But a post by Tamino prompted me to get a wriggle on. In particular, I decided to systematically post the files of monthly data that I use for graphs and tables on the latest temperature page. They are tables looking like this, but going back to 1850. My idea is to update weekly and store monthly versions, so one can refer back. I have started doing that; there is a CSV file and an R data format file, with a readme, zipped. The URL for this month is
https://s3-us-west-1.amazonaws.com/www.moyhu.org/data/month/indices-Dec-2016.zip
and it will change in the obvious way for future months. I'll probably put a table of links on the portals page, once a few have accumulated.
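
To automate fetching these, here is a minimal R sketch. It assumes the URL pattern above (indices-<Mon>-<Year>.zip) and that the zip contains a single CSV; the exact member names inside the zip are my assumption:

get_month_indices <- function(mon = "Dec", year = 2016) {
  # build the URL from the pattern above
  url <- sprintf(
    "https://s3-us-west-1.amazonaws.com/www.moyhu.org/data/month/indices-%s-%s.zip",
    mon, year)
  zipfile <- tempfile(fileext = ".zip")
  download.file(url, zipfile, mode = "wb")            # fetch the zip
  files <- unzip(zipfile, exdir = tempdir())          # extract; returns the file paths
  csvfile <- grep("\\.csv$", files, value = TRUE)[1]  # pick out the CSV member
  read.csv(csvfile, stringsAsFactors = FALSE)
}
# e.g. indices <- get_month_indices("Dec", 2016)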

I'll look for other opportunities to back up. I'm actually more dependent on current data like GHCN, ERSST and AVHRR, and on the reanalyses. I'm more optimistic there, because that data is collected for weather forecasting, which has powerful clients. So while access may be bumpy, I don't think it will be lost.

10 comments:

  1. GHCN shouldn't be a problem - we can always switch to v4, which comes from ISTI. ISTI is largely unfunded, but if the US were to shut down GHCN they would have a very strong case for EU funding under the H2020 programme or similar.

    The GHCN homogenization code is fairly easy to run - I've done it.

    The GISTEMP code is runnable, and the Clear Climate Code version is easy. But if it were under threat I expect U Columbia would pick it up.

    ERSST is the most likely problem - I haven't seen the code and don't know how much CPU it needs. Also, ICOADS is a dependency. The GTS streams, which are not public for naval piracy reasons, will probably be unaffected, but will still need to be stripped to produce the ICOADS summaries. WMO could presumably pick this up easily, or a national meteorological organization.

    If ERSST were unmaintainable, GISTEMP could switch to COBE-SST2 once it becomes operational. That would keep GISTEMP independent of Hadley. In extremis I could run GISTEMP, but it won't come to that.

    Berkeley will continue, and is probably developing faster than the traditional records.

    So at this point I'm not unduly concerned that we will lose GMST records.

    1. Thanks, Kevin
      You've encouraged me to look at the homogenisation code.

    2. I'd suggest grabbing my code from here as a starting point:
      http://www-users.york.ac.uk/~kdc3/papers/homogenization2015/

      When I say 'fairly easy': it was easy to build, and I eventually ran it without any help. However, the docs were uninformative on configuring the inputs - I had to read the comments in the scripts, which were much more useful.

  2. A slightly off-beam thought, but … there was a lot of reporting of Harper's government in Canada doing irreparable damage to climate-related science records (science notebooks etc.) under the 'cover' of rationalising libraries. Has the current administration initiated a review into this? Has the damage been assessed and reported on? Will anyone be brought to account? I think that those who commit acts of vandalism against scientific institutions' records, systems and human resources ought to expect to be brought to account in due course.

    1. I'm not really fearing a concerted effort to destroy records; I don't think it is really feasible now anyway. I'm thinking more of sand in the works of publication. Funding cuts, sackings etc. As Kevin says elsewhere, other organisations can probably pick up the task, in time. But there would be gaps.

    2. … and some unique assets, like GRACE. But the partnerships often involved with foreign institutions limit the ability to wipe out core research. More likely is that, as under Bush, there will be censorship of communications.

  3. I think it's a good idea to save and compare data over time regardless of what political persuasion dominates the system. One of the things I don't like about the GHCN-based homogenization approach to viewing climate and climate history is the constantly changing history. Every single month it changes, albeit usually by very small amounts. But changes occur throughout the past estimates, so the past is always a moving target. In my view, this is absurd. The past does not change and should not be changing every month. I don't have a problem with making major updates once in a while that are based on a new approach to modeling the past estimates, provided the new approach is publicly well documented and makes good scientific sense. We should still have data and output from the older approaches/models to compare. I view these comparisons as a way to help at least partially assess some of the uncertainties involved in global surface temperature modeling efforts.

    1. Bryan,
      I had been intending to do the monthly storing anyway. I had been putting the current file online, so it's really just a matter of organising names.

      I think it isn't sensible to regard GISS etc as custodians of history. They aren't the holders of the original data; that is really met offices, with GHCN doing some library work. They (like Moyhu) are just trying to provide an ongoing estimate of a derived quantity - the global average - and their estimate should vary as they think of new ways of doing it, or just gradually acquire new bits of information. The TOBS adjustment, for example, happened gradually over years, because finding the records on obs timing took time. They are probably still finding some. There isn't huge funding for this work.

      Another example is the much complained of adjustment to SST (Karl) which was done as data accumulated on calibrating ship to buoy readings in the field. The calibration required accumulation of instances where ships and buoys were measuring in the same area. That took years, and may even now change with further data.

      It doesn't seem to be an issue in other areas. I don't see people complaining that an estimate of unemployment in 1989 was different in 2010 from what it is now.

    2. This comment has been removed by the author.

    3. Nick, thanks for your time. I always appreciate your insight. One of the things I like about the reanalysis approach is that the resulting historical estimates are very stable and don't change every month, unlike the GHCN-based homogenized estimates. However, there have been some major updates similar to what I mentioned above as acceptable, for example in January 2002 for the ERA40-to-ERAI transition and in April 2011 for CFSR-to-CFSV2. I'm still trying to learn more about these step changes to make better sense of how to interpret/compare the historical estimates and trends over the entire range. The Copernicus web site states: "Values over sea prior to 2002 are further adjusted by subtracting 0.1C. This accounts for a change in bias that arose from changing the source of sea-surface temperature analysis."

      I calculated that the ERAI adjustments raised the 1979-2015 global surface temperature anomaly trend from 1.20C/100 years to 1.68C/100 years, based on 1979-2015 ERA data that appears to be unadjusted, as provided by UM CCI, versus adjusted data for the same period provided by Copernicus. I have not yet found an in-depth explanation of this "change in bias". When I look at the adjusted-minus-unadjusted differences, I see an abrupt upward jump of about 0.05C at 2002, with smaller changes across the entire period such that the adjustments for 2015 are about 0.1C higher than those for 1979. Could this SST analysis change and subsequent bias adjustment simply be a confirmation-bias adjustment, selected because it produces a faster global temperature rise more in line with what they expect?
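
      (For anyone wanting to reproduce this kind of comparison, here is a minimal R sketch; "unadj" and "adj" are just placeholder names for monthly 1979-2015 anomaly vectors from the two sources, not actual variable names from either provider.)

      trend_per_century <- function(anom, start = 1979) {
        # anom: monthly anomalies; build a decimal-year time axis of matching length
        t <- start + (seq_along(anom) - 1) / 12
        unname(coef(lm(anom ~ t))[2]) * 100   # OLS slope, scaled to C per 100 years
      }
      # trend_per_century(unadj)        # ~1.2 C/100 years in my comparison
      # trend_per_century(adj)          # ~1.7 C/100 years
      # plot(adj - unadj, type = "l")   # shows the ~0.05C jump at 2002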

      I suspect the CFSR-to-CFSV2 changes may have similar issues, but I have not fully investigated. There is a sudden drop in the NCEP CFSR estimates in early 2010 relative to both the NCAR CFSR and ERAI estimates, but the change to CFSV2 did not occur until April 2011, after which the offset continues but does not change appreciably. There is no sudden offset beginning April 2011 as might be expected. Very puzzling. I downloaded NCEP/NCAR CFSR monthly and annual global surface temperature estimates for 1948-2015, provided by UM CCI, that appear to use a consistent methodology throughout the period, but at lower resolution than CFSV2. I compared the monthly estimates to monthly averages of the NCAR CFSR daily estimates you provide here, and they are very similar but slightly different (not sure why - maybe in how the global estimates are determined from the gridded data). When I normalize the ERAI, CFSV2, and NCAR CFSR output to a reference period of 2011-2015, they all match very well since 2011, but diverge going back prior to 2011. It's like the precision is good recently, but the calibration drifts going back in time.
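
      (Again only a sketch: by "normalize" I just mean subtracting each series' own 2011-2015 mean. "anom" and "ymon" are placeholder names for a monthly anomaly vector and its decimal-year times.)

      rebase <- function(anom, ymon, ref = c(2011, 2016)) {
        # subtract the series' own mean over the reference period (2011 through 2015)
        anom - mean(anom[ymon >= ref[1] & ymon < ref[2]], na.rm = TRUE)
      }
      # e.g. erai_n <- rebase(erai, ymon); cfsr_n <- rebase(cfsr, ymon)
      # After rebasing, the divergence shows up in the differences before 2011.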

      From what I recall reading, the Karl SST adjustments involved adjusting buoy measurements to match ship measurements. But that seems backwards to me. It makes more sense to me to adjust the ship measurements to match the buoy measurements, since the buoy measurements are generally better quality. I also don't understand why the GFS apparently uses OISST2, which includes satellite sea surface temperature measurements, while the CFSV2 reanalysis apparently uses ERSST4, which excludes satellite estimates (from what I recall). I say "apparently" based on what UM CCI shows for the current SST used with the GFS and the archive SST, which I assume is used with the CFSR, so I am not certain about it. But if this is true, it seems like a poor choice to leave out the additional coverage of satellite SST estimates for the final reanalysis. The whole purpose of the reanalysis should be to incorporate as much valid data as possible for input into the atmosphere/ocean/land reanalysis weather/climate model.
