Wednesday, July 17, 2019

Comparison between global temperature indices following GHCN V4; changes since 2015

I had noticed that recently the concordance of GISS with the more advanced TempLS methods seemed to have improved, and I wondered whether there might be a general improvement associated with the adoption of GHCN V4, with the big increase in land stations. In 2015, I posted a study of the extent to which a rather large set of indices mutually agreed. It included land, SST and troposphere measures. I may revisit that. But for the moment, I want to look in a similar way at just the surface (land and ocean) indices. Since they seek to measure the same thing, differences can be attributed to method rather than physics.

In that earlier post, my measure was the standard deviation (sd) of differences between monthly index values over the most recent 35 years. That period was chosen to fit with the satellite data, which is not used here. But I will stick with it (updated): while there doesn't seem to be much sensitivity to the choice, I want to concentrate on method differences rather than data differences, which might grow further back in time. Data sources are listed here. The sd measure is not affected by different anomaly bases.
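
As a rough illustration of that measure, here is a minimal Python sketch of the pairwise sd of differences. It is not the code used for this post: the function name `pairwise_sd`, the synthetic series and the choice of the sample sd (`ddof=1`) are all my assumptions. The shifted copy models a different anomaly base in the simplest way, as a constant offset, and shows that the sd does not change.

```python
import numpy as np

def pairwise_sd(series):
    """Return (names, matrix) of sd's of differences between monthly series.

    `series` maps an index name to a 1-D array of monthly anomalies,
    all covering the same months. Missing months (NaN) are ignored.
    """
    names = list(series)
    n = len(names)
    sd = np.zeros((n, n))
    for i, a in enumerate(names):
        for j, b in enumerate(names):
            diff = np.asarray(series[a], float) - np.asarray(series[b], float)
            # Sample standard deviation; ddof=1 is an assumption, not from the post.
            sd[i, j] = np.nanstd(diff, ddof=1)
    return names, sd

# Synthetic stand-ins for two indices over 35 years of months (not real data).
rng = np.random.default_rng(0)
months = 35 * 12
signal = np.linspace(0.0, 0.6, months)
demo = {
    "A": signal + rng.normal(0, 0.03, months),
    "B": signal + rng.normal(0, 0.03, months),
}
# Same series as B on a different anomaly base, modelled as a constant offset.
demo["B_shifted"] = demo["B"] + 0.3

names, sd = pairwise_sd(demo)
print(names)
print(np.round(sd, 4))
```

The A/B and A/B_shifted entries of the matrix come out the same, because subtracting the mean inside the sd removes any constant offset between the two series.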

So here, as an overview, is the current set of standard deviations, according to the color scale in the key on the right. Red means better agreement.

The best agreement is between the various methods of TempLS, as described most recently here, with an overview here. It is so much better that I have used a color out of the rainbow scale to show it. Differences of the three advanced methods have sd's of about 0.01°C. That is about a third of the smallest sd between indices from different sources.

The next best agreement (0.027°C) is between TempLS grid and NOAA Land/Ocean. I commented, also in 2015, on how NOAA and TempLS grid were eerily close; I showed comparison graphs. That closeness has persisted, and is a reason why I keep posting TempLS grid, which I otherwise think is a very primitive method. So the fact that NOAA is so close makes me worry about that index. But anyway, it is now even closer. I have modified the grid method to use a cubed sphere mesh, which I think is much better than lat/lon.

Almost as good is the agreement between GISS and the advanced TempLS methods. As I shall show, this has improved with V4. TempLS LOESS and Infill have lowest sd, at about 0.031; mesh is a little more at 0.035°C.

The five non-TempLS indices are shown in the top corner. Their levels of agreement are much lower. The Cowtan and Way kriging index has an sd of 0.045 with both GISS and BEST, but less agreement with NOAA and HADCRUT. The best agreement (0.039) is between HADCRUT and NOAA; these have always seemed to act as a separate grouping. GISS and BEST agree with each other about as well (0.045) as they do with C&W. BEST has the greatest disagreement, with both NOAA and HADCRUT.

I posted the data back in 2015, so now I'll use it to show how these concordances have changed. In the following plot, the current sd is divided by the sd reported in 2015. A red value indicates reducing sd (improvement). TempLS LOESS is omitted because it did not exist in 2015.
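
The ratio in that plot is an elementwise division of the two sd matrices. A minimal sketch, assuming both matrices list the same indices in the same order (the function name `sd_ratio` and the numbers below are made up for illustration, not taken from the data):

```python
import numpy as np

def sd_ratio(sd_now, sd_then):
    """Elementwise current sd divided by earlier sd.

    Values below 1 mean the sd has reduced (agreement improved).
    Entries where the earlier sd is zero or missing, such as the
    diagonal, are returned as NaN.
    """
    sd_now = np.asarray(sd_now, float)
    sd_then = np.asarray(sd_then, float)
    ratio = np.full(sd_now.shape, np.nan)
    ok = sd_then > 0  # False for zeros and for NaN entries
    ratio[ok] = sd_now[ok] / sd_then[ok]
    return ratio

# Made-up two-index example: the sd of differences halves between the two dates.
sd_then = np.array([[0.0, 0.06], [0.06, 0.0]])
sd_now = np.array([[0.0, 0.03], [0.03, 0.0]])
ratio = sd_ratio(sd_now, sd_then)
print(ratio)
```

An index with no earlier values, like TempLS LOESS, can be given a row and column of NaN in the earlier matrix, so that it drops out of the ratio.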

The biggest changes are associated with TempLS, where methods have improved, particularly with Infill. In 2015 this was a heuristic method, which seemed to give a large improvement. But now I solve a diffusion equation to convergence, which seems to be better again. Its sd with GISS is about halved, and Infill now agrees with GISS, by a hair, better than any other TempLS method does. Because it shifts further towards the other advanced TempLS methods, it moves away from the grid method, and so also from NOAA, which shows as decreasing agreement. The improved agreement (4x) with TempLS mesh is the greatest change of all.

The other marked changes are with BEST. 2015 was still fairly early in its life cycle, and most noticeable is the increasing disagreement with NOAA and HADCRUT. But it also doesn't agree with anything very well.

The other indices, interacting with each other and with TempLS mesh, show little change. T mesh was stable over that period. There is some deterioration of agreement between HADCRUT and GISS, which could be due to the introduction of ERSST 4 and 5, which adjust for the introduction of drifter buoys in SST measurement. HADSST is just bringing out V4 which may implement that.

Here is a more detailed quantification of the changes. There are 9 plots, showing for each index the sd's of the differences with the others (green). Overlaid in transparent blue is the corresponding sd from 2015. For TempLS LOESS, I have used the 2015 sd's of TempLS mesh. Use the arrows below to cycle through the plots.
In the first plot (GISSlo) the TempLS advanced indices (TM, TL, TI) show best agreement, and also improvement (faint blue is 2015). Agreement with HADCRUT is worse. Of the other plots:
  • HA HADCRUT - almost everything is worse, especially BEST. The best agreement is with NOAA and TempLS grid.
  • NO NOAA - not much change, except for lower agreement with BEST. But not a high level of agreement.
  • BE BESTlo - again much increased, and high, disagreement with NOAA/HADCRUT. Otherwise small changes toward more agreement.
  • CW Cowtan and Way - much improved agreement with TempLS; fair agreement unchanged elsewhere.
  • TM TempLS mesh - good and improved agreement with GISS and TempLS grid. Very good agreement with TempLS LOESS and Infill, with Infill much improved (due to Infill method improvement).
  • TL TempLS LOESS - as for mesh. LOESS did not exist in 2015.
  • TI TempLS Infill - very good and improved agreement with TM and TL. Also improved wrt GISS and CW; HA, NO, TG somewhat worse.
  • TG TempLS grid - mostly substantially improved, and not bad, except for BEST and CW. Slightly worse relative to BEST and TI. The good, and further improved, agreement with NOAA has been noted.
Overall, I think it is important to note that even the worst disagreements are not so bad - about 0.075°C. There is a marked tendency to clump, with HADCRUT/NOAA/TempLS grid as one group, and GISS+TempLS(TM, TI, TL) as another, with BEST and CW more loosely attached.

To put the size of these differences in context, they range from 0.01, which I called very good, to about 0.075, which was about the worst. But I did a quick similar analysis between HADCRUT, UAH and RSS. The result is here:

The best agreement there is between the satellite measures, at about 0.1°C. Agreement between surface and satellite is in the range 0.125 to 0.145°C.

I have posted the data for this post in a zipfile, with readme.txt, here.

