## Saturday, May 8, 2010

### Scale of spatial correlation

I've been trying to follow up on the discussion here of whether the Hansen/Lebedeff claim that temperatures are correlated (over the long term) out to 1200 km is reasonable. I mentioned kriging as one line to follow.

Kriging is a method for interpolating from a rather random distribution of spatial information. The original application was to mining and borehole information.

You have a spread of readings and would like to know something about the mineral field in between. You'd probably like to know where to drill next. You want a weighted formula which takes account of the fact that the further away the readings are, the more likely they are to be influenced by just random noise. You want to balance the desire for a lot of readings with the need to value closer information more highly.

In R there's a kriging routine in the package geoR. But it seems to be oriented to the distribution of a single variable, whereas we typically have a time series at each point. In mineral exploration, a reasonable analogy is a core profile, and there must be stuff for that.

Or I could try just using trend as the single variable. A problem there is that we don't have a uniform trend period.

Kriging depends on estimating spatial correlation, and this is often done by fitting the parameters of a variogram. This rather comes back to the weighted average idea climate people use. For example, GISS uses a linear taper weight function, held at zero beyond the bounding radius, which could be regarded as the parameter.
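To make the linear taper concrete, here is a minimal sketch of a conical weight and the weighted average built from it. The function names and the 1200 km default are my own choices for illustration, not GISS code.

```python
def conical_weight(distance_km, radius_km=1200.0):
    """Linear taper: 1 at the target point, falling to 0 at radius_km and beyond."""
    if radius_km <= 0:
        raise ValueError("radius_km must be positive")
    return max(0.0, 1.0 - distance_km / radius_km)


def weighted_mean(values, distances_km, radius_km=1200.0):
    """Average of station values, weighted by the conical taper.

    Returns None when no station lies inside the radius.
    """
    weights = [conical_weight(d, radius_km) for d in distances_km]
    total = sum(weights)
    if total == 0:
        return None
    return sum(w * v for w, v in zip(weights, values)) / total
```

The radius is the only free parameter here, which is why it can be thought of as the thing being fitted.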

There are a number of commonly used functions for variogram fitting, of a generally Gaussian shape. This conical function, with its discontinuous derivative, is not one of them, and with good reason. The fitting algorithms usually involve minimising with derivatives.
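A small sketch of the difference, assuming one common parameterisation of the Gaussian variogram (nugget, partial sill and range; conventions vary between packages). It compares one-sided finite-difference slopes: the Gaussian model gives matching slopes everywhere, while the cone has a kink at the bounding radius.

```python
import math


def gaussian_variogram(h, nugget=0.0, psill=1.0, range_km=500.0):
    """Gaussian semivariogram, one common form: nugget + psill*(1 - exp(-(h/a)^2))."""
    return nugget + psill * (1.0 - math.exp(-((h / range_km) ** 2)))


def conical_weight(h, radius_km=1200.0):
    """Linear taper, zero at and beyond radius_km."""
    return max(0.0, 1.0 - h / radius_km)


def one_sided_slopes(f, x, step=1e-3):
    """Finite-difference slopes just to the left and just to the right of x."""
    left = (f(x) - f(x - step)) / step
    right = (f(x + step) - f(x)) / step
    return left, right


# Smooth model: left and right slopes agree closely.
print(one_sided_slopes(gaussian_variogram, 500.0))
# Cone at its bounding radius: slope jumps from -1/radius to 0.
print(one_sided_slopes(conical_weight, 1200.0))
```

A derivative-based minimiser that treats the radius as a parameter would see that jump whenever a station sits at the edge of the cone.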

Anyway, I thought I'd at least do some exploratory analysis varying the radius. Details below the jump.

I began by looking at Kansas City - surrounded by lots of land stations. I soon found that Leavenworth, nearby, had a better record, so I used that.

I looked at distances of 1200, 800, 500 and 200km. Using the GISS conical weighting, I tried to see how the interpolated temperatures matched each other, and the measured temperature.
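For anyone wanting to repeat this, the calculation can be sketched roughly as follows. This is not the script I ran: the haversine distance, the station tuples and the target coordinates (only approximately Leavenworth) are assumptions made for illustration, and the anomaly values are made up.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km between two lat/lon points (degrees)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def interpolate(target, stations, radius_km):
    """Conically weighted mean at target = (lat, lon).

    stations is a list of (lat, lon, value) tuples.
    Returns (estimate or None, number of stations used).
    """
    num = den = 0.0
    used = 0
    for lat, lon, value in stations:
        d = great_circle_km(target[0], target[1], lat, lon)
        w = max(0.0, 1.0 - d / radius_km)
        if w > 0:
            num += w * value
            den += w
            used += 1
    return (num / den if den > 0 else None), used


# Hypothetical data: (lat, lon, anomaly) for one year.
stations = [(39.3, -94.9, 1.0), (41.0, -96.0, 1.4), (39.3, -84.9, 3.0)]
target = (39.3, -94.9)  # approximate, for illustration only
for radius in (1200, 800, 500, 200):
    print(radius, interpolate(target, stations, radius))
```

Repeating this year by year gives one interpolated series per radius, which can then be compared with each other and with the measured series.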

Firstly some maps, to show how many stations are involved, at the various radii. They give an optimistic view, because numbers drop off in recent years, and especially since 2005. So I've stopped the graphs at 2005.
[Maps of the stations within each radius: 1200 km, 800 km, 500 km, 200 km]

Now here are the plots, on a single graph. You'll see that they do converge, and the 1200 km curve is noticeably out of line with the others, but not extremely so. What is out of line is the Leavenworth measurement (in black).

What that suggests to me is that the general idea of homogeneity adjustment has some merit. It seems clear that the central measure tracks for some periods, jumps, and then tracks some more, which is just what the homogeneity test is designed to fix.
And here are the trends:

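| Trend | Leavenworth | 1200km | 800km | 500km | 200km |
|---|---|---|---|---|---|
| Number | 1 | 968 | 480 | 219 | 35 |
| 1901-2005 | -0.02 | 0.013 | -0.01 | -0.009 | -0.016 |
| Trend_se | 0.03 | 0.02 | 0.02 | 0.02 | 0.03 |
| 1979-2005 | 0.72 | 0.24 | 0.26 | 0.19 | 0.09 |
| Trend_se | 0.22 | 0.13 | 0.16 | 0.16 | 0.19 |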
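A trend and its standard error of this kind can be computed with ordinary least squares. The sketch below assumes plain OLS with no allowance for autocorrelation, which may differ from what produced the numbers above; the function name is hypothetical.

```python
import math


def ols_trend(years, values):
    """Least-squares slope of values against years, with its standard error.

    Assumes independent residuals (no autocorrelation correction).
    Returns (slope, slope_se) in units of value per year.
    """
    n = len(years)
    if n < 3 or n != len(values):
        raise ValueError("need at least 3 paired points")
    xbar = sum(years) / n
    ybar = sum(values) / n
    sxx = sum((x - xbar) ** 2 for x in years)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    # Residual sum of squares, with n - 2 degrees of freedom.
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(years, values))
    slope_se = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope_se
```

With a short window like 1979-2005 the standard error is large relative to the slope, so differences between the radii need to be read with that in mind.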