Sunday, March 6, 2011

Voronoi weighting of temperature stations

In the last post we had a Voronoi weighting, but it was restricted to convex shapes, and it had fine-scale weighting variations that seemed arbitrary and likely to increase variance. Both problems have now been fixed, and I think area weighting can go ahead.

The revised code is here, and is more structured.


The problem with shape is that a routine for meshing a set of points has no indication of the boundary to follow, and so by default follows a convex path. That has an upside too, because it is easy to tell whether a point is inside a convex boundary, but not a non-convex one.

The code now allows the user to take convex bites out of that convex shape. That is just what we need for the Weddell Sea, for example. Here is what the tessellation looks like with that done. The axes show distance in km. The Voronoi polygons used for weighting are shown darker, with the lighter lines showing the underlying mesh. The chunk taken out for the Weddell Sea cuts the weighting polygons - the lines still remaining are for the mesh, which does not affect the weighting.
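To make the "convex shape minus convex bites" idea concrete, here is a minimal sketch in Python (not the code linked above). It assumes each polygon is given as a list of (x, y) vertices in km, in counter-clockwise order. The function names and the sample coordinates are hypothetical, chosen only for illustration.

```python
def inside_convex(point, polygon):
    """True if point lies inside (or on the edge of) a convex polygon.

    Assumes the vertices are listed counter-clockwise, so an interior
    point is on the left of every edge (non-negative cross product).
    """
    px, py = point
    n = len(polygon)
    for k in range(n):
        x0, y0 = polygon[k]
        x1, y1 = polygon[(k + 1) % n]
        if (x1 - x0) * (py - y0) - (y1 - y0) * (px - x0) < 0:
            return False
    return True


def inside_region(point, hull, bites):
    """True if point is in the convex hull but in none of the convex bites."""
    if not inside_convex(point, hull):
        return False
    return not any(inside_convex(point, bite) for bite in bites)


# Hypothetical example: a 100 km square with one rectangular bite
# cut into its top edge.
hull = [(0, 0), (100, 0), (100, 100), (0, 100)]
bite = [(40, 60), (60, 60), (60, 110), (40, 110)]

in_bite = inside_region((50, 80), hull, [bite])
in_body = inside_region((50, 30), hull, [bite])
outside = inside_region((150, 50), hull, [bite])
```

Each test is just a loop over edges, which is why keeping both the outer shape and the bites convex is convenient.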


You can think of area weighting as a space integration formula. You have something measured at a set of points. To get an average over the whole region, you approximate its integral, and divide by the area. A simple integration process is to divide into small areas, multiply each area by the estimated value, and add. The estimated value is that provided by the station which we have ensured will be in each area.
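As a minimal sketch of that integration formula: the regional average is approximated by the sum of area times station value, divided by the total area. The function name and the numbers below are made up for illustration; in practice the areas would come from the Voronoi polygons.

```python
def area_weighted_mean(values, areas):
    """Approximate a regional average as sum(area_i * value_i) / sum(area_i)."""
    if len(values) != len(areas):
        raise ValueError("values and areas must have the same length")
    total_area = sum(areas)
    if total_area <= 0:
        raise ValueError("total area must be positive")
    return sum(a * v for a, v in zip(areas, values)) / total_area


# Hypothetical example: three stations, one of which represents a much
# larger polygon than the other two.
temps = [-20.0, -10.0, -30.0]      # station values
areas = [1000.0, 3000.0, 1000.0]   # polygon areas in km^2
mean = area_weighted_mean(temps, areas)
```

The station with the large polygon dominates, which is the intended effect: it stands for more of the region.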

So we have got that part, but now we're losing information from a lot of stations because they have been associated with a small area. Most of that is inevitable, because of the general disparity in distribution. But on a small scale, where we expect stations to be correlated, it makes sense to use neighboring stations to get the estimate of the area values. To put this another way, it makes sense to smooth the weight function locally.

I did that by adding a diffusion process. For about 5 steps I applied the rule that stations connected directly in the mesh would exchange part of their weights to even up. The amount exchanged depended on the separation through a Gaussian function, with a scale factor of 100 km. This gave a diffusion scale of up to about 200 km after 5 steps.
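A diffusion step of this kind might look like the following sketch. This is not the code linked above: the exact Gaussian form, the `rate` constant and the function name are assumptions made for illustration. Only the 5 steps and the 100 km scale come from the description.

```python
import math


def smooth_weights(weights, edges, positions, scale_km=100.0, steps=5, rate=0.25):
    """Diffuse weights between mesh neighbours, conserving the total weight.

    edges is a list of (i, j) index pairs for stations joined in the mesh;
    positions holds (x, y) coordinates in km. The exchange coefficient
    rate * exp(-(d / scale_km)**2) is an assumed form, not the author's.
    rate must be small enough for stability (roughly rate * degree < 0.5).
    """
    w = list(weights)
    for _ in range(steps):
        delta = [0.0] * len(w)
        for i, j in edges:
            d = math.dist(positions[i], positions[j])
            k = rate * math.exp(-((d / scale_km) ** 2))
            flow = k * (w[i] - w[j])  # weight moves from the larger to the smaller
            delta[i] -= flow
            delta[j] += flow
        w = [wi + di for wi, di in zip(w, delta)]
    return w


# Hypothetical example: two stations 10 km apart and a third nearly 1000 km away.
positions = [(0.0, 0.0), (10.0, 0.0), (1000.0, 0.0)]
edges = [(0, 1), (1, 2)]
weights = [1.0, 3.0, 10.0]
smoothed = smooth_weights(weights, edges, positions)
```

In this example the two close stations end up with almost equal weights, while the distant one is essentially unchanged, and the total weight is preserved.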

Before smoothing.

After smoothing.

There isn't much difference visible, because the large scale is left untouched. But looking at a region of West Antarctica where the weighting had been unnecessarily uneven:

Before smoothing.

After smoothing.
Now it's clear that on the local scale, smoothing is almost total.

The way I think of it is that the initial weighting gives us a better estimate of expected value. The smoothing should not change that, but reduces the variance of that estimate.

The reduction is not large, though, because the stations affected still have a low weighting, so even with uneven weights their contribution to variance was low. In other words, the region was just oversampled, and the weighting makes allowance for this.
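A rough way to see why the gain is small: if station errors were independent with equal variance (an assumption made only for this sketch, and not quite true for nearby stations), the variance of the weighted mean would be proportional to the sum of squared normalized weights. The function name and the weights below are hypothetical.

```python
def variance_factor(weights):
    """Sum of squared normalized weights.

    Under the assumed model of independent, equal-variance station errors,
    the variance of the weighted mean is this factor times the
    single-station variance.
    """
    total = sum(weights)
    return sum((w / total) ** 2 for w in weights)


# Hypothetical example: two low-weight stations in an oversampled patch,
# plus one station carrying most of the area.
before = variance_factor([0.02, 0.08, 0.90])  # uneven local weights
after = variance_factor([0.05, 0.05, 0.90])   # evened up locally, same total
```

Evening up the two small weights lowers the factor, but only slightly, because the large-area station dominates the sum either way.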

Incidentally, I should say that in all of this I'm plotting all stations that have reported at some stage (with some screening applied by OLMC10 for short records). In any month there will be fewer.


I've been wanting to develop a Voronoi capability for the general problem of sparsely sampled regions of the globe, particularly the Arctic. Here the main issue is that the Arctic Ocean has no land stations nor SST measurements. Here is the Voronoi mesh for land stations that have reported in 2010/11:

It is clearly not ideal - we just don't have the stations. But I think it is better than lat/lon gridding, which leaves empty grid cells and so leaves the region underrepresented.

Next steps

The first step is application to Antarctica with TempLS. It will be interesting to see if it brings down the trends. Then the world - I've been able to get a mesh, so the tessellation should follow.

1 comment:

  1. Cool stuff. I hope you publish this someday.