I've been spending far too much time lately on triangular meshes and Voronoi tessellation. The idea for combining temperature measurements is to weight stations according to the area they could be said to "represent". This is important in Antarctica, for example, where the distribution is very uneven, and the Peninsula is likely to dominate just because of the number of stations.
For the global indices, this is handled indirectly through gridding. A cell average is calculated, and cells combined, usually weighted by their easily calculated area.. There's a problem, usually unresolved, with empty cells. And this is one reason why I'm attracted to Voronoi tessellation. Every cell has a reading, provided you are prepared to create a new tessellation every month.
I experimented with a simpler variable cell scheme last year. But there were still empty cells.
A Voronoi tessellation just divides the space up according to the nearest nodes. The patch around node A consists of points nearer to A than any other node. There are still some practical problems in getting it into TempLS, because the tessellation would have to be done for every month. So it has to be reasonably efficient, but more importantly, done without any intervention to fix things. In this post I'll describe my experience so far.
Voronoi tessellation follows from triangular gridding. If you can form a triangular mesh connecting the nodes, then the bisector of each line is likely to be a cell boundary. It's not certain, because one of the nodes in the adjoining cells could be closer.
I found mesh and Voronoi routines in the R packages geometry, alphahull and tripack. The latter seemed the most used, and is called by alphahull. The key routine is called delvor, and does both. However, I found it advisable to run the mesher tri.mesh() first, because it had a flag for removing duplicate nodes.
Here is a Delaunay mesh of Antarctic stations. I've marked a quadrilateral to explain what Delaunay means. It is usually expressed as a requirement that the circumcircles of each triangle include no other nodes. I find that hard to visualize - a simpler version is that in each quad like that marked, the angles opposite the central line add up to less than 180. That gives the simple algorithm for converting any mesh to Delaunay. Since the sum of angles in the quad is 360, if the angles added up to more than 180, you could flip that central line to cross the other way, and the sum would then be less than 180. Of course this could disturb the property for some other quad, so the process can go on for a while. There are more efficient algorithms, but for small meshes like this, flipping is OK. In fact, you can make it quite efficient by doing an occasional sort.
The mesh includes quite a lot of ocean. That's because tri.mesh, and everything in R that I found, insists on meshing the convex hull of the region. The convex hull is what you would get by running a taut string around the points. I did my own experimenting to try to generate a mesh where I can chip away to make the mesh better follow the land. I'll probably be able to do this with the tri.mesh structure. But there's the problem of doing it mechanically so it can be done every month as stations reporting change.
As I said above, the Voronoi tessellation is formed by drawing the perpendicular bisectors of the lines, and surrounding each node by the inner envelope. The Delaunay property means that you can be fairly sure that this means just connecting up the circumcentres. But there is a remaining problem, as you'll see from this result:
Although the circumcircles can't include other nodes, near the boundary, this isn't much of a restriction, and the Voronoi lines go a long way. I want to use the areas for weights, so something has to be done. Oddly, I couldn't find anything in R that confines the cells to the meshed region. So I added my own code to cut off at the boundary, which gave this:
So here, finally, is a map showing the weightings produced (by area of circle). A problem is that it is rather uneven. On the peninsula, say, just one or two stations get connected to a large area. I think something will need to be done to fix this by averaging - not to upweight the region, but to avoid suppressing some stations completely.
This effect is somewhat exaggerated because I've included all stations that have ever reported; in any given month it will be sparser.
I have posted the code here (Antvor.zip) on the doc depository. Antvor.r is the executable; idx.r is the set of station data which comes from Ryan's code.