It's much quicker, because most of the calculation can be done on whole vectors of nodes. I found that the Voronoi meshing for a 9800-node global calc took nearly a minute per month, compared with about 2 min for a complete lat/lon grid weighting calc. That slowness isn't prohibitive, but the modified method brings it back to a few seconds per month.
So as a test I decided to revisit the "just 60 stations" calculation. Looking back, I was actually surprised that that calc worked as well as it did, because the stations were not evenly distributed, and the weighting did not effectively discriminate by area. The usual grid-based weighting was used, but because the sparse stations were generally one to a cell, the only differentiation was based on grid cell area, which in that case was just a function of latitude.
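To make that last point concrete, here is a minimal sketch of the area of a lat/lon grid cell as a fraction of the sphere. The function name and the 5-degree cell size are assumptions for illustration only; the formula itself is the standard one for a cell bounded by two parallels and two meridians.

```python
import math

def cell_area_fraction(lat_south, lat_north, dlon_deg):
    """Fraction of the sphere's surface covered by one lat/lon cell."""
    # Area on the unit sphere is dlon * (sin(lat_north) - sin(lat_south));
    # the whole sphere has area 4*pi.
    band = math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))
    return band * math.radians(dlon_deg) / (4 * math.pi)

# Two cells of the same nominal size (5 x 5 degrees, an assumed grid):
equator = cell_area_fraction(0, 5, 5)
polar = cell_area_fraction(80, 85, 5)
```

Longitude does not enter except through the cell width, so with one station per cell the weights vary only with latitude.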
In that earlier calc I wanted to stick to a simple objective criterion for station selection. This was that the stations be rural, have at least 90 years of data, and still be reporting in 2009. That cut the number of stations down to 61.
Now I can use the weighting in the selection process too. It's still objective - the actual temperatures are not consulted. What I do is to take a larger set (I relaxed the record length to 50 years) and mesh that. There is a big range of weights, so I pick the top two thirds (approx), the stations covering the largest areas. Then I remesh, and select again, and then repeat (stages 170, 120, 90, 60). This gets down to 60 stations, but more evenly distributed. And they are now properly weighted by area, with no missing areas.
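The select-and-remesh loop can be sketched roughly as follows. This is not the meshing code used for the plots: as a stand-in for mesh-based areas, it estimates each station's area by counting which of a set of roughly equal-area sample points are nearest to it. The station set is synthetic, all function names are hypothetical, and the stage sizes are simply taken as targets from the list above.

```python
import math
import random

def to_xyz(lat, lon):
    """Unit vector for a latitude/longitude given in degrees."""
    la, lo = math.radians(lat), math.radians(lon)
    return (math.cos(la) * math.cos(lo), math.cos(la) * math.sin(lo), math.sin(la))

def fibonacci_sphere(n):
    """n roughly equal-area sample points on the unit sphere."""
    golden = math.pi * (3 - math.sqrt(5))
    pts = []
    for i in range(n):
        z = 1 - (2 * i + 1) / n
        r = math.sqrt(1 - z * z)
        pts.append((r * math.cos(golden * i), r * math.sin(golden * i), z))
    return pts

def area_weights(stations, samples):
    """Approximate share of the sphere nearest to each station."""
    counts = [0] * len(stations)
    for p in samples:
        # Largest dot product means smallest great-circle distance.
        best = max(range(len(stations)),
                   key=lambda k: p[0] * stations[k][0]
                               + p[1] * stations[k][1]
                               + p[2] * stations[k][2])
        counts[best] += 1
    return [c / len(samples) for c in counts]

def thin(stations, stages, samples):
    """Re-weight, keep the stations with the largest areas, and repeat."""
    keep = list(range(len(stations)))
    for target in stages:
        w = area_weights([stations[i] for i in keep], samples)
        ranked = sorted(range(len(keep)), key=lambda j: w[j], reverse=True)
        keep = sorted(keep[j] for j in ranked[:target])
    # Final weights for the surviving stations.
    return keep, area_weights([stations[i] for i in keep], samples)

# Synthetic, unevenly spread stations (uniform in latitude, so denser near the poles).
random.seed(0)
stations = [to_xyz(random.uniform(-60, 80), random.uniform(-180, 180))
            for _ in range(250)]
samples = fibonacci_sphere(2000)
keep, weights = thin(stations, [170, 120, 90, 60], samples)
```

No temperatures appear anywhere in the loop; only station positions decide which stations survive, which is what keeps the selection objective.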
Here is the plot of the initial selection and mesh, from 4 perspectives. The circles show (by area) the weights attached:
If it looks ragged around the rim, that's because I plotted only triangles where all nodes could be seen - otherwise there are end-of-world effects. The faint lines are the underlying mesh, and the dark lines show the area assigned to each station in the weighting.
Here are the stages of concentration, seen from N America.
And here is the final stage, from other perspectives:
and a lat/lon plot:
For comparison, here is the original 61-station selection:
Finally, here is the time series of numbers of these stations reporting in each year:
Results

Firstly, for comparison, here are the results of the original study compared with CRUTEM3 (land only). The observation was that the same general trend was followed, with of course more noise in the 61-station sample.
Now here is the new plot compared with CRUTEM3 and with GISS Land/Ocean (all to base period 1961-90): (Update - corrected following Anon comment below - the graph I originally showed is here)
It's a lot less noisy. It is also rather closer, in both trend and yearly values, to GISS Land/Ocean than to CRUTEM3. This is interesting, because although it has 60 land stations only, they are weighted for total global coverage, and with islands heavily represented. This suggests that the difference between the two data sets may reflect weighting rather than the kind of measurement.
Next

I think the smoothing achieved with refinement by weighting worked rather well. I'll look at rationalising larger subsets that way.
I also want to look at schemes for avoiding monthly meshing. Creating these rather larger cells around a few stations means that the stations could be weighted not by individual areas but by their share in larger areas. Since several stations would inhabit such cells, frequent remeshing would be unnecessary - only required when there were no stations at all in a cell.
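A rough sketch of how that might look, assuming a fixed set of larger cells and an equal split of each cell's area among the stations reporting in it that month. The equal split, the function name and the toy data are all assumptions; the actual sharing rule is still to be worked out.

```python
def share_weights(cell_area, station_cell, reporting):
    """Split each cell's area equally among its reporting stations.

    cell_area:    {cell_id: area}
    station_cell: {station_id: cell_id}
    reporting:    set of station ids with data this month
    Returns (weights, empty_cells); an empty cell signals that remeshing is needed.
    """
    members = {c: [] for c in cell_area}
    for s, c in station_cell.items():
        if s in reporting:
            members[c].append(s)
    weights = {}
    for c, stns in members.items():
        for s in stns:
            weights[s] = cell_area[c] / len(stns)
    empty_cells = [c for c, stns in members.items() if not stns]
    return weights, empty_cells

# Toy example: three cells, four stations, one station silent this month.
cell_area = {"A": 0.5, "B": 0.3, "C": 0.2}
station_cell = {"s1": "A", "s2": "A", "s3": "B", "s4": "C"}
w, empty = share_weights(cell_area, station_cell, {"s1", "s2", "s3"})
```

Here cell C has no reporting station, so it turns up in `empty` and that month would need a remesh; in any month where every cell has at least one station, the fixed cells can be reused.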