Purpose of weighting
I'll illustrate this with the conventional scheme for creating cells by 5°x5° lat/lon blocks:
You can see that the stations are unevenly distributed. This is a land/sea analysis using the new CRUTEM3 data set with HADSST2, and the stations marked are those that reported in April 2009. The "stations" in the sea are gridcells for HADSST2, which is also on a 5°x5° grid.
It would be most undesirable to include the stations unweighted. Areas that happened to have many stations reporting would then unduly influence the estimate of what is supposed to be a global average. The usual way of getting areas to balance equally is to add up area averages over the cells that you see.
TempLS takes a slightly different view, but functionally it is very similar. The cells are weighted according to the inverse of their area density, as well as can be estimated. And the simplest estimate is stations per cell, adjusted for area.
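A minimal sketch of the stations-per-cell estimate in Python, assuming the conventional 5°x5° blocks. The function names are hypothetical and this is not the TempLS code; it only illustrates giving each station a weight equal to its cell's area divided by the number of stations reporting in that cell.

```python
import math
from collections import Counter

def cell_of(lat, lon, size=5.0):
    """(row, col) of the size x size degree block containing a point."""
    nrows, ncols = int(180 / size), int(360 / size)
    row = min(int((lat + 90.0) // size), nrows - 1)   # lat = 90 goes in the top row
    col = min(int((lon % 360.0) // size), ncols - 1)  # accepts -180..180 or 0..360
    return row, col

def cell_area(row, size=5.0):
    """Area of a cell in latitude row `row` on a unit sphere."""
    south = math.radians(-90.0 + row * size)
    north = math.radians(-90.0 + (row + 1) * size)
    return (math.sin(north) - math.sin(south)) * math.radians(size)

def station_weights(stations, size=5.0):
    """Weight per station: cell area shared equally among the cell's stations."""
    cells = [cell_of(lat, lon, size) for lat, lon in stations]
    counts = Counter(cells)
    return [cell_area(c[0], size) / counts[c] for c in cells]

# Two stations share one equatorial cell; a third has a cell to itself.
stations = [(2.0, 1.0), (3.0, 4.0), (2.5, 11.0)]
weights = station_weights(stations)
```

In this example the two stations that share a cell each get half the weight of the lone station, so both cells contribute equally to the sum.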
Empty cells
As you can see, in any month quite a lot of cells (marked in yellow) are empty. Conventionally, they are just left out of the sum. TempLS does that in effect too. But this detracts from the purpose of weighting. Leaving them out does not mean no effect; instead, it is as if they were included, but assigned the average global anomaly for that month.
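The equivalence between omitting empty cells and filling them with the average can be checked with a little arithmetic. The sketch below uses invented anomalies and areas purely for illustration; none of the numbers come from CRUTEM3 or HADSST2.

```python
def weighted_mean(values, weights):
    """Area-weighted average."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Invented example: three occupied cells and one empty region.
anoms = [0.2, 0.4, 0.6]   # anomalies of occupied cells
areas = [1.0, 1.0, 2.0]   # their areas
empty_area = 3.0          # area with no data

# Conventional approach: leave the empty area out of the sum.
mean_omit = weighted_mean(anoms, areas)

# Same result if the empty area is included with the average anomaly.
mean_filled = weighted_mean(anoms + [mean_omit], areas + [empty_area])

# Different result if the empty area is given an estimate from
# nearby stations instead (0.9 is an arbitrary illustrative value).
mean_nearby = weighted_mean(anoms + [0.9], areas + [empty_area])
```

Here `mean_omit` and `mean_filled` agree, while `mean_nearby` is higher, which is the sense in which a warm but sparsely covered region is underrepresented when its cells are simply dropped.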
What should be done is to include them with the best estimate of their likely value. A better estimate is something derived from nearby stations. Many cells are empty in the Arctic, for example, and that leads to the relatively high trend there being underrepresented in the global average.
Remedies
The first remedy is to use a different scheme which has a smaller empty cell area. I will describe a simple one. I've described previously a more complex one. It is fairly time-consuming, though.
In V2.2 I have developed another, in conjunction with an equal-cell area division. Nearby cells are assigned extra weight to make up for what the empty cells are missing. This weight is transferred to the stations.
The solution method emulates diffusion. Empty cells transfer weight to their neighbors. Neighbors that have stations keep the weight, but some of it will have gone to other empty cells, so it is transferred again. About four cycles of this gets most of the weight to occupied cells.
There is a trick there. It's like explicit solution of numerical diffusion by "relaxation", and it can be accelerated by over-relaxation. In each step, about 4/3 times the weight is removed, i.e. empty cells go negative. But they get some back from their neighbors.
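A minimal sketch of this kind of weight diffusion, written in Python. It makes several assumptions that are not in the post: a plain rectangular grid that wraps in longitude (at least three columns) instead of the equal-area cells, up to four neighbors per cell with equal shares, and cells updated one after another within each cycle. The function name `diffuse_weights` is hypothetical, and this is not the TempLS V2.2 code.

```python
def diffuse_weights(weights, occupied, cycles=4, omega=4.0 / 3.0):
    """Push weight from empty cells to their neighbors with over-relaxation.

    weights  : 2D list of initial cell weights (e.g. cell areas)
    occupied : 2D list of booleans, True where a cell has stations
    omega    : fraction of an empty cell's weight removed per step;
               values above 1 leave the cell temporarily negative
    """
    nlat, nlon = len(weights), len(weights[0])
    w = [row[:] for row in weights]          # work on a copy
    for _ in range(cycles):
        for i in range(nlat):
            for j in range(nlon):
                if occupied[i][j]:
                    continue                 # occupied cells keep what they have
                # East/west neighbors wrap around; north/south do not.
                nbrs = [(i, (j - 1) % nlon), (i, (j + 1) % nlon)]
                if i > 0:
                    nbrs.append((i - 1, j))
                if i < nlat - 1:
                    nbrs.append((i + 1, j))
                out = omega * w[i][j]        # amount removed from this cell
                w[i][j] -= out
                for a, b in nbrs:
                    w[a][b] += out / len(nbrs)
    return w

# One empty cell between two occupied ones, unit weights.
result = diffuse_weights([[1.0, 1.0, 1.0]], [[True, False, True]])
```

Each step only moves weight around, so the total is conserved. In this tiny example the empty cell is left with a weight of 1/81 after four cycles, and each occupied neighbor holds 121/81, close to the limiting value of 1.5. The cell weights would then be shared among the stations in each occupied cell.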
Implementation
As I mentioned in my earlier post, this is available, and I tried it for the CRUTEM data. The answers were more different than for the simple schemes. This may be correct, but I'm checking.
Anyway, in this post, I mainly wanted to show more about the empty cell issue and how a new cell scheme can help. So here are some more views of problem areas. Africa is bad - S America actually seems better covered in CRUTEM than in GHCN. But maybe April 2009 was a good month.
Somalian pirates seem to be making a dent in SST measures.
The new scheme
The simple scheme is not good near the poles. The elements get small, so the few stations that are there are assigned small weighting. The area uncovered is proportionally greater.
The new scheme retains 5° latitude bands, but divides with fewer cells per band away from the equator. So each cell looks like the 5x5 equatorial block. That's pretty good up to 85°N. Then the last circle is divided into three.
One interesting geometric fact I learnt while doing this: if you slice a sphere equally (by thickness of slice), then the sphere surface area in each slice is equal. The rind from the equator and the near circle at the pole are the same.
This scheme does not have equal thicknesses, but it makes the calculation of cell area easy. Just work out the thickness of each lat band (the difference in sin lat between its edges, which is roughly cos lat times the band width), and divide that by the number of cells in the band.
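A minimal Python sketch of that area calculation on a unit sphere. The rule used for the number of cells per band, 72 times the cosine of the band's mid-latitude, rounded, with a minimum of three, is an assumption chosen so that equatorial cells are 5° wide and the polar cap is split into three; the post does not state the exact rule. The function names are hypothetical.

```python
import math

def cells_in_band(south, north, n_equator=72):
    """Assumed rule: scale the equatorial cell count by cos(mid-latitude)."""
    mid = math.radians((south + north) / 2.0)
    return max(3, round(n_equator * math.cos(mid)))

def band_table(band_deg=5.0):
    """List of (south, north, n_cells, cell_area) for each latitude band."""
    rows = []
    nbands = int(round(180.0 / band_deg))
    for k in range(nbands):
        south = -90.0 + k * band_deg
        north = south + band_deg
        # Slice thickness along the polar axis; band area is 2*pi*thickness.
        thickness = math.sin(math.radians(north)) - math.sin(math.radians(south))
        n = cells_in_band(south, north)
        rows.append((south, north, n, 2.0 * math.pi * thickness / n))
    return rows

table = band_table()
total_area = sum(n * area for _, _, n, area in table)  # should equal 4*pi
```

Under this assumed rule, a cell in the polar cap comes out within about 5% of the area of an equatorial cell.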
OK, so here are the pictures:
Using these cells without diffusion was method 0 of the previous post.
There are many ways a sphere could be divided, but you need a good scheme for then deciding into which cell an arbitrary point (station) falls. With this scheme, you just decide which lat band it is in, and then which of the equal divisions of longitude it falls into.
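The lookup can be sketched in a few lines of Python. As before, the number of cells per band (72 times the cosine of the mid-latitude, rounded, minimum three) is an assumption made for illustration, and `locate` is a hypothetical name and not part of TempLS.

```python
import math

def locate(lat, lon, band_deg=5.0, n_equator=72):
    """Return (band, col): latitude band index and cell index within the band."""
    nbands = int(round(180.0 / band_deg))
    # Latitude band, counted from the south pole; lat = 90 goes in the last band.
    band = min(int((lat + 90.0) // band_deg), nbands - 1)
    # Assumed rule for the number of equal longitude divisions in this band.
    mid = math.radians(-90.0 + (band + 0.5) * band_deg)
    n = max(3, round(n_equator * math.cos(mid)))
    width = 360.0 / n
    # Longitude may be given as -180..180 or 0..360.
    col = min(int((lon % 360.0) // width), n - 1)
    return band, col

near_pole = locate(89.0, 130.0)   # polar cap, three cells of 120 degrees
near_equator = locate(2.0, -1.0)  # equatorial band, 72 cells of 5 degrees
```

Both steps are simple integer divisions, which is what makes this kind of banded scheme convenient compared with more elaborate divisions of the sphere.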