moyhu: Better gridding for global temperature

Tuesday, September 22, 2015

Better gridding for global temperature

Computing global temperature is an exercise in spatial integration with scattered data. I have written a lot about it previously, e.g. here or earlier here. A spatial integral is a weighted average, so it comes down to calculating the weights. With TempLS, I first used a grid method, as is traditional. Then, to overcome the problem of empty cells, I used an irregular triangular mesh, as in finite element integration. I have also developed, and will soon describe, a method using spherical harmonics. I think the later methods are better. But grids also have some advantages, and I have long wanted a rational basis for infilling empty cells.

Numerical integration

Integration is usually defined as a limiting process, whereby the region is subdivided into finer and finer regions, each of which can then be evaluated with some local estimate of the integrand. There is theory about whether that converges. With a finite amount of numerical data, you can't go to a limit. But the same idea applies. You can subdivide until you get a result that seems to depend little on changing the subdivision. Sometimes that won't happen before you run out of data to meaningfully estimate the many subdivisions. That's one reason why temperature anomalies are important. With absolute temperature, you would have to divide very finely indeed to be independent of topographic variation, and there just aren't enough reading locations to do that. But anomalies take out a lot of that variation, making practical convergence possible.

You might ask - why bother with different methods, rather than finding just one good one? The answer is with this idea of reaching an invariant zone. If we can find an integral estimate that agrees over several different methods, that will give the greatest confidence in the result.

Grid considerations

With gridding, you can choose a coarse grid so that every cell has data. But then the data may not be a good estimate of the whole cell area. You lose resolution. A finer grid will start to have cells with no data. Traditionally, these are just omitted, meaning in effect that they are assumed to have the global average value. This was improved by Cowtan and Way 2013, using kriging. I proposed a simpler approach using latitude band averaging, which gave some of the same benefit. In this post I'll look at upgrading the infill process, using numerics similar to solving the diffusion equation. It tries to find a local average to use for each missing cell.

Improving on lat/lon grids

To do this, I need a better grid system than lat/lon. That creates a big problem at the poles, where cells become very small and skewed. The essential requirement of a grid is that you can quickly allocate a set of scattered data to the right cells, and you know the area of the cells. There are many other ways of doing this. Lat/lon is based on gridding the sphere as if it were a flat surface, which it very much isn't. You can do much better using a somewhat similar 3D object. Regular platonic polyhedra are attractive, and an icosahedron would be best of these. But a cube is more familiar, and good enough. That is what I'll use here. The cube is gridded in the normal way, with a square grid on each face. The sphere surface is radially projected onto the cube.

I'll give details, with the infill process, and tests of the improvement of the results, using spherical harmonics, below the fold. And of course there will be the usual WebGL active picture. It will show the cube grid projected on the sphere, and infill for a typical month, with lines to show the infill dependency.

Cubic grid

Think of a Rubik style cube, but with finer gridding. I number each face by rows, with the numbering accumulating from face to face. I use a unit sphere and unit cube (each edge from -1 to 1). So a point (x,y,z) on the sphere projects to (x,y,z)/Max(|x|,|y|,|z|) on the cube. And to map back to the sphere, just divide by the magnitude.
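The projection described above can be sketched in a few lines of Python with NumPy. This is only an illustration, not the TempLS code: the function names, the face numbering (two faces per coordinate axis) and the row/column convention are assumptions made for the example, and the post's own numbering by rows accumulating from face to face may differ.

```python
import numpy as np

def sphere_to_cube(p):
    """Radially project points (N, 3) onto the cube with faces at +-1."""
    p = np.atleast_2d(np.asarray(p, dtype=float))
    return p / np.max(np.abs(p), axis=1, keepdims=True)

def cube_to_sphere(q):
    """Map cube-surface points (N, 3) back to the unit sphere."""
    q = np.atleast_2d(np.asarray(q, dtype=float))
    return q / np.linalg.norm(q, axis=1, keepdims=True)

def cell_index(p, n=24):
    """Return (face, row, col) for each point on an n x n grid per face.

    Hypothetical numbering: face = 2*axis, plus 1 if the dominant
    coordinate is negative; the other two coordinates (in cyclic order)
    give row and column.
    """
    q = sphere_to_cube(p)
    rows = np.arange(len(q))
    axis = np.argmax(np.abs(q), axis=1)        # coordinate that reaches +-1
    face = 2 * axis + (q[rows, axis] < 0)
    u = q[rows, (axis + 1) % 3]
    v = q[rows, (axis + 2) % 3]
    # Map [-1, 1] to bins 0..n-1, clamping points that sit exactly on +1.
    to_bin = lambda t: np.minimum(((t + 1.0) * n / 2).astype(int), n - 1)
    return face, to_bin(u), to_bin(v)

pts = np.array([[0.0, 0.0, 1.0], [-1.0, 0.0, 0.0], [0.6, 0.0, 0.8]])
print(cell_index(pts))
```

Allocating a station to its cell is then a handful of array operations, which is the "quick allocation" property asked of a grid.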

The cell area on the sphere, which is what is needed for area weighting, is simple. The area on the cube face cell centered at r is reduced by the cos of a view angle, which is 1/|r|, and by the inverse square of the projection, which is |r|^-2, being a total factor of |r|^-3. And all those face cells have equal area on the cube.
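As a rough check of the |r|^-3 factor, the sketch below computes the spherical area of every cell from its centre on the cube face and sums over the six faces; the total should come out close to 4π, the area of the unit sphere. The function name and the use of cell centres (a midpoint rule) are assumptions of this example.

```python
import numpy as np

def cube_cell_weights(n=24):
    """Approximate spherical area of each cell, shape (6, n, n)."""
    h = 2.0 / n                                # cell edge length on the cube
    c = -1.0 + h * (np.arange(n) + 0.5)        # cell-centre coordinates
    u, v = np.meshgrid(c, c, indexing="ij")
    r = np.sqrt(1.0 + u**2 + v**2)             # |r| at each cell centre
    face = h**2 * r**-3                        # cube cell area times |r|^-3
    # All six faces are identical by symmetry.
    return np.broadcast_to(face, (6, n, n)).copy()

w = cube_cell_weights(24)
print(w.sum(), 4 * np.pi)      # the two should agree closely
print(w.max() / w.min())       # ratio of largest (face centre) to smallest (corner) cell
```

The ratio printed on the last line quantifies how much the cells shrink toward the cube corners.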

Diffusion

After allocating the data points to the cells by projection, I then have populated and unpopulated cells. Populated cells I estimate by the simple average of the data in the cell. Then I check all the unpopulated cells that are next to one or more populated cells. They get the average of the adjacent cell values (cells that share an edge), and are marked populated. Then I repeat, until all cells are "populated".

If I was actually doing the integration, I would then multiply those cell average values by the cell areas and add them up. That would end up being a weighted sum of the data. For TempLS, I just want those weights. That is just accountancy. Every time one of the original cell values is used in another cell, I increase its weight.

As a matter of practicality, I limit this, so cells are deemed to inherit values from the four closest actual readings. "Closest" is determined by the stage at which they were associated in the diffusion process. For the cells that were not reached until the third or later stages, there are some arbitrary cutoffs, but the cells chosen should still be close.
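The stage-by-stage infill and the weight accounting can be sketched as follows. To keep the example short it makes two simplifying assumptions that are not in the post: it works on a single flat n x n grid (on the cube, the neighbour lookup would also have to cross face edges), and it does not apply the cap of four closest readings. The function name and the matrix F are hypothetical.

```python
import numpy as np

def diffuse_infill(populated, area):
    """Weights on the originally populated cells after diffusion infill.

    populated : bool array (n, n), True where a cell has data
    area      : float array (n, n), cell areas
    Returns an (n, n) array, nonzero only on populated cells, whose sum
    equals area.sum().
    """
    n = populated.shape[0]
    filled = populated.ravel().copy()
    if not filled.any():
        raise ValueError("need at least one populated cell")
    # F[c, s] = fraction of cell c's value inherited from original cell s.
    F = np.zeros((n * n, n * n))
    idx = np.flatnonzero(filled)
    F[idx, idx] = 1.0
    while not filled.all():
        newly = []
        for c in np.flatnonzero(~filled):
            i, j = divmod(c, n)
            nbrs = [(i + di) * n + (j + dj)
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if 0 <= i + di < n and 0 <= j + dj < n]
            nbrs = [k for k in nbrs if filled[k]]
            if nbrs:
                # Average of the edge-sharing cells that already have values.
                newly.append((c, F[nbrs].mean(axis=0)))
        for c, row in newly:       # commit only after the whole sweep
            F[c] = row
            filled[c] = True
    # Integral = sum_c area[c] * value[c] = sum_s weight[s] * original[s].
    return (area.ravel() @ F).reshape(n, n)

pop = np.zeros((3, 3), dtype=bool)
pop[0, 0] = pop[2, 2] = True
print(diffuse_infill(pop, np.ones((3, 3))))   # each populated cell gets 4.5
```

In this layout the weight of each station would be the weight of its cell divided by the number of stations in that cell, since a populated cell is estimated by a simple average.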

In the plot below, for the TempLS data (GHCN+ERSST) for April 2015, you see first the populated cells with a colored checkerboard, different colors for each cube face. Note how the cells do not differ hugely in size, but shrink near the 8 cube corners. Stations reporting in April are marked with black dots. Then the infilled cells are marked in drab colors - grey for the first level, then brown, bluish etc. I'm using here a 24x24 grid on each face. That is finer in parts than the usual 5° lat/lon grid, and since the SST is on a 4° lat/lon in TempLS, there are gaps in sea temp as well. I've used this finer mesh to show multiple layers in the diffusion.

In the unpopulated (drab) cells, you see little white lines from the centres. These show the connections to the populated cells from which they are averaged.

Test of integration

I've tested using spherical harmonics of order up to L=5 (36 functions). Products of these should integrate to produce an identity matrix (orthonormal). So I integrate, first using the grid as if all were present. Then I integrate using only the "populated" cells for that month of April. Then I integrate a third time infilling for the unpopulated cells. For those integrations I use the weighted formulation.

I tabulate the results below, as a sum of squares of differences between the matrix of results and the unit matrix. I have broken into columns according to the L value. The number there is actually an average for products in that section, so it is in fact a power spectrum (more on this in future posts). You can think of L as rising (spatial) frequency. In any case, the test of good integration is that the numbers should be small. I have left out L=0, because that is zero by normalization.
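A reduced version of this test can be sketched with the real spherical harmonics for L = 0 and L = 1 only, which have simple closed forms. The sketch forms the weighted matrix of products on the full 24x24 cube grid and reports the sum of squared differences from the identity. It then repeats the calculation with a purely synthetic gap (the hemisphere z < 0 dropped and the remaining weights rescaled to sum to 4π), as a stand-in for omitting empty cells. The function names, the synthetic mask and the rescaling are assumptions of this example, and it reports a single total rather than the per-L averages tabulated in the post.

```python
import numpy as np

def cube_grid(n=24):
    """Cell centres on the unit sphere (6*n*n, 3) and their spherical areas."""
    h = 2.0 / n
    c = -1.0 + h * (np.arange(n) + 0.5)
    u, v = (a.ravel() for a in np.meshgrid(c, c, indexing="ij"))
    faces = []
    for axis in range(3):
        for s in (1.0, -1.0):
            q = np.empty((n * n, 3))
            q[:, axis] = s
            q[:, (axis + 1) % 3] = u
            q[:, (axis + 2) % 3] = v
            faces.append(q)
    q = np.vstack(faces)
    r = np.linalg.norm(q, axis=1)
    return q / r[:, None], h**2 * r**-3

def real_harmonics_l1(p):
    """Orthonormal real spherical harmonics for L = 0 and 1, shape (N, 4)."""
    x, y, z = p.T
    y00 = np.full(len(p), 0.5 / np.sqrt(np.pi))
    k = np.sqrt(3.0 / (4.0 * np.pi))
    return np.column_stack([y00, k * x, k * y, k * z])

def gram_error(Y, w, mask=None):
    """Sum of squared differences between the weighted Gram matrix and I."""
    if mask is not None:
        w = np.where(mask, w, 0.0)
        w = w * (4 * np.pi / w.sum())   # omitted cells act as the global mean
    G = Y.T @ (w[:, None] * Y)
    return np.sum((G - np.eye(Y.shape[1])) ** 2)

p, w = cube_grid(24)
Y = real_harmonics_l1(p)
print(gram_error(Y, w))                      # full grid: should be small
print(gram_error(Y, w, mask=p[:, 2] > 0))    # synthetic gap: much larger
```

Extending this to L = 5 would need a general spherical harmonic routine, and the infilled case would use weights from the diffusion step in place of the mask.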

Integration	L=1	L=2	L=3	L=4	L=5
Full grid	0	0	0	0	1e-06
Infilled grid	8.8e-05	0.00029	0.001045	0.002015	0.003635
No infill	0.007632	0.027335	0.049327	0.064493	0.075291

The full-grid integration is accurate to six figures. That reflects that the mesh is fine enough (24x24) to resolve these functions well. For the rest, infilling makes a dramatic improvement, by nearly two orders of magnitude, though arguably the square root should be the criterion. Both deteriorate at higher frequency, the infilled result at a faster rate.

Conclusion

Grid integration with diffusion infilling looks good, and may be competitive with triangle mesh. If so, this is welcome, because it manages things like land/sea separation more easily. I'll post more soon on comparisons of use in TempLS.

14 comments:

Victor Venema, September 22, 2015 at 7:15 PM
An icosahedron grid would be elegant for this approach. Precomputed grids can be found here.

P.S. Promised table is missing. Typo: "thoise cell average values".

JCH, September 23, 2015 at 1:12 AM
Victor - the image appears on FireFox. In my experience, Safari often fails to display Nick's graphics.

OT: wood for trees is down for the 2nd day. Hopefully not for good.

JCH, September 23, 2015 at 2:28 AM
Should have read more carefully. When I zoomed out, an "orient" button appeared on the upper right, but can't find a table either.
