- Testing integration of functions whose average is known. Values of those functions replace the observed temperatures in the process, with the same station locations and missing values.
- Comparing different methods, to see if some converge on a common answer, and which do not.
I tested six methods. They are, in approximate order of ascending merit (a code sketch of the first two follows the list):
- Simple average of stations with no area weighting.
- Simple gridding, with cells without data simply omitted. The grid is a cubed sphere with each face divided into 24x24, giving a total of 3456 cells. Cell area is about 25,000 sq km, corresponding to an edge of about 275 km.
- Zero order LOESS method, as described in the last post. Zero order is just a local weighted average.
- Gridding with infill. Empty cells acquire data from neighboring stations by solving a diffusion equation, as briefly described in the last post.
- First order LOESS, in which values on a grid of nodes are calculated by local linear regression.
- Finite element style integration on a triangular mesh with observation points at the vertices.
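To make the first two methods concrete, here is a minimal Python sketch (not the code used to produce the results below) of the unweighted station average and of cubed-sphere gridding with empty cells omitted. The cell indexing via a gnomonic projection, and the cell weights taken from the Jacobian at cell centres, are my assumptions about a reasonable implementation; the function names are illustrative.

```python
import numpy as np

def cubed_sphere_cell(lat_deg, lon_deg, n=24):
    """Map a (lat, lon) point in degrees to a (face, i, j) cell on an n x n cubed sphere."""
    lat, lon = np.radians(lat_deg), np.radians(lon_deg)
    v = np.array([np.cos(lat) * np.cos(lon),
                  np.cos(lat) * np.sin(lon),
                  np.sin(lat)])
    axis = int(np.argmax(np.abs(v)))         # dominant axis picks the cube face
    face = axis if v[axis] > 0 else axis + 3
    a = v[axis]                              # component along the face normal
    o1, o2 = [k for k in range(3) if k != axis]
    x, y = v[o1] / a, v[o2] / a              # gnomonic coordinates in [-1, 1]
    i = min(int((x + 1) / 2 * n), n - 1)
    j = min(int((y + 1) / 2 * n), n - 1)
    return face, i, j

def grid_average(lats, lons, vals, n=24):
    """Area-weighted mean over occupied cells; empty cells are simply omitted."""
    sums, counts, areas = {}, {}, {}
    for lat, lon, v in zip(lats, lons, vals):
        cell = cubed_sphere_cell(lat, lon, n)
        sums[cell] = sums.get(cell, 0.0) + v
        counts[cell] = counts.get(cell, 0) + 1
        # relative cell area (solid angle) from the gnomonic Jacobian at the
        # cell centre; the constant (2/n)^2 factor cancels in the weighted mean
        xc = -1 + (2 * cell[1] + 1) / n
        yc = -1 + (2 * cell[2] + 1) / n
        areas[cell] = (1 + xc**2 + yc**2) ** -1.5
    num = sum(areas[c] * sums[c] / counts[c] for c in sums)
    return num / sum(areas[c] for c in sums)

# Usage, e.g. the latitude test of the next section (the "data" are the latitudes):
# print(np.mean(lats))                   # no weighting
# print(grid_average(lats, lons, lats))  # grid, no infill
```

The cell weights matter because cubed-sphere cells are nearly, but not exactly, equal in area; the omission of empty cells is what produces the bias discussed below.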
Testing known functions
An intuitively appealing test function is simply latitude. Using just the latitude of stations when they report in January 2011, what does the process give for the average latitude of the Earth? It should of course be zero.

| | nSH=0 | nSH=2 | nSH=4 | nSH=8 | nSH=12 |
|---|---|---|---|---|---|
| No weighting | 26.6295 | -0.0393 | -0.0603 | 0.0479 | 1.9308 |
| Grid no infill | 0.5095 | 0.0806 | -0.0121 | -0.035 | -0.0197 |
| LOESS order 0 | 0.022 | -0.0342 | -0.0438 | -0.0342 | -0.0216 |
| Grid diffuse | -0.0432 | -0.0466 | -0.037 | -0.0267 | -0.0175 |
| LOESS order 1 | -0.023 | -0.0231 | -0.0252 | -0.0223 | -0.0182 |
| Mesh FEM | -0.0209 | -0.022 | -0.0218 | -0.0177 | -0.0129 |
The case with no weighting gives a very bad result. A very large number of GHCN V4 stations are in the US, between latitudes 30°N and 49°N, and that pulls the average right up. Grid with no infill also errs on the N side. The reason here is that there are many cells in and near the Antarctic without data. Omitting them treats them as if they had the average latitude (about 0), whereas of course their latitudes are strongly negative. The effect is to return a positive average. The other methods work well, because they are not subject to this bias. They have errors of local interpolation, which add up to very little.

The poor methods improve spectacularly with spherical harmonic (SH) enhancement. This does not fix the biased sampling, but it corrects the latitude disparity that interacts with it. Deducting the non-constant spherical harmonics, which are known to have average zero, leaves a residual which is not particularly biased by hemisphere. The no weighting case is bad again at high order SH. The reason is that the fitting is done with that same method of integration, which becomes unreliable for higher order functions. I'll say more about that in the next section.
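As a concrete illustration of the SH enhancement, here is a hedged sketch, not the code behind these results. It assumes that nSH is the maximum SH degree, that the fit is a weighted least-squares regression using the method's integration weights w (e.g. 1/N for no weighting, relative cell areas spread over member stations for the grid), and that scipy's complex spherical harmonics can stand in for the real basis; the function names are illustrative.

```python
import numpy as np
from scipy.special import sph_harm

def real_sh_basis(lats_deg, lons_deg, L):
    """Real spherical harmonics up to degree L at the stations, one column per harmonic."""
    theta = np.radians(np.mod(lons_deg, 360.0))    # azimuth in [0, 2*pi)
    phi = np.radians(90.0 - np.asarray(lats_deg))  # colatitude in [0, pi]
    cols = []
    for n in range(L + 1):
        for m in range(-n, n + 1):
            Y = sph_harm(abs(m), n, theta, phi)    # complex Y_n^|m|
            if m < 0:
                cols.append(np.sqrt(2) * Y.imag)
            elif m == 0:
                cols.append(Y.real)
            else:
                cols.append(np.sqrt(2) * Y.real)
    return np.column_stack(cols)

def enhanced_average(vals, lats, lons, w, L):
    """Integrate vals with weights w after deducting fitted non-constant SH."""
    B = real_sh_basis(lats, lons, L)
    BtW = B.T * w                                # weighted design matrix
    coef = np.linalg.solve(BtW @ B, BtW @ vals)  # the regression described in the text
    resid = vals - B[:, 1:] @ coef[1:]           # deduct non-constant harmonics only
    return np.sum(w * resid) / np.sum(w)         # integrate the residual
```

Because the non-constant harmonics integrate to exactly zero over the sphere, deducting them changes nothing in the true average but removes the large-scale structure that interacts with the uneven sampling.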
Testing spherical harmonics.
Latitude is a limited test. SHs offer a set of functions with far more modes of variation, but testing them all would return a confusing scatter of results. There is one summary statistic which I calculate that I think is very helpful. When you fit SHs Fourier style, it is actually a regression, involving the inverse of the matrix of integrals of products of SHs. This is supposed to be the identity matrix, but because of the approximate integration, the SHs are not exactly orthogonal. The important number indicating deviation from the identity is the conditioning of that matrix, measured here as the ratio of minimum to maximum eigenvalues (the reciprocal of the usual condition number). When it gets too low, the inversion fails, as does the fitting. That is what was happening to unweighted averaging with nSH=12. So I calculated the minimum eigenvalue of that matrix (the maximum remains at 1). Here is a table:

| | nSH=2 | nSH=4 | nSH=8 | nSH=12 |
|---|---|---|---|---|
| No weighting | 0.0703 | 0.0205 | -0.8361 | -0.9992 |
| Grid no infill | 0.7893 | 0.631 | 0.4354 | 0.1888 |
| LOESS order 0 | 0.9501 | 0.8298 | 0.4593 | 0.1834 |
| Grid diffuse | 0.9694 | 0.9013 | 0.6026 | 0.2588 |
| LOESS order 1 | 0.988 | 0.9503 | 0.6905 | 0.3037 |
| Mesh FEM | 0.9875 | 0.9418 | 0.6677 | 0.3204 |
Close to 1 is good. There is a fairly clear ranking of merit, with full LOESS just pipping mesh. No weighting is never good, and completely falls apart with high nSH, being not even positive definite. All methods are down quite a lot at nSH=12, although a minimum of, say, 0.3 is not a major problem for inversion. These values are for January 2011, but it is a robust statistic, varying little from year to year. Here is the comparison over years for the nSH=8 level:
| | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 |
|---|---|---|---|---|---|---|---|---|---|
| No weighting | -0.8361 | -0.759 | -0.2471 | -0.2119 | -0.0423 | -0.0104 | 0.01 | 0.0167 | 0.0199 |
| Grid no infill | 0.4354 | 0.465 | 0.4481 | 0.4129 | 0.4493 | 0.4237 | 0.4394 | 0.4325 | 0.4436 |
| LOESS order 0 | 0.4593 | 0.4972 | 0.4742 | 0.4477 | 0.4791 | 0.4472 | 0.4734 | 0.4697 | 0.4764 |
| Grid diffuse | 0.6026 | 0.6207 | 0.6102 | 0.5741 | 0.6132 | 0.5929 | 0.6127 | 0.608 | 0.6095 |
| LOESS order 1 | 0.6905 | 0.711 | 0.6965 | 0.6762 | 0.6912 | 0.6832 | 0.6903 | 0.6927 | 0.6944 |
| Mesh FEM | 0.6677 | 0.6663 | 0.6703 | 0.6636 | 0.6627 | 0.6737 | 0.6741 | 0.6772 | 0.6626 |
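For reference, here is one way the minimum-eigenvalue statistic in these tables could be computed, reusing real_sh_basis from the earlier sketch. It assumes the matrix in question is the weighted Gram matrix of the SH basis, with eigenvalues scaled so the maximum is 1; the actual construction behind the tables may differ in detail.

```python
def sh_min_eigenvalue(lats, lons, w, L):
    """Smallest eigenvalue of the matrix of integrated SH products, scaled so the max is 1."""
    B = real_sh_basis(lats, lons, L)   # from the earlier sketch
    H = (B.T * w) @ B                  # H_ij ~ numerical integral of Y_i * Y_j
    ev = np.linalg.eigvalsh(H)         # symmetric matrix, so real eigenvalues
    return ev.min() / ev.max()
```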
Test of global average of real anomalies.
For consistency, I have used real temperature data with the station normals calculated by the mesh method. It would not matter which method was used, as long as it is consistent. Here are January results over years with no SH enhancement:

| | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 |
|---|---|---|---|---|---|---|---|---|---|
| No weighting | -0.1565 | 1.2277 | 0.5341 | 0.3475 | 1.0401 | 0.7213 | 0.9881 | 0.8065 | 0.8183 |
| Grid no infill | 0.3254 | 0.2995 | 0.4834 | 0.5251 | 0.6729 | 0.9464 | 0.8194 | 0.6127 | 0.7443 |
| LOESS order 0 | 0.366 | 0.2988 | 0.5224 | 0.5665 | 0.6649 | 1.0036 | 0.8499 | 0.619 | 0.7425 |
| Grid diffuse | 0.3674 | 0.2919 | 0.5113 | 0.5443 | 0.6495 | 0.9802 | 0.8283 | 0.6186 | 0.7111 |
| LOESS order 1 | 0.3706 | 0.2994 | 0.5057 | 0.5397 | 0.6429 | 0.9932 | 0.8281 | 0.6209 | 0.7112 |
| Mesh FEM | 0.3779 | 0.2862 | 0.5051 | 0.517 | 0.6474 | 0.9969 | 0.8088 | 0.6178 | 0.7017 |
The best methods are really very close, generally within a range of about 0.02. Even the simple grid (on the cubed sphere) is not that different. But no weighting is bad. The effects of SH enhancement are variable. I'll show them for January 2018:
| | nSH=0 | nSH=2 | nSH=4 | nSH=8 | nSH=12 |
|---|---|---|---|---|---|
| No weighting | 0.8065 | 0.8842 | 0.5687 | 0.5633 | 1.4847 |
| Grid no infill | 0.6127 | 0.6253 | 0.6077 | 0.6165 | 0.6139 |
| LOESS order 0 | 0.619 | 0.6196 | 0.6138 | 0.6224 | 0.6152 |
| Grid diffuse | 0.6186 | 0.6194 | 0.6191 | 0.6241 | 0.6237 |
| LOESS order 1 | 0.6209 | 0.621 | 0.6198 | 0.6225 | 0.619 |
| Mesh FEM | 0.6178 | 0.6182 | 0.618 | 0.6206 | 0.6147 |
The better methods do not change much, but converge a little more, and are joined by the lesser methods. This overall convergence, based on the separate principles of discretisation type (mesh, grid, etc.) and SH enhancement, is very encouraging. Even no weighting becomes respectable up to nSH=8, but then falls away again as the high order fitting fails. I'll show in a future post the comparison of results from the different methods for the whole time series. There are some earlier comparisons here, which ran very well back to 1957, but were then troubled by lack of Antarctic data. I think LOESS will perform well here.