## Saturday, October 17, 2015

### ## New integration methods for global temperature

To get a spatial average, you need a spatial integral. This process has been at the heart of my development of TempLS over the years. A numerical integral from data points ends up being a weighted sum of those points. In the TempLS algorithm, what is actually needed are those weights. But you get them by figuring out how best to integrate.

I started, over five years ago, using a scheme sometimes used in indices. Divide the surface into lat/lon cells, find the average for each cell with data, then make an area-weighted sum of those. I've called that the grid version, and it has worked quite well. I noted last year that it tracked the NOAA index very closely. That is still pretty much true. But a problem is that some regions have many empty cells, and these are treated as if they were at the global average, which may be a biased estimate.

Then I added a method based on an irregular triangle mesh. basically, you linearly interpolate between data points and integrate that approximation, as in finite elements. The advantage is that every area is approximated by local data. It has been my favoured version, and I think it still is.

I have recently described two new methods, which I expect to be also good. My idea in pursuing them is that you can have more confidence if methods based on different principles give concordant results. This post reports on that.

The first new method, mentioned here, uses spherical harmonics (SH). Again you integrate an approximant, formed by least squares fitting (regression). Integration is easy, because all but one (the zeroth, constant) of the SH give zero.

The second I described more recently. It is an upgrade of the original grid method. First it uses a cubed sphere to avoid having the big range of element areas that lat/lon has near the poles. And then it has a scheme for locally interpolating grid values which have no internal data.

I have now incorporated all four methods as options in TempLS. That involved some reorganisation, so I'll call the result Ver 3.1, and post it some time soon. But for now, I just want to report on that question of whether the "better" methods actually do produce more concordant results with TempLS.

The first test is a simple plot. It's monthly data, so I'll show just the last five years. For "Infilled", (enhanced grid) I'm using a 16x16 grid on each face, with the optimisation described here. For SH, I'm using L=10 - 121 functions. "Grid" and "Mesh" are just the methods I use for monthly reports.

The results aren't very clear, except that the simple grid method (black) does seem to be a frequent outlier. Overall, the concordance does seem good. You can compare with the plots of other indices here.

So I've made a different kind of plot. It shows the RMS difference between the methods, pairwise. By RMS I mean the square root of the average sum squares of difference, from now back by the number of years on the x-axis. Like a running standard deviation of difference.

This is clearer. The two upper curves are of simple grid. The next down (black) is of simple grid vs enhanced; perhaps not surprising that they show more agreement. But the advanced methods agree more. Best is mesh vs SH, then mesh vs infill. An interesting aspect is that all the curves involving SH head north (bad) going back more than sixty years. I think this is because the SH set allows for relatively high frequencies, and when large datafree sections start to appear, they can engage in large fluctuations there without restraint.

There is a reason why there is somewhat better agreement in the range 25-55 years ago. This is the anomaly base region, where they are forced to agree in mean. But that is a small effect.

Of course, we don't have an absolute measure of what is best. But I think the fact that the mesh method is involved in the best agreements speaks in its favour. The best RMS agreement is less than 0.03°C which I think is pretty good. It gives more confidence in the methods, and, if that were needed, in the very concept of a global average anomaly.

1. Couldn't you use a subsample of reanalysis data to determine which method is "best?" That is, take the reanalysis data where actual measurements exist and compare the estimated global average to the known global average? You could also create an error model for each method, and then combine them, which would probably give you a result more accurate than any one method.

1. CCE,
I'd have trouble accepting that reanalysis was good enough to be the decider. Else why not use it in preference to the index from measurements? Certainly I think the surface indices agree with each other better than with NCEP/NCAR.

Also reanalysis over sea usually shows air temp rather than SST.

2. A good reanalysis is going to have spacial patterns that are that are similar to the real world. Certainly, this would be superior to assuming we know nothing about poorly sampled areas. A method that works best on incomplete reanalysis output should work best on incomplete observations.

Also, don't reanalyses output all manner of data, including SST? SST goes into them.

1. CCE,
I'm sure reanalyses take in SST, but they are using an atmospheric model, so I don't think they could do much more than echo the input. NCEP/NCAR does not supply an SST dataset.

Yes, you can make use of reanalysis spatial patterns. That is what Steig et al did in their Antarctic paper. It was also part of the Cowtan and Way strategy. It's basically a question of whether an EOF gives a sufficiently better interpolation shape (vs linear in mesh or SH) to overcome the difficulty of matching it to the data.

3. I'm not suggesting you develop a new method a la C&W (although you could use MSU instead of reanalysis). I'm saying you can use spatially complete reanalysis output to determine which one of your methods does the best job of estimating the global average anomaly with partial data. Subsample the output to match the locations of real observations. Run each method on the subsample. Compare the estimate to the known result, either globally or limit to under sampled areas. I doubt if it would matter if you use SST or SAT provided that both are spatially similar.

Whichever method works best, probably works best in the real world.

4. CCE,
wouldn't you expect that gridded would give the best results if you did that subsampling procedure just on the basis that the reanalysis dataset is gridded so if you're trying to reconstruct the pattern it would be most easily done by a like method?

1. I don't know, you guys are the experts. :)

I suppose you could create synthetic "stations" at the centroids of each reanalysis grid cell and calculate the global average anomaly using the other methods and compare results. If the grid is globally complete, shouldn't the results be nearly identical?

5. You could test the methods on an output of a climate model. There are the temperature is known at all points, there is no missing or uneven distributed measurements. So the global mean temperature of the modelled temperature field can easily calculated as reference. Then you apply the methods to that temperature field, using only data points at positions where in the real world data exists. Then you can compare the output of the methods to the know true global mean temperature of the simulated temperature field.

1. Anon,
I did something like that here. I integrated spherical harmonics with the pattern of missing data, and compared with the true imtegral. That is a test of the actual numerical integration process. Now it's true that GCM temp might give a different kind of interpolation than harmonics. So yes, I could try that.

I think this is similar to CCE's suggestion. Definitely worth thinking about.

2. That's what I was thinking, but using reanalysis output since it's supposed to match "the real world" as closely as possible.

3. Even if the resolution of the GCM were the same as the reanalysis code (I don't think it is), there's no reason to expect that the temperature field coming from a particular model run would have a thing to do with how the temperature of the Earth has been changing over the last 150 years.

As cue points out, the reanalysis codes are at least constrained to match the known temperature, pressure and windspeed data. As such, you'd expect an improved interpolation over e.g. spherical harmonics.

4. That said---Nick, have you thought about looking at the time-evolution of the spherical harmonics moments?

ell = 0 is of course the "global mean temperature".

ell =1 represents the dipole moment. E.g., it's a measure of NH-SH asymmetry.

ell = 2 the quadrupole moment etc.

It'd be interesting to see if the coefficients evolve over time relative to each other.

(For the pure masochists who walk among us, an EOF-type analysis of the coefficients would be interesting.)

Just a thought.

5. Nick, I see.
I think the advantage using a GCM simulation is, that it is not an interpolation and not using measured data. So it can by seen as an Earth2 with a probably different temperature field, but where we can test the interpolation accuracy, because we know the exact result.
It is not necessary that the temperature field is the same as at the Earth in the last 150 years. You could use paleoclimatic simulations as well. Of course one has to use not the real world data but the simulated measurements in any case.

The reanalysis has the disadvantage that it is generated to fit or approximate different real world measurements and so contain already that measurement errors and also the problems from uneven distribution ab the measurements. So the reanalysis may contain artifacts from erroneous measurements in sparsely sampled areas or tries to interpolate between conflicting measurements. So I see the reanalysis data more as an highly sophisticated interpolation method, than a different Earth, where you can test the measurement and interpolation methods. In principle you could test the reanalysis procedure on the GCM result by doing the same measurements (same type and spacial and temporal distribution) as in the real world on the GCM result, doing the reanalysis procedure for this and than compare this reanalysis result with the original GCM result.
There is no need, that the GCM result match any real world data.

6. GCMs don't accurately capture mesoscale weather (they are meant to model climate not weather). Short period temporal variability is often not captured very well either. And the solutions at any time have nothing to do with what the real earth is doing. So it seems to me they'd be useless as interpolators.

I'm not sure I follow your argument about why constraining reanalysis codes by observations makes them worse, rather than better, interpolators. They are still constrained by the physics of the mesoscale weather, so errors that represent aphysical behavior get rejected, rather than incorporated into the reanalysis solution. This is pretty close to what an "optimal interpolator" is.

(You could do this for Nick's code too, if you computed the wind and pressure fields, too, then required that the solutions were consistent with Navier-Stokes.)

7. Carrick,
the argument is that the reanalysis is an interpolater and GCMs are not.
If you use the reanalysis as comparison, you basically compare different interpolaters to interpolating the almost same not evenly distributed data. You assume that the reanalysis gives the best interpolation of the true Earth temperature field, which may be correct, but you can't check it, because you don't know the true Earth temperature field. Even then the reanalysis come with the same limitations as the other interpolaters, only view measurements over Antarctica and the sea ice for example. Therefore also the reanalysis may bad in this regions, where the differences between the different interpolation techniques are important. So you will see such interpolation techniques as the best, that are close to the reanalysis. Maybe this problem would be less, if the reanalysis would base on additional thousands of measurements on the sea ice and Antarctica.
My suggestion is different, then comparing different interpolation techniques to each other. Then you may only get an interpolation technique that is close to the mean or a special other interpolation technique, for example the reanalysis.

What I suggest is not comparing different interpolation techniques to each other but comparing all the interpolation techniques to the exact real temperature field. The temperature field where the measurements are taken. So what we need is not an additional interpolater, as good as it may be, but the exact temperature field where the measurements are taken. If we would know the real exact temperature field of the Earth atmosphere at all points, it would be easy. We could simple calculate the error of the different interpolation techniques including the reanalysis to this field and know the size and temporal and spatial distribution of the error and which interpolation method is the best, and if the reanalysis is really much better.

Unfortunately we know not the real temperature of the Earth atmosphere at all points.

So I suggest the second best. Take a temperature field where you know the temperature at all points. Take measurements at the same points and times as in the real world, may be add plausible measurement errors, do the interpolation method you want to test with this and compare the interpolated result to the original temperature field where you know the temperature at all points.
Note, that the original temperature field contains more information then the measurement at only view points. So it is different to comparing different interpolation techniques.
Note, that this temperature field is only for testing the methods, so it need not be a good representation of the real atmospheric temperatures.

So you have to chose the temperature field to test the interpolation techniques.
My suggestion is to use a GCM output for that. I think that possible derivation from the real world does not matter much for this purpose.

Of course one could use the reanalysis results instead. But I think it is not so good, because it was fitted to real world temperature measurements at approximately the same places where the measurements will be taken. And some of the measurements used to fit may simple be wrong. Also the simulation time from the start to the data will not be so large then in a GCM simulation. All these things are, in my opinion a source of unnecessary additional errors, compared to a GCM output temperature field.

8. Don't tell ECMWF that - they're pretty proud of their short-term forecast accuracy. All of the major weather services use GCMs to produce ensemble predictions for weather, not just climate. But unlike climate, weather is an initial conditions problem, and we cannot know - much less resolve - with sufficient detail and accuracy the initial conditions. Still, at 8 days out they have a very firm grasp of mesoscale weather. Sandy and Joaquin show just how well they can do.

9. Carrick, I had written a very long and detailed explanation of my view, unfortunately it is vanished after I (seemingly successful!) send it, for unknown reason (maybe too long?). And I deleted my local copy, after sending it, seemingly successfully.

The main point was, that we don't need an additional interpolation method, as good as they may be, but a replacement for the real world, even if the temperature field may differ, where we can take measurements and test the different interpolation methods (in principle including reanalysis), which has the advantage over the real world, that we know the exact real temperature field, where the measurements are taken, and this exact temperature field is not influence by measurements, their location, measurement errors, or is interpolated by any method from measurements.

10. Now I see, that my long post is found. Thank you, Nick!

11. Nick, the paper Cowtan et al 2015 you linked here http://moyhu.blogspot.de/2015/10/hansens-1988-predictions-revisited.html?showComment=1445984835464#c6476607534027199815
seems to have already done an approach similar to that, I had suggested.
My main focus would be to compare the interpolation methods to find that one which is most close to the SAT of the models.

6. Carrick,
My main project at the moment is a scheme to get a whole lot of data expressed as SH coefficients. At say, 121 points per month/world, I can upload a lot. Then it can be used interactively in many ways. I was thinking of beinmg able to show trends, averages etc for user chosen periods. But the data would be there to show time series of coefficients etc. If course the time series of the zero coefficient is my SH integral.

As for EOFs, I've thought a bit more about that. It's an attractive idea. But there would be the same problem as with SH - you'd need spatial resolution, and that would need a comparable number of reanalysis EOFs, say 121. Now I don't think you can get that number of meaningful EOFs, so for higher frequencies, it would be no better than SH. And for low frequencies, interpolating missing data, the difference between SH and reanalysis EOF's wouldn't be perceptible.

Still, this connects with your idea of following the moments of SH fits. EOFs would be even more significant.