Wednesday, September 11, 2019

How errors really propagate in differential equations (and GCMs).

There has been more activity on Pat Frank's paper since my last post. A long thread at WUWT, with many comments from me. And two good posts and threads at ATTP, here and here. In the latter he coded up Pat's simple form (paper here). Roy Spencer says he'll post a similar effort in the morning. So I thought writing something on how error really is propagated in differential equations would be timely. It's an absolutely core part of PDE algorithms, since it determines stability. And it isn't simple, but expresses important physics. Here is a TOC:

Differential equations

An ordinary differential equation (de) system is a number of equations relating many variables and their derivatives. Generally the number of variables and equations is equal. There could be derivatives of higher order, but I'll restrict to one, so it is a first order system. Higher order systems can always be reduced to first order with extra variables and corresponding equations.

A partial differential equation system, as in a GCM, has derivatives in several variables, usually space and time. In computational fluid dynamics (CFD) of which GCMs are part, the space is gridded into cells or otherwise discretised, with variables associated with each cell, or maybe nodes. The system is stepped forward in time. At each stage there are a whole lot of spatial relations between the discretised variables, so it works like a time de with a huge number of cell variables and relations. That is for explicit solution, which is often used by large complex systems like GCMs. Implicit solutions stop to enforce the space relations before proceeding.

Solutions of a first order equation are determined by their initial conditions, at least in the short term. A solution beginning from a specific state is called a trajectory. In a linear system, and at some stage there is linearisation, the trajectories form a linear space with a basis corresponding to the initial variables.

Fluids and Turbulence

As in CFD, GCMs solve the Navier-Stokes equations. I won't spell those out (I have an old post here), except to say that they simply express the conservation of momentum and mass, with an addition for energy. That is, a version of F=m*a, and an equation expressing how the fluid relates density and velocity divergence (and so pressure with a constitutive equation), and an associated heat budget equation.

It is said, often in disparagement of GCMs, that they are not effectively determined by initial conditions. A small change in initial state could give a quite different solution. Put in terms of what is said above, they can't stay on a single trajectory.

That is true, and true in CFD, but it is a feature, not a bug, because we can hardly ever determine the initial conditions anyway, even in a wind tunnel. And even if we could, there is no chance in an aircraft during flight, or a car in motion. So if we want to learn anything useful about fluids, either with CFD or a wind tunnel, it will have to be something that doesn't require knowing initial conditions.

Of course, there is a lot that we do want to know. With an aircraft wing, for example, there is lift and drag. These don't depend on initial conditions, and are applicable throughout the flight. With GCMs it is climate that we seek. The reason we can get this knowledge is that, although we can't stick to any one of those trajectories, they are all subject to the same requirements of mass, momentum and energy conservation, and so in bulk all behave in much the same way (so it doesn't matter where you started). Practical information consists of what is common to a whole bunch of trajectories.

Turbulence messes up the neat idea of trajectories, but not too much, because of Reynolds Averaging. I won't go into this except to say that it is possible to still solve for a mean flow, which still satisfies mass momentum etc. It will be a useful lead in to the business of error propagation, because it is effectively a continuing source of error.

Error propagation and turbulence

I said that in a first order system, there is a correspondence between states and trajectories. That is, error means that the state isn't what you thought, and so you have shifted to a different trajectory. But, as said, we can't follow trajectories for long anyway, so error doesn't really change that situation. The propagation of error depends on how the altered trajectories differ. And again, because of the requirements of conservation, they can't differ by all that much.

As said, turbulence can be seen as a continuing source of error. But it doesn't grow without limit. A common model of turbulence is called k-ε. k stands for turbulent kinetic energy, ε for rate of dissipation. There are k source regions (boundaries), and diffusion equations for both quantities. The point is that the result is a balance. Turbulence overall dissipates as fast as it is generated. The reason is basically conservation of angular momentum in the eddies of turbulence. It can be positive or negative, and diffuses (viscosity), leading to cancellation. Turbulence stays within bounds.

GCM errors and conservation

In a GCM something similar happens with other perturbations. Suppose for a period, cloud cover varies, creating an effective flux. That is what Pat Frank's paper is about. But that flux then comes into the general equilibrating processes in the atmosphere. Some will go into extra TOA radiation, some into the sea. It does not accumulate in random walk fashion.

But, I hear, how is that different from extra GHG? The reason is that GHGs don't create a single burst of flux; they create an ongoing flux, shifting the solution long term. Of course, it is possible that cloud cover might vary long term too. That would indeed be a forcing, as is acknowledged. But fluctuations, as expressed in the 4 W/m2 uncertainty of Pat Frank (from Lauer) will dissipate through conservation.

Simple Equation Analogies

Pat Frank, of course, did not do anything with GCMs. Instead he created a simple model, given by his equation 1:

It is of the common kind, in effect a first order de

d( ΔT)/dt = a F

where F is a combination of forcings. It is said to emulate well the GCM solutions; in fact Pat Frank picks up a fallacy common at WUWT that if a GCM solution (for just one of its many variables) turns out to be able to be simply described, then the GCM must be trivial. This is of course nonsense - the task of the GCM is to reproduce reality in some way. If some aspect of reality has a pattern that makes it predictable, that doesn't diminish the GCM.

The point is, though, that while the simple equation may, properly tuned, follow the GCM, it does not have alternative trajectories, and more importantly does not obey physical conservation laws. So it can indeed go off on a random walk. There is no correspondence between the error propagation of Eq 1 (random walk) and the GCMs (shift between solution trajectories of solutions of the Navier-Stokes equations, conserving mass momentum and energy).
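The contrast can be sketched with a toy calculation. This is only an illustration, not a GCM and not Pat Frank's Eq 1: I assume a single variable stepped with unit time step and Gaussian noise, and the `relax` term is a hypothetical stand-in for processes that remove a perturbation (extra TOA radiation, uptake by the sea). With `relax = 0` nothing removes the error and the spread grows like a random walk; with `relax > 0` the spread settles to a fixed level.

```python
import math
import random


def simulate(n_steps, sigma, relax, seed):
    """Step T <- T - relax*T + noise, starting from T = 0.

    relax = 0: no restoring term, so perturbations accumulate (random walk).
    relax > 0: hypothetical restoring term standing in for dissipation.
    """
    rng = random.Random(seed)
    t = 0.0
    for _ in range(n_steps):
        t += -relax * t + rng.gauss(0.0, sigma)
    return t


def rms_spread(n_steps, sigma, relax, n_runs=2000):
    """RMS of the final value over n_runs independent noise sequences."""
    finals = [simulate(n_steps, sigma, relax, seed) for seed in range(n_runs)]
    return math.sqrt(sum(x * x for x in finals) / n_runs)


walk_20 = rms_spread(20, 1.0, 0.0)
walk_80 = rms_spread(80, 1.0, 0.0)
damped_20 = rms_spread(20, 1.0, 0.5)
damped_80 = rms_spread(80, 1.0, 0.5)
print(walk_20, walk_80)      # spread roughly doubles when the run is 4x longer
print(damped_20, damped_80)  # spread is about the same for both run lengths
```

The values of `sigma` and `relax` are arbitrary; the only point is the different growth behaviour of the two cases.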

On Earth models

I'll repeat something here from the last post; Pat Frank has a common misconception about the function of GCMs. He says that
"Scientific models are held to the standard of mortal tests and successful predictions outside any calibration bound. The represented systems so derived and tested must evolve congruently with the real-world system if successful predictions are to be achieved."

That just isn't true. They are models of the Earth, but they don't evolve congruently with it (or with each other). They respond like the Earth does, including in both cases natural variation (weather) which won't match. As the IPCC says:
"In climate research and modelling, we should recognise that we are dealing with a coupled non-linear chaotic system, and therefore that the long-term prediction of future climate states is not possible. The most we can expect to achieve is the prediction of the probability distribution of the system’s future possible states by the generation of ensembles of model solutions. This reduces climate change to the discernment of significant differences in the statistics of such ensembles"

If the weather doesn't match, the fluctuations of cloud cover will make no significant difference on the climate scale. A drift on that time scale might, and would then be counted as a forcing, or feedback, depending on cause.


Error propagation in differential equations follows the solution trajectories of the differential equations, and can't be predicted without it. With GCMs those trajectories are constrained by the requirements of conservation of mass, momentum and energy, enforced at each timestep. Any process which claims to emulate that must emulate the conservation requirements. Pat Frank's simple model does not.

Sunday, September 8, 2019

Another round of Pat Frank's "propagation of uncertainties".

See update below for a clear and important error.

There has been another round of the bizarre theories of Pat Frank, saying that he has found huge uncertainties in GCM outputs that no-one else can see. His paper has found a publisher - WUWT article here. It is a pinned article; they think it is a big deal.

The paper is in Frontiers in Earth Science. This is an open publishing system, with (mostly) named reviewers and editors. The supportive editor was Jing-Jia Luo, who has been at BoM but is now at Nanjing. The named reviewers are Carl Wunsch and Davide Zanchettin.

I wrote a Moyhu article on this nearly two years ago, and commented extensively on WUWT threads, eg here. My objections still apply. The paper is nuts. Pat Frank is one of the hardy band at WUWT who insist that taking a mean of observations cannot improve the original measurement uncertainty. But he takes it further, as seen in the neighborhood of his Eq 2. He has a cloud cover error estimated annually over 20 years. He takes the average, which you might think was just an average of error. But no, he insists that if you average annual data, then the result is not in units of that data, but in units/year. There is a wacky WUWT to-and-fro on that beginning here. A referee had objected to changing the units of annual time series averaged data by inserting the /year. The referee probably thought he was just pointing out an error that would be promptly corrected. But no, he copped a tirade about his ignorance. And it's true that it is not a typo, but essential to the arithmetic. Having given it units/year, that makes it a rate that he accumulates. I vainly pointed out that if he had gathered the data monthly instead of annually, the average would be assigned units/month, not /year, and then the calculated error bars would be sqrt(12) times as wide.

One thing that seems newish is the emphasis on emulation. This is also a WUWT strand of thinking. You can devise simple time models, perhaps based on forcings, which will give similar results to GCMs for one particular variable, global averaged surface temperature anomaly. So, the logic goes, that must be what GCM's are doing (never mind all the other variables they handle). And Pat Frank's article has much of this. From the abstract: "An extensive series of demonstrations show that GCM air temperature projections are just linear extrapolations of fractional greenhouse gas (GHG) forcing." The conclusion starts: "This analysis has shown that the air temperature projections of advanced climate models are just linear extrapolations of fractional GHG forcing." Just totally untrue, of course, as anyone who actually understands GCMs would know.

One funny thing - I pointed out here that PF's arithmetic would give a ±9°C error range in Hansen's prediction over 30 years. Now I argue that Hansen's prediction was good; some object that it was out by a small fraction of a degree. It would be an odd view that he was extraordinarily lucky to get such a good prediction with those uncertainties. But what do I see? This is now given, not as a reductio ad absurdum, but with a straight face as Fig 8:

To give a specific example of this nutty arithmetic, the paper deals with cloud cover uncertainty thus:

"On conversion of the above CMIP cloud RMS error (RMSE) as ±(cloud-cover unit) year-1 model-1 into a longwave cloud-forcing uncertainty statistic, the global LWCF calibration RMSE becomes ±Wm-2 year-1 model-1. Lauer and Hamilton reported the CMIP5 models to produce an annual average LWCF root-mean-squared error (RMSE) = ±4 Wm-2 year-1 model-1, relative to the observational cloud standard (81). This calibration error represents the average annual uncertainty within the simulated tropospheric thermal energy flux and is generally representative of CMIP5 models."

There is more detailed discussion of this starting here. In fact, Lauer and Hamilton said, correctly, that the RMSE was 4 Wm-2. The year-1 model-1 is nonsense added by PF, but it has an important effect. The year-1 translates directly into the amount of error claimed. If it had been month-1, the claim would have been sqrt(12) higher. So why choose year? PF's only answer - because L&H chose to bin their data annually. That determines GCM uncertainty!
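The sqrt(12) factor is simple arithmetic, and a few lines show it. This is a sketch of the accumulation being criticised, assuming (as described above) that the same RMSE is attached to every step and the steps are added in quadrature; the function name is mine, not from the paper.

```python
import math


def accumulated_uncertainty(per_step, n_steps):
    """Root-sum-square of n_steps equal per-step uncertainties."""
    return math.sqrt(n_steps * per_step ** 2)


rmse = 4.0   # W m-2, the Lauer and Hamilton figure
years = 20

annual = accumulated_uncertainty(rmse, years)         # treated as "per year"
monthly = accumulated_uncertainty(rmse, years * 12)   # same number, "per month"
ratio = monthly / annual
print(annual, monthly, ratio)
```

The ratio is sqrt(12) whatever the RMSE or the length of the period, so the claimed uncertainty is set by the binning interval rather than by anything physical.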

Actually, the ±4 is another issue, explored here. Who writes an RMS as ±4? It's positive. But again it isn't just a typo. An editor in his correspondence, James Annan wrote it as 4, and was blasted as an ignorant sod for omitting the ±. I pointed out that no-one, nor L&H in his reference, used a ± for RMS. It just isn't the meaning of the term. I challenged him to find that usage anywhere, with no result. Unlike the nutty units, I think this one doesn't affect the arithmetic. It's just an indication of being in a different world.

One final thing I should mention is the misunderstanding of climate models contained in the preamble. For example "Scientific models are held to the standard of mortal tests and successful predictions outside any calibration bound. The represented systems so derived and tested must evolve congruently with the real-world system if successful predictions are to be achieved."

But GCMs are models of the earth. They aim to have the same physical properties but are not expected to evolve congruently, just as they don't evolve congruently with each other. This was set out in the often misquoted IPCC statement

"In climate research and modelling, we should recognise that we are dealing with a coupled non-linear chaotic system, and therefore that the long-term prediction of future climate states is not possible. The most we can expect to achieve is the prediction of the probability distribution of the system’s future possible states by the generation of ensembles of model solutions. This reduces climate change to the discernment of significant differences in the statistics of such ensembles. "

Update - I thought I might just highlight this clear error resulting from the nuttiness of the /year attached to averaging. It's from p 12 of the paper:

Firstly, of course, they are not the dimensions (Wm-2) given by the source, Lauer and Hamilton. But the dimensions don't work anyway. The sum of squares gives a year-2 dimension component. Then just taking the sqrt brings that back to year-1. But that is for the uncertainty of the whole period, so that can't be right. I assume Pat Frank puts his logic backward, saying that adding over 20 years multiplies the dimensions by year. But that still leaves the dimension (Wm-2)2 year-1, and on taking sqrt, the unit is (Wm-2)year-1/2. Still makes no sense; the error for a fixed 20 year period should be Wm-2.

Saturday, September 7, 2019

August global surface TempLS down 0.057°C from July.

The TempLS mesh anomaly (1961-90 base) was 0.772°C in August vs 0.829°C in July. This contrasts with the 0.017°C rise in the NCEP/NCAR reanalysis base index. This makes it the second warmest August in the record, after 2016.

After three months of rise, SST was down by a little. There was a large cold blob NE of Japan. Europe was warm, except for Russia, but Siberia was mostly warm. The US was mixed, NW Canada cold. Antarctica was mostly warm. Africa was warm, and made the largest contribution to the global warmth.

Here is the temperature map, using the LOESS-based map of anomalies.

Tuesday, September 3, 2019

August NCEP/NCAR global surface anomaly up 0.017°C from July

The Moyhu NCEP/NCAR index rose from 0.372°C in July to 0.389°C in August, on a 1994-2013 anomaly base. Like the last two months, it was uneventful but globally quite warm.

NW Canada down into the US midwest was cool. E Siberia was warm, but W Russia was cool. The extremes were around Antarctica, with parts of the adjacent ocean being very cool, but warm areas on land, especially W Antarctica. Australia was cool.

Friday, August 30, 2019

Mapping projections for climate data

When I first planned to post sets of climate data, usually temperature, as color maps, I thought about what kind of projection to use. There is some general discussion here. I decided to mostly settle for simple lat/lon or equirectangular plotting for routine use, but making use of Javascript active facilities to present an actual sphere (simply projected on the screen) where better geometry was needed. I first used a small set of views which the user could switch between, and then a HTML 5 version in which the user could choose an arbitrary view of the sphere by clicking on a small map. Then I moved on to WebGL globes which could be dragged by mouse. This has been further developed using a general WebGL facility which simplifies coding and data input and gives useful default capabilities. Incidentally, I still find the key map a useful way of controlling the globe.

But I have come back to thinking about plane projections. One reason is that I post images on Twitter, where I can't use Javascript. Another is that I compare TempLS output with GISS and other indices, and often their default presentation is with some other projection. So I have looked at schemes such as Mollweide and Robinson, the latter being used by GISS and others.

This has linked in with another aspect of what I do, which is using Platonic solids as a basis for gridding for integration of temperature, and perhaps for graphics. As a by-product, I showed some equal area projections based on the cubed sphere. I have recently been using icosahedral grids in the same way, and that yields equal area maps with even less distortion, but which are also particularly informative for these applications.

So in this post I want to touch on some of these points. First the conventional projections.

Mollweide and Robinson

These are of course well-established. But I found the Wiki versions of the implementation unnecessarily hard to implement, so I'll describe in an appendix some simple approximating formulae. The Mollweide projection achieves equal area at a cost of some distortion near poles. Equal area can be desirable, as for example in not exaggerating the weather variations near poles. One particular use I will describe is in displaying the extent of coverage of temperature stations. Here is a typical map:

I am not a fan of the Robinson projection. It is billed as a compromise, but I think it is thus neither fish nor fowl. It is not much closer to equal area than lat/lon, and it does little to reduce distortion. It seems to be popular because it seems not unfamiliar to people used to equirectangular. However, it should be easy to implement, so there is little to lose by using it. I say should be, but in fact they make a considerable rigmarole, presenting a table of numbers to be used, and even fussing about how to interpolate. This may be because there are still ownership issues, although there seem to be no restrictions on use. Anyway, I'll describe my way of implementing in the appendix.

Icosahedral projection

I'm developing yet another method of integration, based on FEM on an icosahedral mesh. I'll post soon. But meanwhile it makes for an interesting projection.

It is very close to equal area. This is achieved at the expense of more numerous cuts. It is not so easy to figure out, and may not be the best projection for this purpose. But it is more useful for showing the distribution of stations, where area density is more critical. I'm using it to see reasons for possible failure of an integration scheme based on icosahedra. Here is the plot:

What next?

I'll try using the Robinson projection for the regular TempLS reports, mainly to make it easier to compare with others. I'll probably also occasionally show a Mollweide projection.

Appendix - calculation methods

As I mentioned, I found the Wiki versions surprisingly clunky, so I spent some time developing simpler ways. I need that partly because I have to be able to reverse. When I show a color plot, I need to define a grid of points on the projection, map back to the sphere for interpolation, and then bring the colors back.

Both the Mollweide and Robinson projections have a similar simple basis. There is one mapping of latitudes from old to new, and then a formula for scaling the longitudes. Latitudes remain as horizontal lines. So two functions of a single variable (each) need to be approximated. There is a little difficulty with singularity in the Mollweide.

In each case, the latitude transform is an odd function (about zero latitude), and the longitude scale is even. I require the approximation to be exact at the equator and poles, and a least squares fit elsewhere. So generally, if the original coordinates are (x,y), x=longitude, angles in right angle units, and the projection coords are (X,Y), the mapping is

Y=f(y)=y+y(1-y²)(a₀ +a₁y² +a₂y⁴ +...)
X=x*g(y), g(y)=1-d*y²+y²(1-y²)(b₀ +b₁y² +b₂y⁴ +...)
The inverses are
y=F(Y)=Y+Y(1-Y²)(c₀ +c₁Y² +c₂Y⁴ +...)
x=X/g(y)

For the Robinson projection I find the coefficients by fitting to the data tabulated by Wiki. d is just what is needed to get X right at y=1, so = 1-X there. From the table, that is 1-0.5322.

The coefficients are, to four terms:
b: (-0.0885, -0.3568, 0.8331, -1.0023)
c: (0.782, -0.6687, 1.1220, -0.8133)
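The form of the mapping is easy to code. Here is a minimal sketch of the forward functions f and g as written above. The coefficient lists passed in the example are arbitrary placeholders, not the fitted values; only d = 1-0.5322 is taken from the text. The point shown is that the factors y(1-y²) and y²(1-y²) make the approximation exact at the equator and poles whatever the fitted coefficients are.

```python
def poly_even(coeffs, y):
    """Evaluate c0 + c1*y^2 + c2*y^4 + ... by Horner's rule in y^2."""
    s = 0.0
    for c in reversed(coeffs):
        s = s * y * y + c
    return s


def f(y, a):
    """Latitude map Y = y + y(1 - y^2)(a0 + a1 y^2 + ...); odd in y."""
    return y + y * (1.0 - y * y) * poly_even(a, y)


def g(y, d, b):
    """Longitude scale 1 - d y^2 + y^2 (1 - y^2)(b0 + b1 y^2 + ...); even in y."""
    return 1.0 - d * y * y + y * y * (1.0 - y * y) * poly_even(b, y)


def project(x, y, a, d, b):
    """Map (x, y), angles in right angle units, to projection coords (X, Y)."""
    return x * g(y, d, b), f(y, a)


D_ROBINSON = 1.0 - 0.5322            # from the text: 1 - X at the pole
A_DEMO = [0.3, -0.2, 0.1, 0.05]      # placeholder values, NOT fitted
B_DEMO = [-0.1, 0.2, -0.3, 0.4]      # placeholder values, NOT fitted

print(project(2.0, 0.0, A_DEMO, D_ROBINSON, B_DEMO))  # equator: (2.0, 0.0)
print(project(2.0, 1.0, A_DEMO, D_ROBINSON, B_DEMO))  # pole: X = 2*0.5322, Y = 1
```

The inverse F has the same shape as f, so the same `f` function can be reused with the c coefficients.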

The Mollweide formula relates f(y) in terms of an intermediate variable a :
2a+sin(2a)=π sin(π/2 y)
So I generate a sequence of values of a between 0 and π/2, and hence y and Y sequences that I can use for fitting, to get f and F. But first there is a wrinkle near y=1; the expansion of the second equation there relates (1-a)³ on the left to (1-y)², and it is necessary to take fractional powers. The simplest thing to do is to first transform
and then do the expansion in z. Then it is necessary to map back again

However, a simplification is now that the longitude scale g is just sqrt(1-Y²), where Y=f(y) is the projected latitude. And d=1. So the series are:
c: (-0.06395,-0.0142,0.00135,-0.0085)
All these approximations give considerably better than three sig fig accuracy, which means error much less than a pixel.
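The tabulation step for the Mollweide fit can be sketched as follows. I assume the standard Mollweide form in which the projected latitude, scaled to 1 at the pole, is Y = sin(a); the function name and the number of steps are my own choices. The least squares fit and the transform near y=1 are not shown.

```python
import math


def mollweide_table(n):
    """Tabulate (y, Y) pairs by stepping the auxiliary angle a from 0 to pi/2.

    y is latitude in right angle units, found from
    2a + sin(2a) = pi * sin(pi/2 * y).
    Y = sin(a) is the projected latitude (assumed standard Mollweide scaling).
    """
    rows = []
    for i in range(n + 1):
        a = (math.pi / 2) * i / n
        s = (2 * a + math.sin(2 * a)) / math.pi   # this is sin(pi/2 * y)
        s = min(1.0, s)                           # guard against rounding at the pole
        y = math.asin(s) / (math.pi / 2)
        rows.append((y, math.sin(a)))
    return rows


table = mollweide_table(90)
print(table[0], table[45], table[-1])
```

Stepping a rather than y avoids solving the transcendental equation: each a gives y directly, and the resulting pairs serve for fitting both f (y to Y) and F (Y to y).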

Friday, August 16, 2019

GISS July global up 0.01°C from June.

The GISS V4 land/ocean temperature anomaly rose 0.01°C in July. The anomaly average was 0.93°C, up from June 0.92°C. It compared with a 0.061°C rise in TempLS V4 mesh (I originally posted a 0.033°C rise, but this drifted upward with late data). As with TempLS, it was the warmest July in the GISS record, also by a considerable margin (0.08°C warmer than 2016).

The overall pattern was similar to that in TempLS. Hot in Africa and most of Europe, extending into Central Asia. Cool in NW Russia. Coolish in much of N America, but warm in Alaska, extending across Bering Strait. Warm in Greenland, mixed in Antarctica.

As usual here, I will compare the GISS and earlier TempLS plots below the jump.

Friday, August 9, 2019

Active WebGL plot of decadal regional temperature trends using ERSST V5 and GHCN V4

I have maintained a page of local trends over periods that users could choose. It was based on GHCN V3, and mesh display, and can still be seen here. But I need to upgrade to GHCN V4, and I have decided to update to LOESS graphics as well. But there is one further upgrade - instead of a choice of a fixed number of intervals ending in present, you can now choose any period of decades back to 1900. The maths of this is quite interesting, and I'll say more below. The new page is here, with the link in the page list top right.

The plot itself is the usual WebGL trackball. You can drag the globe around, or more quickly relocate by clicking on the small map above. Clicking on the plot shows the trend for the chosen period at that location. You can choose periods with the buttons on the right. The endpoints are colored, so the start state of 1980 and 2020 means the period will be Jan 1980 to Dec 2019, with missing months suitably handled. If you click outside the range, the range will extend; if you click inside, the red color will move to your choice. If you wanted to move the pink end, click the pink button to make it red. When you have chosen, click the Show button at the top to get the new plot. The average global trend for the period will show at the bottom as well.


As usual, the sphere is a trackball that you drag into position, or you can quickly set by clicking on the small map at the top. Beside that map are checkboxes which let you switch the objects displayed. The icosahedral mesh and nodes are initially not shown.

The map is created by first getting monthly averages on the 5762 icosahedral nodes, as described here and here. The trends are then calculated on those nodes. The LOESS method takes a weighted local linear regression on the closest station/SST data, even if it is not close at all. In Antarctica, for example, before 1960 that generally means ocean data. So trends for Antarctica for early times should be ignored. Elsewhere, there is loss of resolution according to station data, but it is still reasonably based. GHCN V4, of course, has much better coverage than V3.
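To show what a weighted local linear regression does, here is a one-dimensional sketch. It is not the TempLS code: I assume points on a line rather than on the sphere, the k nearest data points, and the usual tricube weight, none of which is specified above. It does show the feature mentioned, that the fit always uses the nearest data, however far away that is.

```python
def loess_point(x0, xs, ys, k):
    """Local linear fit at x0 from the k nearest points, tricube-weighted."""
    nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x0))[:k]
    # Bandwidth: distance to the farthest of the k neighbours (1.0 if all at x0).
    h = max(abs(x - x0) for x, _ in nearest) or 1.0
    sw = swx = swy = swxx = swxy = 0.0
    for x, y in nearest:
        u = abs(x - x0) / h
        w = (1.0 - u ** 3) ** 3          # tricube weight, zero at distance h
        dx = x - x0
        sw += w
        swx += w * dx
        swy += w * y
        swxx += w * dx * dx
        swxy += w * dx * y
    det = sw * swxx - swx * swx
    if det < 1e-12:                      # too few distinct points: weighted mean
        return swy / sw
    slope = (sw * swxy - swx * swy) / det
    return (swy - slope * swx) / sw      # fitted value at x0


xs = [float(i) for i in range(10)]
ys = [3.0 * x + 1.0 for x in xs]
print(loess_point(4.3, xs, ys, 5))    # inside the data
print(loess_point(20.0, xs, ys, 5))   # far outside: still uses the 5 nearest points
```

The second call is the analogue of early Antarctica: a value is returned, but it is an extrapolation from data that is not close at all.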

Note that the color scheme is centered for zero trend, but the range varies with the length of period.


I think there is a lot to learn from the graphic, and I'll write a more detailed post. For example, recent periods show the extent to which warming dominates the Arctic, but if you look at the most recent decade, it's more mixed, with pronounced cooling over Greenland and the Canadian islands, but warming around Bering Strait. In earlier periods the warming extends to N Siberia as well.

It's interesting to look at the period 1910-1940, often used by skeptics to say that AGW is refuted. It's often accompanied by a whinge about how that warming is being suppressed, often showing a plot of Hansen in 1981 or NCAR in 1974 to claim that their warming has been watered down since.

But this plot shows what was happening. Again Arctic warming dominates, and to a lesser extent, N America and N Atlantic. But the S Hemisphere and most oceans show very little warming. Those earlier plots were land only, with data heavily weighted to the N Hemisphere. The reduced warming in later calculations has the advantage of this knowledge.

I originally started out here to do a corresponding plot of the differences between GHCN V3 and V4, and that will be an upcoming post. It is working - just a little more checking.

Trend methods

I'll just say a little about the data handling here. I try to keep the volume of data down; not so much for my web storage, but because of download time. So I used moments. The zero'th moment of numbers yₖ with location is just the sum. The first central moment is the sum Σ(k-kₒ)yₖ, where kₒ is the mean. And the trend is just the first central moment, normalised by division by the moment of a unit trend.

There is a trick with moments familiar from calculating angular momentum, say. To get the moment of some bodies, you can just add the moments of their masses (zeroth moments) at their centres of mass, and add in the central first moments of each body. So here, I can just calculate and transmit the zero'th and first moments of the decades, and then I can work out the trend for any sequence of decades.
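The moment trick can be sketched as below. This is my own minimal version, assuming each block (a decade, say) is a run of consecutive equally spaced values with none missing; the function names are hypothetical. Only the count, centre, zero'th moment and first central moment of each block are kept, and the trend for any run of blocks is recovered from those.

```python
def block_moments(ys, start):
    """Count, centre, zero'th and first central moment of one block.

    start is the time index of ys[0]; the centre is the block's mean index.
    """
    n = len(ys)
    centre = start + (n - 1) / 2
    m0 = sum(ys)
    m1 = sum((start + i - centre) * y for i, y in enumerate(ys))
    return n, centre, m0, m1


def trend_from_blocks(blocks):
    """Least squares slope over the blocks, using only their stored moments."""
    total_n = sum(n for n, _, _, _ in blocks)
    mean_k = sum(n * c for n, c, _, _ in blocks) / total_n
    # Shift each block's first moment to the overall centre.
    moment = sum(m1 + (c - mean_k) * m0 for _, c, m0, m1 in blocks)
    # Moment of a unit trend y = k: n(n^2-1)/12 within a block, plus the shift.
    unit = sum(n * (n * n - 1) / 12 + n * (c - mean_k) ** 2
               for n, c, _, _ in blocks)
    return moment / unit


series = [0.5 * k + (k * 7) % 5 for k in range(30)]
blocks = [block_moments(series[s:s + 10], s) for s in (0, 10, 20)]
print(trend_from_blocks(blocks))      # trend over all three blocks
print(trend_from_blocks(blocks[1:]))  # trend over the last two only
```

Four numbers per block replace the whole series, and any contiguous choice of blocks gives the same slope as a direct regression on the underlying data.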

Tuesday, August 6, 2019

July global surface TempLS up 0.033°C from June.

The TempLS mesh anomaly (1961-90 base) was 0.813°C in July vs 0.78°C in June. This is a little more than the 0.018°C rise in the NCEP/NCAR reanalysis base index.

In contrast to the reanalysis, though, it did make it the warmest July in the record, and by a significant margin - 0.744°C in 2016 was next. The prospect of July being warmest has started a number of "hottest month ever" stories. I don't join in that, because it is only true if you add the seasonal cycle that makes NH summer the global peak. There are good reasons why we quote anomalies rather than absolute temperatures, so deviating for this claim can only cause confusion. In any case the seasonal cycle is predictable; the news is the anomaly. And July was nowhere near the highest anomaly of all months. Last March, for example, was 0.982°C.

Interestingly, the increase was due mostly to an SST rise, as was last month. Rising SST tends to continue for longer periods. On land, much of North America was cool, except Alaska, and some of E Canada. S Europe was warm, but there was a cool spot in NW Russia. There was a warm band through central Asia to Iran and on to the Sahara. Antarctica was mostly warm.

Update 15 Aug: Readers following the TempLS report may have noticed some unexpected variations in the July average. About 4 days ago the mesh version dropped to about the June level, and more seriously, the SST dropped back. TempLS LOESS was unaffected, as were the maps and graphics. The reason was that I had been tinkering with the cross product library routine that the mesh area weighting uses. I have fixed that, and now the surprise is that July is 0.061°C warmer than June, rather than the posted 0.033°C. But I think that is genuine - it had been drifting higher before the sudden drop.

I'll show the bar chart of contributions; these are basically temperatures weighted by area. Africa and Antarctica were most responsible for the rise.

Here is the temperature map, using the LOESS-based map of anomalies.

Saturday, August 3, 2019

July NCEP/NCAR global surface anomaly up 0.018°C from June

The Moyhu NCEP/NCAR index rose from 0.354°C in June to 0.372°C in July, on a 1994-2013 anomaly base. This index fell last month when most others rose, so it remains on the cool side. While there is much talk of July being the hottest ever, allowing for the seasonal cycle, in this NCEP/NCAR index it was behind 2016 at 0.414°C.

North America was mostly cool, except Alaska. S Europe was warm, but E Europe and W Russia were cool. There was a warm band through central Asia to Iran. Antarctica had both very warm and very cool patches, the cool being mostly ocean.

Thursday, August 1, 2019

Unadjusted GHCN V3 and V4 global differences due to coverage, not content.

In yesterday's post I did a graphical study of the differences of GHCN V3 and V4 anomalies for June 2019. The idea was to try to pin down what caused a small discrepancy in the global average for that month. Clive Best has been looking at this, and thinks there may be a systematic change in reported temperatures between versions. My conclusion was that the difference was not because V4 was reporting higher temperatures, but that the greater coverage fixed a problem in V3 that in sparsely covered areas, sea temperatures would have an influence on land. I should add that this is a potential problem in TempLS and in Clive's similar code based on triangle mesh. I think these are very good methods, but the main codes do have one feature that they process land and sea separately, which should reduce this effect.

Clive sought to establish a link by finding a common set of V3 and V4, and seeing whether the difference persisted. I wasn't sure of this, because while there may be a common inventory, the stations in any one month won't correspond. I proposed a test in which the locations of GHCN V3 stations reporting in a month would have values interpolated (by mesh) from the V4 result. But they would then be analysed using the V3 mesh. I suspected that the result would be closer to V3 than V4, which would indicate that the difference was due to the reduced coverage, rather than a discrepancy between the temperature levels of the measures.

I'm reporting on that here. To summarise what I did - I had results from a run of the mesh method for both V3 and V4, reported here. From the V4 directory, I copied the node locations Z4, the anomalies A4, the weights W4, and I rebuilt the mesh M4. TempLS does integration by calculating weights, so the global average is the scalar product W4.A4. From the V3 directory, I copied the locations Z3 and weights W3. Then I linearly interpolated A4 onto Z3, using the mesh M4, giving result Aa. The "hybrid average" is then the scalar product W3.Aa, and it is being compared with W4.A4 and W3.A3.
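The arithmetic of the hybrid average can be sketched in a few lines of Python. This is only a toy illustration, not the TempLS code: it assumes a single flat triangle instead of a mesh on the sphere, the helper `barycentric_weights` is hypothetical, and all the numbers are made up. It shows the two steps described above: linear interpolation of A4 onto the V3 locations, then a scalar product with the V3 weights.

```python
import numpy as np

def barycentric_weights(p, tri):
    """Barycentric coordinates of 2-D point p in triangle tri (3x2 array)."""
    a, b, c = tri
    m = np.column_stack((b - a, c - a))   # columns are the two edge vectors from a
    s, t = np.linalg.solve(m, p - a)      # p = a + s*(b - a) + t*(c - a)
    return np.array([1.0 - s - t, s, t])

# Hypothetical toy data: one V4 triangle with anomalies at its three nodes.
Z4 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # V4 node locations
A4 = np.array([0.2, 0.8, 0.5])                        # V4 anomalies
W4 = np.array([1 / 3, 1 / 3, 1 / 3])                  # V4 integration weights

# Two hypothetical V3 stations inside that triangle, with their own weights.
Z3 = np.array([[0.25, 0.25], [0.5, 0.25]])
W3 = np.array([0.5, 0.5])

# Linearly interpolate A4 onto the V3 locations, then integrate with V3 weights.
Aa = np.array([barycentric_weights(p, Z4) @ A4 for p in Z3])
hybrid_average = W3 @ Aa   # the "hybrid average" W3.Aa
v4_average = W4 @ A4       # the V4 average W4.A4
```

In the real calculation each V3 station would first have to be located in its enclosing M4 triangle; that search is omitted here.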

So here is the first plot. I am presenting them as difference plots, because, as you can see from the plots here, the differences are small and hard to see against the full range of variation of the anomalies. Here is a plot of the last 30 months, with unadjusted data.

The V4 plot seems to jump around more, and clearly the V3 plot is a better match. In reality, the V3 anomaly fluctuates just as much as V4, if not more, but the hybrid tracks better with V3. I'll show another plot of the last 60 Junes:

There are more interesting features here. Back to about 1990, the hybrid fairly closely tracks V3, with quite a lot of variability between V3 and V4, mostly with V4 warmer. But before 1990 V4 calms down, with only occasional spurts from V3, now not tracked by the hybrid. It's hard to know what to make of this; except for V3 excursions, both V4 and V3 seem in agreement with the hybrid, and so with each other.

The story is slightly different with adjusted data, V3 and V4. Here is the plot of the last 30 months:

The V3 difference still jumps around a lot less, but there is a fairly persistent offset of up to 0.02°C. This might reasonably be attributed to an effect of homogenisation. Finally, here are the adjusted data averages differenced over the last 60 Junes.

The adjusted pattern reflects the same interesting features as the unadjusted, with big differences between V4 and V3, and the hybrid mostly tracking V3, but with a departure during the last decade. Before 1990, less variation generally, with no obvious tendency for the hybrid to preferentially track V3.

The purpose of this post was to give some quantitative backing to the effects on the average that might be expected from the patterns shown in the last post. The adherence of the unadjusted data hybrid to V3 does not support the idea that there is a recent decades bias in the V4 measured data relative to V3. The performance of adjusted data does suggest a small effect arising from the V4 data, with some rather interesting behaviour going back before 1990 (for Junes).

Wednesday, July 31, 2019

Why is this June hotter seen with GHCN V4 than V3? - and lots of active graphics.

This post is a follow-up to one a few days ago on differences seen calculating monthly global averages using TempLS and version 4 of GHCN rather than V3. It followed a post of Clive Best, who has a similar program, and was finding differences. I too found that June 2019 rose about 0.07°C while using GHCN V3 did not show a rise. I think overall differences are small, but I wanted to look at the underlying arithmetic.

So, as foreshadowed, I adapted my program to use its LOESS based calculation and graphics, in which I could calculate differences. But there was mission creep, as I found that being able to put disparate data on the same equally spaced grid made a lot of other things possible. So I showed also the effects of homogenisation. It does answer the question of why V4 made a difference this month, but there is a lot more to learn.

First let me tell the many uses of the main graphic, which is shown below. It is the familiar WebGL trackball Earth. You can drag it about and zoom. Click on the little map to quickly center at a chosen point. But importantly for this inquiry, you can control the content. The checkboxes top right let you switch on/off the display of V3 and V4 or SST nodes, or the shading (called "loess"), or even the map. And the radio buttons on the right give the choice of five data sets for June 2019, which are
  • Un V4-V3 which is the difference of TempLS anomalies using unadjusted GHCN data from V4 and V3.
  • Adj V4-V3 the corresponding difference using adjusted GHCN data (QCF, pairwise homogenised)
  • V4 Un - Adj the difference between unadjusted and adjusted data, for a V4 calculation
  • V3 Un - Adj same for V3
  • V4 Unadjusted just the TempLS anomalies using V4. It is the LOESS version of my regularly updated mesh plot.
So I'll show the plot here, with below an expanded discussion of what can be learnt from it.

Sunday, July 28, 2019

Comparison of surface temperature indices going from GHCN V3 to V4.

I have written quite a lot about TempLS V4, which was prompted by the need to make use of the extended global land temperature database GHCN V4. However, the reality was that there really wasn't much difference, IMO. However, I saw that Clive Best, in posting his June average, had given chief prominence to the value calculated with V3, and this showed a drop of 0.04°C. Now Clive's method of triangulation is similar to my TempLS mesh, although I use ERSST V5 for ocean rather than his HADSST3. And we usually get very similar results. However, this result was rather different from my 0.067°C rise. Other indices generally agreed with mine.

Looking further, it turns out that calculating with V4 gave a rise of 0.04°C, closer to the finding of others. But he was inclined to emphasise the difference, leading up to a striking tweet in which he said that transition from V3 to V4 was responsible for a 0.2°C discrepancy. So maybe that needs more attention.

An extra oddity came when his post was reposted at GWPF, where their spin was that "The global averaged surface temperature for June 2019 was 0.62C, back down to where it was before the 2015/16 El Nino" or, on Twitter, "Global Temperature Falling Again". This was rightly mocked on Twitter.

So, I ran TempLS again with V3, unadjusted and adjusted. It isn't clear to me which version Clive was using. But I regularly use unadjusted, and I think this gives the best guide to the changes in the dataset, as opposed to the effect of homogenisation.

The first thing to say is that I did get a very small change with V3, at 0.004°C rise. That is about 0.063°C less than with V4, a similar difference to Clive. As we'll see that is a moderately large discrepancy by past experience, although not an outlier. Anyway, let's see some graphs.

Wednesday, July 17, 2019

Comparison between global temperature indices following GHCN V4; changes since 2015

I had noticed that recently the concordance of GISS with the more advanced TempLS methods seemed to have improved, and I wondered whether there might be a general improvement associated with the adoption of GHCN V4, with the big increase in land stations. In 2015, I posted a study of the extent to which a rather large set of indices mutually agreed. It included land, SST and troposphere measures. I may revisit that. But for the moment, I want to look in a similar way at just the surface (land and ocean) indices. Since they seek to measure the same thing, differences can be attributed to method rather than physics.

In that earlier post, my measure was the standard deviation (sd) of differences between monthly index values over the most recent 35 years. That was to fit with the satellite data, which is not used here. But I will stick with the period (updated), because while there doesn't seem to be much sensitivity to the choice, I want to concentrate on method differences rather than data, which might diverge at longer times ago. Data sources are listed here. The sd measure is not affected by different anomaly bases.
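For concreteness, here is a minimal Python sketch of that measure. The function name and the numbers are hypothetical, and it assumes that a change of anomaly base amounts to adding a single constant to an index. Under that assumption the constant cancels out of the standard deviation of the differences.

```python
import numpy as np

def sd_of_differences(x, y):
    """Sample standard deviation of the month-by-month differences of two indices."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return d.std(ddof=1)

# Hypothetical monthly anomalies for two indices (made-up values).
index_a = np.array([0.50, 0.62, 0.58, 0.71, 0.66])
index_b = np.array([0.48, 0.65, 0.55, 0.70, 0.69])

sd = sd_of_differences(index_a, index_b)

# Putting index_a on a different anomaly base (a constant shift) leaves sd unchanged.
sd_shifted = sd_of_differences(index_a + 0.3, index_b)
```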

So here, as an overview, is the current set of standard deviations, according to the color scale in the key on the right. Red means better agreement.

The best agreement is between the various methods of TempLS, as described most recently here, with an overview here. It is so much better that I have used a color out of the rainbow scale to show it. Differences of the three advanced methods have sd's of about 0.01°C. That is about a third of the nearest difference between indices from different sources.

The next best agreement (0.027°C) is between TempLS grid and NOAA Land/Ocean. I commented, also in 2015, on how NOAA and TempLS grid were eerily close; I showed comparison graphs. That closeness has persisted, and is a reason why I keep posting TempLS grid, which I otherwise think is a very primitive method. So the fact that NOAA is so close makes me worry about that index. But anyway, it is now even closer. I have modified the grid method to use a cubed sphere mesh, which I think is much better than lat/lon.

Almost as good is the agreement between GISS and the advanced TempLS methods. As I shall show, this has improved with V4. TempLS LOESS and Infill have lowest sd, at about 0.031; mesh is a little more at 0.035°C.

The five non-TempLS indices are shown in the top corner. Their levels of agreement are much lower. The Cowtan and Way kriging index has an sd of 0.045 with both GISS and BEST, but less agreement with NOAA and HADCRUT. The best agreement (0.039) is between HADCRUT and NOAA; these have always seemed to act as a separate grouping. GISS and BEST agree about as well (0.045) as they do with C&W. BEST has the greatest disagreement, with both NOAA and HADCRUT.

I posted the data back in 2015, so now I'll use it to show how these concordances have changed. In the following plot, the current sd is divided by the sd reported in 2015. A red value indicates reducing sd (improvement). TempLS LOESS is omitted because it did not exist in 2015.

The biggest changes are associated with TempLS, where methods have improved, particularly with Infill. In 2015 this was a heuristic method, which seemed to give a large improvement. But now I solve a diffusion equation to convergence, which seems to be better again. The sd with GISS is about halved, and is, by a hair, the best agreeing of any TempLS. Because it shifts further towards the other advanced TempLS methods, it moves away from the grid method, and so also from NOAA, which shows as decreasing agreement. The improved agreement (4x) with TempLS mesh is the greatest change of all.

The other marked changes are with BEST. 2015 was still fairly early in its life cycle, and most noticeable is the increasing disagreement with NOAA and HADCRUT. But it also doesn't agree with anything very well.

The other indices, interacting with each other and with TempLS mesh, show little change. T mesh was stable over that period. There is some deterioration of agreement between HADCRUT and GISS, which could be due to the introduction of ERSST 4 and 5, which adjust for the introduction of drifter buoys in SST measurement. HADSST is just bringing out V4 which may implement that.

Here is a more detailed quantification of the changes. There are 9 plots, showing for each index the sd's of the differences with the others (green). Overlaid in transparent blue is the corresponding sd from 2015. For TempLS LOESS, I have used the 2015 sd's of TempLS mesh. Use the arrows below to cycle through the plots.
In the first plot (GISSlo) the TempLS advanced indices (TM, TL, TI) show best agreement, and also improvement (faint blue is 2015). Agreement with HADCRUT is worse. Of the other plots:
  • HA HADCRUT - almost everything is worse, especially BEST. The best agreement is with NOAA and TempLS grid.
  • NO NOAA - not much change, except for lower agreement with BEST. But not a high level of agreement.
  • BE BESTlo - again much increased, and high, disagreement with NOAA/HADCRUT. Otherwise small changes toward more agreement.
  • CW Cowtan and Way - much improved agreement with TempLS; fair agreement unchanged elsewhere.
  • TM TempLS mesh - good and improved agreement with GISS and TempLS grid. Very good agreement with TempLS LOESS and Infill, with Infill much improved (due to Infill method improvement).
  • TL TempLS LOESS - as for mesh. LOESS did not exist in 2015.
  • TI TempLS Infill - very good and improved agreement with TM and TL. Also improved wrt GISS and CW; HA, NO, TG somewhat worse.
  • TG TempLS grid - mostly substantially improved, and not bad, except for BEST and CW. Slightly worse relative to BEST and TI. The good, and further improved, agreement with NOAA has been noted.
Overall, I think it is important to note that even the worst disagreements are not so bad - about 0.075°C. There is a marked tendency to clump, with HADCRUT/NOAA/TempLS grid as one group, and GISS+TempLS(TM, TI, TL) as another, with BEST and CW more loosely attached.

To put the size of these differences in context, they range from 0.01, which I called very good, to about 0.075, which was about the worst. But I did a quick similar analysis between HADCRUT, UAH and RSS. The result is here:

The best agreement there is between the satellite measures, at about 0.1°C. Agreement between surface and satellite is in the range 0.125 to 0.145°C.

I have posted the data for this post in a zipfile, with readme.txt, here.

Tuesday, July 16, 2019

GISS June global up 0.07°C from May.

The GISS V4 land/ocean temperature anomaly rose 0.07°C in June. The anomaly average was 0.93°C, up from May 0.86°C. It compared with a 0.067°C rise in TempLS V4 mesh. As with TempLS, it was the warmest June in the record, also by a considerable margin (0.11°C).

The overall pattern was similar to that in TempLS. Hot in most of Europe, extending into Africa and the Middle East. Cool in W Siberia, but hot in the NE. Cool in US/Canada, but warm in S America. Antarctica was mostly cool.

As usual here, I will compare the GISS and earlier TempLS plots below the jump.

Wednesday, July 10, 2019

Revised June global surface TempLS up 0.067°C from May.

I made an error in the previously posted TempLS for June. The rise is now 0.067°C instead of 0.096°C. The June anomaly was 0.782°C instead of 0.811°C. The difference in global average is actually small, and 2019 was still easily the warmest June in the record, but by 0.07°C, not 0.1°C.

Although the overall difference is small, the error was major - my calculation used May SST values, not June. Teething problems with the V4 system. Fortunately, the hemisphere effects more or less balanced, but the map looks different. I had noted earlier that there was a marked hemisphere difference. I did look further into that, which revealed the error, but I should have twigged sooner. Land is unaffected, and so most of my previous comments still hold.

Here is the revised temperature map, using the LOESS-based map of anomalies.

The original map is preserved on my tweet here.

Saturday, July 6, 2019

June global surface TempLS up 0.096°C from May.

Update - see revision here. The average is down by just 0.03°C, but SSTs were wrong month, and the map now looks different (correct version below).

The TempLS mesh anomaly (1961-90 base) was 0.811°C in June vs 0.715°C in May. This contrasted with the drop (0.056) in the NCEP/NCAR reanalysis base index. It was the hottest June in the record, 0.1°C higher than June 2016.

I am now showing TempLS LOESS as the alternative (rather than grid); I think it is about as good a method as mesh. It showed a rise of 0.114°C.

There was a marked global pattern (caused by error - see update), with tropics and SH mostly warm, and the extratropical NH cool, with the notable exception of Europe, which was very warm indeed, and NE Siberia likewise. The mostly cool Antarctica was also an exception.
Here is the temperature map, using the LOESS-based map of anomalies.

And here is the map of stations reporting:

Wednesday, July 3, 2019

June NCEP/NCAR global surface anomaly down 0.056°C from May

The Moyhu NCEP/NCAR index fell from 0.41°C in May to 0.354°C in June, on a 1994-2013 anomaly base. That is the third successive fall since the high point in March, and brings the temperature back to between January and February. Still warmer than June 2018 or 2017, and close to June 2016.

It was mostly cool in N America, warm in Europe (except near Atlantic). Quite warm in NE Siberia/Alaska, and mixed, but mostly cold, in Antarctica.

The BoM ENSO Outlook has been reset from Watch:
"The ENSO Outlook has been reset to INACTIVE. The immediate likelihood of El Nino developing has passed with ENSO-neutral the most likely scenario through the southern winter and spring."

Tuesday, July 2, 2019

Fake charge of "tampering" in GISS

I was commenting on an interesting post (part of a series) at Clive Best's blog. He's been looking at the differences between Hadcrut 3, of about 2012 vintage, and current Hadcrut 4.6. There are some, and I may blog about that. The most obvious difference is that the number of stations in the inventory has nearly doubled. But Clive was focussing on changes to locations that were common to both. I did some analysis, part reported here.

As is apt to happen, there were undercurrents that data is being manipulated for some underhand purpose, and Clive was entertaining the idea that the Pause was being suppressed. Not jumping to conclusions, though, but some were more inclined to. There has indeed been a noticeable increase over those years in the trend during the Pause period. This is overdue, since Cowtan and Way showed in 2013 that HADCRUT's deficiency in Arctic stations was responsible for the difference in Pause trend between theirs and other indices.

Anyway, among dark talk about Hadcrut suppressing the Pause, Paul Matthews commented that GISS had done the same thing, and between 2017 and 2019. This surprised me, because I follow GISS, and compare it with TempLS, and did not know of such changes, which if present would presumably relate to transition from GHCN V3 to V4. Gavin Schmidt has also said that the effect of this was very small.

So I followed Paul's link, which led to a Tony Heller post titled "Tampering Past The Tipping Point". It showed the following plot (followed by many more):

And as usual there, the plot and post seem to have circulated widely. You can see a long Twitter listing here of tweets linking to it. So what is it based on?

As often with Heller's posts, it isn't about what most of his audience thinks it is, but they don't seem to worry about fine points. It isn't the GISS land/ocean (LOTI) that gets widely circulated and discussed. The heading says "GISS Global Land Surface anomaly". But GISS doesn't have a Land Surface anomaly index, unlike NOAA or HADCRUT (CRUTEM). So my first thought was that he was plotting the "Met Stations Only" index, Ts. He has done that before, and the years quoted (2000 and 2017) do correspond, more or less, to what is supplied on the GISS History Page (scroll down to where "Met Stations" appears in the headings). I'll digress a little to explain this index.

GISS Ts index

GISS Ts is no longer shown on the main page, although it did have more prominence in V3. Now it is relegated to the History Page, with the introduction:
"For historical reasons we also maintain a calculation of the anomalies that would result if one only used the meteorological station data. This estimate is not affected by issues in ocean data processing, but because the land is warming faster than the ocean, it has a larger trend than the land-ocean index that is now our standard product. That too has been remarkably stable over the years:"
And with that, they give, as they do with LOTI, a plot of the data as it had been presented at various stages of GISS history, going back in fact to 1981. You can see both plots of the curves together, and their differences from current. And indeed the differences are small, especially recently.

The "historical reasons" are that, until about 1995, there didn't exist a dataset of sea temperatures of anything like the duration of the land record. So when Hansen and Lebedeff in 1987 published the ancestor of the GISS index, they used whatever station data they could get to estimate surface temperature over the oceans as well as land. Islands had a big role there. This index, called Ts, or GLB.Ts, was their main product until the mid '90's, when it was gradually supplanted by LOTI, using ocean sea surface temperatures (SST) as needed, as they became available backward in time.

Update. As CCE notes in comments, with GISS V4, the Ts index is not only relegated to the History page; it is not calculated in V4 at all. The numbers I have used are the latest V3.


However, Paul insisted that there was a land index, and pointed to the Analysis Graphs and Plots page. If you scroll down to the heading "Annual Mean Temperature Change over Land and over Ocean" and open, it shows a plot of anomalies over land and over ocean, and below it gives links to data.

Now this is something different to GISS Ts. It also uses station data, but to estimate the average for land only. All such averages are area-weighted, but here it is just by land area. So from being very heavily weighted, island stations virtually disappear, since they represent little land. And the weighting of coastal stations is much diminished, since they too in Ts were weighted to represent big areas of sea.

The important message here is that Ts and Land are not the same, which I will now show with some graphs. Data is sourced and linked at the bottom.

Recent History, 2017 and 2019

Tony Heller provided a spreadsheet with his post, and it had the GISS data for versions of Ts up to 2017, and the Land data for 2019. I have described details of this here and following. But GISS Ts does of course go to present (May 2019), which is regularly posted here. And you can get past versions of the Land average plots with data on the Wayback Machine - here is the version of Jan 2017. So let's look at annual Ts, with 5 year running smoothing:

They are actually very similar. I'll give a combined difference plot later. What about Land?

Not quite as close, but also similar. The main difference is that pre-1900 is warmer in the current version, reducing the trend since 1880 from 1.05 °C/century to 1.0 °C/century. The trend of Ts also reduced slightly. Not much sign of data tampering here! In fact, given the number of extra stations in GHCN V4, there is remarkably little change.
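For anyone wanting to reproduce the smoothing and trend figures from the annual data, here is a minimal Python sketch. It assumes a centered 5 year running mean and an ordinary least squares trend; the function names are hypothetical and the series is synthetic (built to have a trend of exactly 1.0 °C/century), not GISS data.

```python
import numpy as np

def running_mean(values, window=5):
    """Centered running mean; the result is window-1 points shorter than the input."""
    return np.convolve(values, np.ones(window) / window, mode="valid")

def trend_per_century(years, anomalies):
    """Ordinary least squares slope, converted from °C/year to °C/century."""
    slope, _intercept = np.polyfit(years, anomalies, 1)
    return 100.0 * slope

# Synthetic annual series with a built-in trend of 1.0 °C/century (not real data).
years = np.arange(1880, 2019)
anoms = 0.01 * (years - 1880) - 0.4

smoothed = running_mean(anoms)           # 5 year running smoothing
trend = trend_per_century(years, anoms)  # close to 1.0 for this series
```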

Now I'll plot the Ts and Land averages superimposed on Tony Heller's "tampering" plot. But because the 2017 and 2019 versions are so similar, the plot gets cluttered. To make better use of space, I have truncated some of the big colorful annotations. I'll plot just the 2017 version of Ts and the 2019 version of Land. Not coincidentally, these are the versions of each found in Tony's spreadsheet.

They superimpose exactly! What has been presented as a "tampering" is in fact a plot of two different datasets, representing two different things. To emphasise that, I'll now plot 2019 versions of both Land and Ts:

Also a very good fit. The difference between the red and the green curve isn't "tampering" over time. It's the same difference if you take the current versions. They are just two different datasets representing two different things.

Getting it right.

As mentioned, I originally set this out in comments at Clive Best's site, where Paul Matthews first raised the Tony Heller post. I then noted that at that (Heller's) site, a commenter Genava had observed that the 2019 data plotted was different from the 2019 Ts data, which was the index of the 2001 and 2017 versions. That was on June 27. It got no response until Paul, probably prompted by my mention, said that the 2019 data was current Land data. I don't think he appreciated the difference between Land and Ts, so I commented June 28 to try to explain, as above. Apart from a bit of routine abuse, that is where it stands. No-one seems to want to figure out what is really plotted, and comments have dried up. Meanwhile the Twitter thread castigating "tampering" just continues to grow.


The data plotted are year versions of the GISS Ts Met Stations Only index and the GISS annual data for Land Averages. The sources are, in ascii format:
GISS Ts current (2019) version
GISS Ts historic, includes 2017 version in zip file
Land average current, csv format
Land Average 2017 wayback version, txt

The data I used are in a .csv file here.

Tuesday, June 18, 2019

GISS May global down 0.11°C from April.

The GISS land/ocean temperature anomaly fell 0.11°C in May. The anomaly average was 0.86°C, down from April 0.97°C. It compared with a 0.149°C fall in TempLS V4 mesh. I should note that there were late moves in both indices. GISS' April figure dropped 0.02°C between posting last month and now, making a total drop of 0.13. I note that GISS says that this month is the first using GHCN V4 (other than beta); the change may be part of that. The drop in TempLS is somewhat less than originally posted. I posted that rather early, since all major countries seemed to have coverage, and I got stable results two days in a row. The latter proved illusory, though. A minor nuisance with GHCN V4 is that they keep adding to the inventory. I have resolved to only update it once a month, and this was the first time it had been updated automatically. It didn't quite work; the net result was that my apparent stability was due to data not being updated. It should be OK in future.

The overall pattern was similar to that in TempLS. Warm in most of Asia and Africa. Cool in central Europe and much of N America (but warm in NW). Arctic warm.

As usual here, I will compare the GISS and previous TempLS plots below the jump. I have now switched from my spherical harmonics based TempLS plots to LOESS based, which gives a more similar resolution.

Thursday, June 6, 2019

May global surface TempLS down 0.177°C from April.

The TempLS mesh anomaly (1961-90 base) was 0.703°C in May vs 0.88°C in April. This was a lot more than the drop (0.086) in the NCEP/NCAR reanalysis base index, and takes it back to January's temperature.

The drop was mainly in land temperatures, and central N America was quite cold, as was central Europe, and less so, China/Mongolia. Africa and Brazil were warm, and also the Arctic. SST has been drifting down slowly, but is still warm.

Here is the temperature map, using now a LOESS-based map of anomalies.

And here is the map of stations reporting:

Monday, June 3, 2019

May NCEP/NCAR global surface anomaly down 0.086°C from April

The Moyhu NCEP/NCAR index fell from 0.496°C in April to 0.41°C in May, on a 1994-2013 anomaly base. It was a substantial fall, exactly the same as from March to April, and yet it still leaves May as warmer than February, or any earlier month back to May 2017.

The Arctic was quite warm, including Greenland and west. There was a band of cool from E Canada to California. Also W Europe was quite cool, and also China through to central Asia. Antarctica was mixed.

The BoM ENSO Outlook is back to Watch - still an enhanced likelihood.

Friday, May 31, 2019

GHCN V4 Monthly temperature data displayed on an active sphere

This is another in my series on a close examination of the GHCN V4 database of station monthly average temperatures. The previous post described a system for displaying time series graphs and numerical data for named stations (with search). This post draws attention to an older system, where a triangle mesh is shown for each month, with the reporting stations as nodes, and shaded according to temperature anomaly (TAVG) for that month. The shading is such that it is correct for each node (with complications where some are nearly coincident), so it reveals a lot about the low-level structure of the data. There is enough homogeneity that the overall pattern shows through. I had thought about replacing it with a LOESS-based map, which would actually be a lot easier, and is better for the overall picture. I'll probably put that up at some stage, but I think the low-level info of this plot justifies its place.

The page for this facility (link on right - "WebGL Map of...") is ancient, more or less for the duration of GHCN V3. The original post from 2012 is here. The plot is now based on the Moyhu WebGL facility. That means that as well as being able to drag the plot around as a trackball, you can click on the small map for quick centering. The overlay of mesh and points can be toggled off and on with the checkboxes beside the small map. The table beside that lets you choose a month. With the top bar you choose the decade, next the year, and then the month. When the right date shows below, click "Fetch and Show". When the right data is showing, the date will be on top left. There are now zoom buttons, as well as zooming by dragging vertically on the globe with right button down. The Subset button lets you speed up response by choosing a subset, but I don't think you'll need that here. You may notice that I have shifted from Rainbow to GISS-like colors and ranges.

When you click on the sphere, it shows the code name, name and anomaly of the nearest station. The anomalies are as calculated by TempLS V4 mesh, and the base period is 1961-90. The color scale is centered to the mean for the month, so it varies as you go back in time.

I'm using unadjusted data, and there are a number of stations that stick out as different. I actually developed the viewer to see what is going on there, and I'll write about that next.

Wednesday, May 29, 2019

Viewer for GHCN V4, with search and active graph

I have made an active viewer for the GHCN V4 database of station monthly average temperatures, as used in TempLS and other indices. I was prompted to do so after adapting the GHCN webGL page to use GHCN V4. I think the results are generally good, and I'll write more about that soon. But there are a few stations that stand out as having very odd anomalies. The viewer shows adjusted and unadjusted temperature data for each of the 27380 stations in current GHCN V4. I'll link the page more permanently through the data portal page on the right panel.

The main viewing mechanism is an active graph, from which you can download station data. The pink rectangle below the graph (below) has a search box. You can put in a part of the station name. The GHCN 4 convention is that stations are in all caps, with underscores replacing spaces. The search will return up to 20 names; each line giving the full name (to 18 chars), the GHCNv4 station number, and the approximate year range of the data. To the left of each line is a radio button. Clicking this will download the data for that station and show an active graph.

Below the graph is a table of checkboxes with names of months. There is a column of unadjusted, and one of adjusted. Check the boxes to determine which graphs are shown. There is a legend box below the graph, which shows the selected names and color. You can drag this onto the plot to act as a legend.

The graph is active (like this one). That means that you can drag with mouse the location of the plot. If you swipe horizontally below the x-axis it will shrink or expand the x-range, and similarly behind the y-axis. So you can adjust the plot to see best what you want.
Update If you don't like a color you can change it. Just click the button of that color in the legend to randomly choose from a palette of 64.

The Data button above the checkboxes creates a new window with a printout of the data that is showing. You can save this, or copy and paste.

I should note that I have excluded any data with a quality flag; sometimes this is quite a lot. Anyway, here is the plot:

Tuesday, May 21, 2019

Higher resolution graphics for monthly surface anomaly.

Update: Following suggestions in comments, I have made a new tableau in which the new LOESS plots are compared not only with MERRA but with the ECMWF reanalysis ERA5. I think that where LOESS and MERRA disagree, ERA5 is often in the direction of LOESS. But you be the judge. There are the same 12 months of 2016 in the set.

I have for several years posted a monthly plot of global surface temperature anomalies, calculated by TempLS, and followed up a week or so later comparing it with the corresponding plot from GISS NASA. I use the same style. I chose GISS because it is, IMO, the best of the major organisation plots.

The TempLS plot uses spherical harmonics (SH) to give a smoothed stable plot to anomalies defined at scattered points. It has worked well, but the smoothing restricts to contours of a rather limited curvature, and of course restricts resolution. There is also an issue, noted in my previous post, that the spatial distribution of the anomalies calculated on a fixed interval, like GISS 1951-80, does not quite correspond to mine, which are derived on another basis (least squares fitting) and adjusted to the time interval by adding constants that make the means match. Ensuring that correspondence has some minor effects.

I have a new method of averaging using LOESS on a regular array of nodes derived from a regular icosahedron. This also makes possible a plot of the anomalies based on this local averaging, which has apparently higher resolution than the SH based plot. The problem with SH is that resolution is limited by the areas of weakest coverage, while LOESS is truly local, and gives good resolution where the coverage is good, which for GHCN V4 is most of the world. I want to show that the apparent higher resolution is real by comparing with NASA's MERRA reanalysis, which is posted to approx 1/2° resolution.

One might ask - why not just use MERRA? I don't present MERRA as the gold standard of accuracy. But in any case, it isn't available with the immediacy of surface data. I got the MERRA data from KNMI Explorer, where it is current up to end 2016. So my comparisons are for that period.

State of the art

The traditional style of mapping anomalies is to just show the lat/lon grid used colored by temperature. Here you see HADCRUT and NOAA:

In fact HADCRUT has been smoothed and is actually not such a bad plot, except for the white blanks. The NOAA plot (actually land only, from here) is mainly cited by contrarians to claim poor coverage. NOAA assists them here by posting a map from early in the month (Jan 11 here) and then not updating as more data comes in. But both have a similar failing. They treat the arbitrary grid segments as telling something about the data. But they would tell something different if a different grid size was chosen. You don't suddenly become ignorant about temperature because an arbitrary grid boundary has been reached. NOAA seems to realise this, because their corresponding land/ocean plot looks like this:

They have done a lot of infilling. I've never seen an explanation of the basis used, but here I mainly want to focus on resolution, which is still poor. The corresponding GISS plots come with two stated interpolation lengths:

On the left is 250 km, on the right 1200 km. The left should have higher resolution, but is again marred by grey blocks where it is deemed that information ceases at a cell boundary. The right fills more area with more smoothing, and is the one I use, though it still has some grey.

The new method, compared

Details of the new method are:
  • The anomaly data is first converted to the 1951-80 base, using the distribution described here.
  • An array of 23042 regularly spaced nodes is created on the Earth's surface, based on dividing the regular icosahedron, with a transformation on each triangle to preserve equal area triangles (described here).
  • A first order LOESS (local regression) based on nearest 20 stations is used to interpolate each month's anomaly data onto these nodes (described here)
  • Using the regular icosahedral grid, those data are then interpolated onto the image of a 1° lat/lon array, projected onto the sphere
  • The data thus interpolated is then used to make a 2D lat/lon plot, using, as I usually do, the GISS temperature levels and colors.
I do this for each month of 2016, and compare with
  • The two graphs I usually post - GISS 1200 km and TempLS Spherical Harmonics, both 1951-80 base, with TempLS converted regionally
  • The new method is bottom left, with the MERRA plot (data from KNMI) on its right, also converted regionally from 1981-2010 to 1951-80.
With the buttons at the bottom, you can cycle through the months of 2016. December displays first.
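As a rough illustration of the first order LOESS step listed above, here is a minimal Python sketch of a local linear regression at a single node, using the nearest 20 stations. The tricube weighting, the tangent plane coordinates and all function names are assumptions made for illustration; the post does not specify those details, and this is not the TempLS code.

```python
import numpy as np

def latlon_to_xyz(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to unit vectors on the sphere."""
    lat, lon = np.radians(lat_deg), np.radians(lon_deg)
    return np.stack([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)], axis=-1)

def tangent_basis(node):
    """Two orthonormal vectors spanning the tangent plane at a unit vector."""
    # Pick a helper axis that is not close to parallel with the node.
    helper = np.array([0.0, 0.0, 1.0]) if abs(node[2]) < 0.9 else np.array([1.0, 0.0, 0.0])
    e1 = np.cross(helper, node)
    e1 /= np.linalg.norm(e1)
    e2 = np.cross(node, e1)
    return e1, e2

def loess_at_node(node, stations, anomalies, k=20):
    """Local linear (first order) weighted fit at one node; returns the fitted value there.

    node: (3,) unit vector; stations: (n, 3) unit vectors; anomalies: (n,).
    Tricube weights are an assumed choice, scaled by the k-th nearest distance.
    """
    d = np.linalg.norm(stations - node, axis=1)      # chord distances
    nearest = np.argsort(d)[:k]
    dk = d[nearest]
    h = max(dk.max(), 1e-12)
    w = (1.0 - (dk / h) ** 3) ** 3                   # tricube weights in [0, 1]
    e1, e2 = tangent_basis(node)
    offsets = stations[nearest] - node
    # Design matrix: constant plus two tangent plane coordinates.
    X = np.column_stack([np.ones(len(nearest)), offsets @ e1, offsets @ e2])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], anomalies[nearest] * sw, rcond=None)
    return beta[0]                                   # fitted value at zero offset
```

Calling `loess_at_node` once per node and per month would give the node values that are then interpolated onto the lat/lon array. A field that is exactly linear in the tangent plane coordinates is reproduced exactly at the node, which is the defining property of a first order fit.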

The LOESS bottom left plot looks to have more detail than the top plots. See, for example, the US plot. To test whether the appearance of resolution is real, compare with the adjacent MERRA plot. I'll mention these points in the December plot:
  • First the weakest point - LOESS shows a big cold spot in Alaska, which is not really in either MERRA or GISS. However, if you look at the 250km GISS plot above, there is a cold spot there too.
  • For Canada and ConUS, the extra detail of LOESS does seem to align with MERRA
  • The extra detail of the hot spot over NW China seems to line up
  • MERRA has a lot of detail about Antarctica, where GISS is very patchy. TempLS SH is sounder there, but LOESS moves a little more in the direction of MERRA
  • LOESS, like GISS and MERRA, has an El Nino jet, where SH is rather wavery.
You can check for other points here, and in the other months of 2016. But I think in almost all cases, where LOESS has extra detail, it will correspond to something in MERRA.

I plan to use the new graphics scheme for the WebGL page for past monthly anomalies. That page is currently hard to maintain because GHCN V4 keeps introducing new stations, which means the data does not match the stored meshes. LOESS will fix that, as well as being better.

Update - including ERA 5

In comments, Bryan Oz4caster and Zeke Hausfather recommended using the ECMWF reanalysis ERA5, perhaps in preference to MERRA. I certainly think it is a good idea to have another reanalysis, so that where TempLS disagrees with MERRA, one can judge which, if any, is wrong. So I downloaded ERA5 from KNMI, and made another tableau below. This time the two reanalyses are on the right, and I have duplicated TempLS LOESS on the left. For ERA5 I have used 0.5° grid spacing, similar to MERRA.

My general impression is that MERRA sometimes seems overly dramatic, and ERA5 is more in line with LOESS on such occasions. Looking at December, the cool in western Canada is similar between LOESS and ERA5. The blobs of warmth around Mongolia in ERA5 and LOESS are similar; MERRA a little different. But there still isn't any support for the cold spot in Alaska.

Monday, May 20, 2019

Changing anomaly base in spatial plotting.

This post tries to give a more exact treatment to comparing global temperature graphs with different anomaly bases. It is preparatory to one where I try out a new style of graphics which I think has better resolution than the spherical harmonics (SH) based graphs that I use now, and compare with GISS. I want to compare with high resolution graphics from reanalysis, but first there is something I need to improve.

Temperature anomalies are created by subtracting from each reading of temperature an estimate of the normal, or expected value, for that time and location. Various estimates can be used, as long as they are consistent. Often used is a temperature over a three-decade interval. I use a least squares method, but normalise it to zero average over the 1961-90 period. That last step, though, is an adjustment at global level.

Comparing anomalies with different base periods is normally done by aligning the global average over those periods. This works well at that level. But it does not align the anomalies in different spatial regions, where the local averages may have changed differently in those periods. In comparing GISS, with its 1951-80 basis, with TempLS, which actually has a least squares basis, I have just added an adjustment constant for each month. I set out the process here, giving a table of changes to make for all the popular conversions. But more is needed for spatial alignment, although the actual effect on, say, a monthly anomaly plot is small.
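The global-level alignment described here amounts to one constant per calendar month. A minimal Python sketch, assuming the global mean anomalies are held in a years by 12 array (the function name and data layout are hypothetical, chosen only for illustration):

```python
import numpy as np

def rebase_global(anom, years, base_start, base_end):
    """Shift global mean anomalies so each calendar month averages zero over a base period.

    anom:  (n_years, 12) array of global mean anomalies on some other base
    years: (n_years,) array of calendar years matching the rows of anom
    Returns the rebased array and the 12 monthly constants that were subtracted.
    """
    in_base = (years >= base_start) & (years <= base_end)
    offsets = np.nanmean(anom[in_base], axis=0)   # one constant per calendar month
    return anom - offsets, offsets
```

The same 12 constants are applied everywhere on the globe, which is why this aligns the global averages but not the regional patterns.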

Practicalities - spherical harmonics again.

However, it isn't obvious how to do that. The problem with three decade bases is that not all stations have data in those times. That is why I use least squares. To make a comparison between two periods, you would, in principle, have to limit to stations where you can calculate averages in both periods. Even if that were possible, it would be a pity to have to in effect redo two temperature anomaly constructions just to change base.

But the change that matters is what shows on a spatial plot, and rather than do every station, it is sufficient to approximate the difference with spherical harmonics. This is the method I normally use to show TempLS monthly maps. Since the issue is what the regional temperature difference actually is, rather than the individual measurement methods, it is sufficient to work out coefficients by just one method. I naturally use TempLS.

Method and numbers

I first calculate a spherical harmonics fit for the TempLS anomalies for each month since 1900. It is like a Fourier fit, but using regression. The integration method is the mesh method. I specify order 10, which actually gives 121 functions in the SH basis (2n+1 functions for each order n from 0 to 10). For each order n, one of the functions is just the nth power of cos latitude multiplied by sine of n times the longitude, so this gives a measure of the spatial frequency. More details of spherical harmonics are here.

Once the SH approximation for each month is made, there is no further reference to which stations are reporting. Since the operations are linear, for each tri-decade we can just average the 121 coefficients, and then difference to give the change between bases. Then to regenerate the temperature map, it is just necessary to compute the values of the harmonics at whatever points are being used to make the map, for example a lat/lon grid, and then combine these using the difference coefficients.
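Because everything after the fit is linear, the base change reduces to a small amount of matrix arithmetic. The following Python sketch assumes the coefficients for one calendar month are stored as a years by 121 array, and that the harmonics have already been evaluated at the map points; the names and array layout are hypothetical, and the evaluation of the harmonics themselves is not shown.

```python
import numpy as np

ORDER = 10
N_COEF = (ORDER + 1) ** 2   # 121 functions in the SH basis

def base_change_map(coefs, years, old_base, new_base, basis_at_points):
    """Map of the difference between two base-period means, via SH coefficients.

    coefs:           (n_years, N_COEF) fitted coefficients for one calendar month
    years:           (n_years,) calendar years matching the rows of coefs
    old_base/new_base: (first_year, last_year) tuples, inclusive
    basis_at_points: (n_points, N_COEF) harmonics evaluated at the map points
    Returns an (n_points,) field; subtracting it from anomalies on the old
    base gives anomalies on the new base.
    """
    def mean_coefs(base):
        sel = (years >= base[0]) & (years <= base[1])
        return coefs[sel].mean(axis=0)            # average the 121 coefficients

    diff = mean_coefs(new_base) - mean_coefs(old_base)
    return basis_at_points @ diff                 # regenerate the map from the difference
```

Averaging the coefficients and then evaluating gives the same field as evaluating every month's map and then averaging, so only one evaluation of the harmonics at the map points is needed.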


First, here is a map of the TempLS values for mean April between 1961 and 1990, plotted with negative sign. It would be similar for other months, since it mainly represents changes over a thirty year period.

The mean should be approximately zero, since TempLS was set to have mean zero in this range. It isn't quite zero, because the mean represents the SH-enhanced mesh integral. It may not be obvious from the plot (in GISS intervals and colors) that the mean is zero, but the area of the very warm part is small relative to the large areas of S Hemisphere cool.

This plot gives an idea of the magnitude of differences to expect. They can be large. Now I'll show the practical effect, for the most recent month, April 2019.

The top two maps are those I displayed. Left is the SH representation of normal least squares TempLS, and right is the GISS plot, which is based on actual means of months in 1951-80. So below it I have put the plot of TempLS calculated on this basis, as described above. Bottom left is the map of expected corrections in going from TempLS to a true 1951-80 base, less the spatially constant offset that was used.

There isn't that much difference between top left and bottom right, or GISS for that matter. But the bottom right is closer to GISS in some respects, and that can be explained by the bottom left. The cool spot above NW Canada is much reduced at bottom right, in better agreement with GISS, and that is the effect of the warming correction shown bottom left. The cold of the W Sahara almost disappears, as it does in GISS. Bottom left shows the warm correction there. Same in Saudi Arabia. Finally the hot spot in NE Siberia is made even hotter, in broad agreement with GISS, although the shape there is slightly different. Generally the ocean corrections are too small to notice.

I'll use the corrected versions in new monthly reports. But mainly I am setting this out because I am planning to use what I think will be a better resolution map. I will show that, in the next post, in comparison with reanalysis, and for that I need the extra accuracy.