## Monday, June 30, 2014

### Infilling, climatology and anomalies

There's been a lot of grumbling lately about USHCN. For some reason Steven Goddard has gone viral, and has appeared in some right-wing media. Judith Curry gives a summary, with links.

WUWT has had a varied role. Initially Anthony Watts wrote with Zeke some good posts on some flaws in SG's methods. Zeke has continued at Lucia's. I chipped in too.

Then it took an odd turn. Anthony got invested in disputing SG's claims, only moderately exaggerated, about the number of USHCN stations that actually reported each month. When he found that there were quite a lot, "zombie" stations became the enemy. And with that, infilling.

The background is that USHCN tries to do something that no-one else does - to give an average (for the US) in absolute °F (absolute = not an anomaly). That can be done, but needs care (unlike here). USHCN does it by ensuring that every month has an entry for each of its 1218 stations, which ideally never change. But in fact some do become defunct - by now up to 20-30% of them. So for those USHCN just estimates a value from neighbouring stations, and proceeds.

So that is the latest villainy. I think USHCN should use anomalies, and I suspect they in effect do, and just convert back. But there is nothing wrong with the infilling method. I've been arguing in many forums that the US average, for a month say, is a spatial integral, and they are doing numerical integration. Numerical integration formulae are usually based on integrating an interpolation formula. If you first interpolate extra points using that formula, it makes no difference. Any other good formula will also do.

I don't have many wins. So I thought I would give a simple and fairly familiar example which would show the roles of averaging, climatology, infilling and anomalies. It's the task of calculating an average for one year for one station. Since it's been in the news, and seems to be generally a good station, I chose Luling, Texas.

Update. From comments, I see that I should emphasise that I'm not, in this example, trying to calculate the temperature of the US, or any kind of trend. The issue is very simple. Given the temperature record of this one place, and the 2005 monthly averages (with one missing), what can be said about the annual average for 2005 for that place?

Update: I have a new post with a graphics version here

To simplify, I'll round numbers, assume months of equal length. All data is raw, and in °C. Climatology for each month will be simply the average of all instances, and the anomaly is just the difference from that.

So here is the basic data for 2005, where all months are available:

| | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec | Ann |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2005 | 12.8 | 12.8 | 15.2 | 18.7 | 23.2 | 27.6 | 28.8 | 28.9 | 28.2 | 20.2 | 17.1 | 10.3 | 20.3 |
| Climatology | 10.3 | 12.1 | 16.3 | 20.3 | 24.2 | 27.7 | 29 | 29.1 | 26.2 | 21.2 | 15.4 | 11.2 | 20.2 |
| 2005 Anomaly | 2.4 | 0.7 | -1.1 | -1.6 | -0.9 | -0.1 | -0.2 | -0.2 | 2.1 | -1 | 1.6 | -0.9 | 0.1 |
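The arithmetic here is easy to reproduce. A minimal Python sketch, using the rounded values from the table (my own illustration, not USHCN code; the last digit can differ slightly from the unrounded originals):

```python
# Rounded Luling, TX monthly means for the year and the long-term monthly
# climatology, in degC, copied from the table above.
t2005 = [12.8, 12.8, 15.2, 18.7, 23.2, 27.6, 28.8, 28.9, 28.2, 20.2, 17.1, 10.3]
clim  = [10.3, 12.1, 16.3, 20.3, 24.2, 27.7, 29.0, 29.1, 26.2, 21.2, 15.4, 11.2]

def mean(xs):
    return sum(xs) / len(xs)

# anomaly = observation minus that month's climatology
anom = [t - c for t, c in zip(t2005, clim)]

print(f"annual mean:      {mean(t2005):.2f}")  # ~20.32
print(f"climatology mean: {mean(clim):.2f}")   # ~20.25
print(f"anomaly mean:     {mean(anom):.2f}")   # ~0.07
```

The annual average is just the climatology average plus the anomaly average, which is the decomposition the rest of the post relies on.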

Now suppose we decide too many days are missing in February, and it has to be dropped. And suppose, as WUWT seems to want, we do just that:

| | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec | Ann |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2005 | 12.8 | NA | 15.2 | 18.7 | 23.2 | 27.6 | 28.8 | 28.9 | 28.2 | 20.2 | 17.1 | 10.3 | 21 |
| Climatology | 10.3 | NA | 16.3 | 20.3 | 24.2 | 27.7 | 29 | 29.1 | 26.2 | 21.2 | 15.4 | 11.2 | 21 |
| 2005 Anomaly | 2.4 | NA | -1.1 | -1.6 | -0.9 | -0.1 | -0.2 | -0.2 | 2.1 | -1 | 1.6 | -0.9 | 0 |

So the annual average has risen from 20.3°C to 21°C. That's a lot to follow from removing a month that wasn't unusually warm (for Feb). But if you look at the next line, most of that is accounted for by the change in climatology. Its average has risen by the same amount.

Let's note again that the anomaly average has changed only a small amount, from 0.1 to 0. That reflects that the omitted month was warmer than normal, but is a proportionate response. That's the benefit of anomalies. There is no climatology to make a spurious signal.
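Both effects - the 0.7°C jump in the absolute and climatology averages, and the stability of the anomaly average - can be checked directly. A sketch, again with the rounded table values (my own illustration, not USHCN code):

```python
t2005 = [12.8, 12.8, 15.2, 18.7, 23.2, 27.6, 28.8, 28.9, 28.2, 20.2, 17.1, 10.3]
clim  = [10.3, 12.1, 16.3, 20.3, 24.2, 27.7, 29.0, 29.1, 26.2, 21.2, 15.4, 11.2]
FEB = 1  # index of the month we drop

def mean(xs):
    return sum(xs) / len(xs)

# "just drop it": average the 11 remaining months
t11    = [t for i, t in enumerate(t2005) if i != FEB]
clim11 = [c for i, c in enumerate(clim) if i != FEB]
anom11 = [t - c for t, c in zip(t11, clim11)]

print(f"absolute:    {mean(t2005):.2f} -> {mean(t11):.2f}")    # rises by ~0.7
print(f"climatology: {mean(clim):.2f} -> {mean(clim11):.2f}")  # rises by the same ~0.7
print(f"anomaly:     {mean(anom11):.3f}")                      # stays near zero
```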

But we didn't want to change the annual climatology. That isn't supposed to change, at least not radically, from year to year.

Another way of seeing why just dropping is bad, which I find useful, is that you can always replace the NA with the Ann average figure. That can't change the average. So this is exactly the same:

| | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec | Ann |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2005 | 12.8 | 21 | 15.2 | 18.7 | 23.2 | 27.6 | 28.8 | 28.9 | 28.2 | 20.2 | 17.1 | 10.3 | 21 |
| Climatology | 10.3 | 21 | 16.3 | 20.3 | 24.2 | 27.7 | 29 | 29.1 | 26.2 | 21.2 | 15.4 | 11.2 | 21 |
| 2005 Anomaly | 2.4 | 0 | -1.1 | -1.6 | -0.9 | -0.1 | -0.2 | -0.2 | 2.1 | -1 | 1.6 | -0.9 | 0 |

Infilling Feb with 21°C is obviously bad. And it shows up in the climatology. But that is what just dropping does.

Now suppose we infill, rather crudely, replacing Feb with the average of Jan and Mar. You know, fabricating data. We get:

| | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec | Ann |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2005 | 12.8 | 14 | 15.2 | 18.7 | 23.2 | 27.6 | 28.8 | 28.9 | 28.2 | 20.2 | 17.1 | 10.3 | 20.4 |
| Climatology | 10.3 | 13.3 | 16.3 | 20.3 | 24.2 | 27.7 | 29 | 29.1 | 26.2 | 21.2 | 15.4 | 11.2 | 20.3 |
| 2005 Anomaly | 2.4 | 0.7 | -1.1 | -1.6 | -0.9 | -0.1 | -0.2 | -0.2 | 2.1 | -1 | 1.6 | -0.9 | 0.1 |

It's not a particularly good infill. But it is effective. The annual average has risen from 20.3 to 20.4. Much better than 21. And the climatology has changed, by the same small amount.
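The crude infill is a one-liner. A sketch with Feb marked missing (my own illustration, not USHCN code):

```python
t    = [12.8, None, 15.2, 18.7, 23.2, 27.6, 28.8, 28.9, 28.2, 20.2, 17.1, 10.3]
clim = [10.3, 12.1, 16.3, 20.3, 24.2, 27.7, 29.0, 29.1, 26.2, 21.2, 15.4, 11.2]

# crude infill: the missing Feb becomes the mean of its neighbours, Jan and Mar
t[1] = (t[0] + t[2]) / 2               # (12.8 + 15.2) / 2 = 14.0
clim_inf = clim[:]
clim_inf[1] = (clim[0] + clim[2]) / 2  # (10.3 + 16.3) / 2 = 13.3

annual = sum(t) / 12
print(f"annual mean with infill: {annual:.2f}")  # ~20.42, vs 21.0 from just dropping
```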

This is basically how USHCN could handle the loss of a month without losing anomalies. In fact, they would take steps to adjust for the known climatology error, to get a better infill. But there's an even simpler way: just infill the anomalies, and add to the climatology:

| | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec | Ann |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2005 | 12.8 | 12.8 | 15.2 | 18.7 | 23.2 | 27.6 | 28.8 | 28.9 | 28.2 | 20.2 | 17.1 | 10.3 | 20.3 |
| Climatology | 10.3 | 12.1 | 16.3 | 20.3 | 24.2 | 27.7 | 29 | 29.1 | 26.2 | 21.2 | 15.4 | 11.2 | 20.2 |
| 2005 Anomaly | 2.4 | 0.7 | -1.1 | -1.6 | -0.9 | -0.1 | -0.2 | -0.2 | 2.1 | -1 | 1.6 | -0.9 | 0.1 |

That actually worked artificially well, because the Feb anomaly infill happened to be almost exact. But if you really really don't like infilling, setting the Feb anomaly to zero would do nearly as well. Or to the anomaly average without Feb.

| | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec | Ann |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2005 | 12.8 | 12.1 | 15.2 | 18.7 | 23.2 | 27.6 | 28.8 | 28.9 | 28.2 | 20.2 | 17.1 | 10.3 | 20.2 |
| Climatology | 10.3 | 12.1 | 16.3 | 20.3 | 24.2 | 27.7 | 29 | 29.1 | 26.2 | 21.2 | 15.4 | 11.2 | 20.2 |
| 2005 Anomaly | 2.4 | 0.0 | -1.1 | -1.6 | -0.9 | -0.1 | -0.2 | -0.2 | 2.1 | -1 | 1.6 | -0.9 | 0.0 |
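Working in anomaly space, even the zero-anomaly option is nearly harmless. A sketch of that version (my own illustration; the anomaly values are the rounded table figures):

```python
clim = [10.3, 12.1, 16.3, 20.3, 24.2, 27.7, 29.0, 29.1, 26.2, 21.2, 15.4, 11.2]
anom = [2.4, None, -1.1, -1.6, -0.9, -0.1, -0.2, -0.2, 2.1, -1.0, 1.6, -0.9]

# set the missing Feb anomaly to zero, then convert back to absolute temps
anom[1] = 0.0
t = [c + a for c, a in zip(clim, anom)]  # Feb becomes 12.1, its climatology value

annual = sum(t) / 12
print(f"annual mean: {annual:.2f}")  # ~20.26, very close to the full-data ~20.32
```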

Well, this is an analogy of how USHCN can average stations over a month. But one last thing - we don't even have to average the climatologies each time. We can just average the anomalies and add to the annual climatology.

So the moral is
• Infilling didn't hurt.
• What did hurt was omitting the climatology part of Feb. That is because Feb is known to be cold. Just omitting Feb is bad. Infilling absolute temp gave a reasonable treatment of the climatology.
• But dealing with anomalies alone is even better.
• And not "fabricating data" is by far the worst.
Let me put it all yet another way. We're saying we don't know about Feb, because maybe half the days have no reading. But what don't we know?

"Throw it out" means we don't know anything about Feb. It could have been 10°C, 20 or 30. But we do know more. It was winter. In fact, we know the average temp. And our estimate should at least reflect that knowledge. What we don't know is the anomaly. That's what we can throw out.

By "throw it out" again we're really replacing with an estimate, even if we don't say so. And the default is the average of remaining anomalies. But we could also just say zero, or the average of neighbours (infill). It won't matter much, as long as we don't throw out what we do know, which is the climatology.
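That implied-estimate claim is easy to verify: averaging the 11 remaining months is algebraically identical to infilling the 12th with the 11-month average and then averaging all 12. A sketch (my own illustration):

```python
# 11 real readings: Jan, then Mar..Dec, with Feb dropped
t11 = [12.8, 15.2, 18.7, 23.2, 27.6, 28.8, 28.9, 28.2, 20.2, 17.1, 10.3]

drop_avg = sum(t11) / 11
# "throwing out" Feb is the same as infilling it with the 11-month average
infill_avg = (sum(t11) + drop_avg) / 12
assert abs(drop_avg - infill_avg) < 1e-9

print(f"implied Feb estimate: {drop_avg:.1f}")  # ~21.0 degC - absurd for a Texas February
```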

### Weighting

OK, I might as well work in another hobbyhorse. I've said in comments that infilling is harmless when averaging. It actually just reweights. Suppose you have the Feb data. The average is just a weighted sum, each month weighted 1/12. What if you don't know Feb, and replace it with the average of Jan and March? The annual average is still a weighted sum of the data. The weights are:
1/8,0,1/8,1/12,1/12...
It's still an estimate based on the same data. Jan and Mar have been upweighted to cover the gap left by Feb.
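The reweighting equivalence can be demonstrated directly. A sketch comparing the infilled 1/12 average against the weighted sum of the 11 real readings (my own illustration):

```python
# 11 real readings: Jan, then Mar..Dec (Feb missing)
known = [12.8, 15.2, 18.7, 23.2, 27.6, 28.8, 28.9, 28.2, 20.2, 17.1, 10.3]

# way 1: infill Feb = (Jan + Mar)/2, then a plain 1/12 average of 12 numbers
feb = (known[0] + known[1]) / 2
avg_infill = (sum(known) + feb) / 12

# way 2: no infilled number at all - just reweight the 11 real readings:
# Jan and Mar get 1/12 + (1/2)(1/12) = 1/8 each, the rest keep 1/12
w = [1/8, 1/8] + [1/12] * 9
avg_weighted = sum(wi * ki for wi, ki in zip(w, known))

assert abs(avg_infill - avg_weighted) < 1e-9   # identical estimates
assert abs(sum(w) - 1.0) < 1e-12               # weights still sum to 1
print(f"average either way: {avg_infill:.2f}")
```

No new information enters: the "synthetic" Feb is a linear combination of existing data, so the final average is still just a (re)weighted sum of the 11 real readings.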

The dropping of a single February shouldn't knock out the anomaly value for the whole month, should it?

What would be the effect of a dropped February in a network of stations that do not have a dropped February?

The rest of the post is circular reasoning at its worst.

1. OK, how would you handle the dropped February? What annual average would you get?

There is no network of stations in this example.

2. Like I said, the example is the limitation. One station cannot have a 'climate' signal can it? Climate and climate measurement are for points in a field.

The circular logic is here: "Just infill the anomalies, and add to the climatology". This is clearly a wrong step if the objective is to infer climatology back again from the infilled values.

1. Shub,
No, the objective isn't to infer a climatology. It's just to get an average T for the year, given the months with one missing. It's a commonly performed task; just think of working it out for your home town. You seem to be thinking several steps ahead; my question is just, how would you do it? It could be anything; I've chosen a climate flavor because we are familiar with seasonal variation. The essence is that the 12 numbers have different expected values, which we can estimate independently.

2. Nick,
I'm really troubled by what you've done above. If you don't know what happened in February, you certainly aren't going to find out by any manipulation of the information you do have about the other months. I had thought the more effective method for inferring the February temperature was by finding nearby stations which not only have February but whose anomalies closely track what happens at this station. I think that this is what the Best people do.

You still get a fabricated number but at least it is fabricated from other February data rather than from what happened the rest of the year.

I suspect that what you've done is show by example what the infill idea is, without getting into the complexities of how they really do it. I certainly hope that the keepers of the flames aren't doing it this way. If they are, then their methods don't pass muster even with my meanest understanding.

3. jf,
"If you don't know what happened in February, you certainly aren't going to find out by any manipulation of the information you do have about the other months."

I've added a bit at the end of the post that may be relevant here. You actually know something about what happened in February. You know the climatology. In particular, you know it wasn't 21°C, which is the implied value if you just drop the reading from the average. What you don't know is the anomaly.

I've tried to restrict to just one site, so using other sites isn't an option. But if it was, it is also infilling. I'm actually thinking of the general problem as interpolation in space-time (post coming). Two space dimensions, one time. You can use information from readings adjacent in space or time. They are no different in principle.

Anyway, the point of a worked example is that you can test. Just dropping raised the temperature by 0.7°C. For an individual month to have that effect on an average is a lot.

The thing about infilling with adjacent months is that it respects the climatology, while not preserving it exactly. So it's not bad. But you can go further, and work out how much the average of the Jan and March climatology differs from Feb, and add that as an adjustment. Remember, I'm using this as an analogy for the spatial infilling, and with that, I think that adjusted approach is what USHCN does.

4. jf,
I've added a section in red at the end of the post. I'm working on some graphs to illustrate. It addresses the "fabrication" issue. When you infill and average, you aren't creating new data. You are just reweighting the data you have.

5. Nick,
I apologize for asserting fabrication - that's clearly not the problem. Maybe it's like a team with a couple of players who've been sent off (red carded, is it?). The guys who are left can infill as best they can, but it's not the same. I don't have any problem with the concept of infilling, but the method has to be the best possible. I do think fudging the annual average from the other 11 months is worse than using a hopefully narrowly defined neighborhood anomaly range.

Remember I'm naive, so if I've tripped into any terms of art, I intended the words in the naive sense.

6. Nick
I am not getting ahead. The average temperature of a point (station) is the result of forces that act over a region over long timescales (decades and more). What other meaning does the average temperature of one's hometown have? What you are asking for, in essence, is to isolate the mathematical derivation of annual averages from their implications. The average temperature of a point is, however, not a statistical property alone.

I have used SPSS functions to fill in values for the simple reason that some calculations do not tolerate missing data. This was purely a methodological step.

7. j ferguson
"I apologize for asserting fabrication - that's clearly not the problem."
That's OK. I need to deal with it though, because plenty of people do. And fabrication was once a perfectly respectable word.

Shub,
"What you are asking for in essence is to isolate the mathematical derivation of annual averages from their implications."

Yes, indeed. Getting the maths right is basic. "Just dropping" may have heartwarming implications, but gives the wrong answer.

It's true that you probably want to know about annual averages to discern a trend. But they have to be right. The cautionary tale is the Goddard spike.

"I have used SPSS functions to fill in values for the simple reason that some calculations do not tolerate missing data. This was purely a methodological step."
And so it is here. The USHCN method does not tolerate missing values, probably for similar reasons that your SPSS wouldn't.

I'm getting repetitive, but every averaging process has an implied infill value. You can either shut your eyes and accept the default, or work out the best estimate you can. WUWT is a big fan of shutting eyes.

I've put up a new post with the graphics version.

8. Nick, the average temperature of a year is not a mathematical/statistical problem. It may appear attractive to start with a single station and work upwards in the way you are attempting. But the real situation is already different. The points (stations) exist in un-isolated fashion a priori in a field. The property you are trying to measure acts on fields, not points. It acts over timescales of 30-40 years. These are your starting assumptions.

What you are doing is to show something self-evident: dropping a value when obtaining numerical averages of a cyclic process gives a significantly different answer. But what of a single deviation in an n=40 sequence or vector? The mean will be insensitive. It starts to matter when you put these synthetic numbers through your spatial interpolation grinder to obtain a field average and there are missing values *every year* of the 40. But by your method, at that time, you would have justified infilling prior to completion of the steps needed to produce a field average, and you end up with a large proportion of synthetic temperatures in your dataset.

9. Shub,
"dropping a value when obtaining numerical averages of a cyclic process gives a significantly different answer"

It isn't particular to cyclic processes. It happens whenever the expected value of the reading is significantly different from the expected average of the group. Because it is the latter that is effectively used when you drop data.

"The property you are trying to measure acts on fields not points."
The typical field task is to average over a heterogeneous spatial region at a point in time. It could be a grid cell, region or globe. A year is a field in time.

"you put these synthetic numbers through your spatial interpolation grinder"
There is no reason to use the generated numbers to further interpolate. As I hope I've made clearer in my next post, the "synthetic" numbers are just linear combinations of existing data, and when it goes into the average, which is also a linear combination, all that happens is that in the end, the weights are changed. And they change in a way to more exactly weight for the area.

It's an analogy, but the way it works in a 2D spatial integral is just the same.

3. If I understand what you're suggesting, it's essentially that trying to produce an annual average temperature (rather than an anomaly) is problematic because there is a large variation across the year, and hence ignoring data for a particular month (for example) could have a large influence on the average value. Hence some kind of infilling is required. Given that - I think - this is not as big an issue for anomalies as it is for the actual temperature, do you think NOAA has created a bit of a rod for their own back in trying to do this? I can see why it's useful, because many may not quite understand what an anomaly is, but trying to produce robust estimates of annual averages without some kind of infilling is probably impossible. Hence they open themselves up to these kinds of criticisms from those who don't really understand (or don't want to understand) the complications in trying to produce such results.

1. Yes, I do think NOAA shouldn't do it. But I think there is a lot of history there, and it seems that in the US, it can be hard to change things, especially with a Federal-State aspect. An incredible thing about the USHCN processing is that there is a difference in treatment between pre- and post-1931 data. Why? Because the process was centralised in 1931, and they have to respect the way that States handled the data previously.

One thing I'm trying to illustrate is that you can use anomalies without saying so, because you can put everything back together at the end. And I think that is probably what their method amounts to. But the key thing is that you have to preserve the climatology component of the calculation. That's why they keep the "zombie" stations. It's just an arithmetic device, and a harmless one.

2. It changes trends. It isn't harmless.

3. Thanks. I somewhat missed your point about the anomalies, but I see what you mean now. While I'm commenting, do you understand why in the WUWT post about July 1936 being the hottest month again, the data in the 2012 screenshot is shifted up compared to the data now? I can see how this could happen with anomalies if one were to change the baseline, but can't quite understand how it can happen when using actual temperatures.

4. On 1936 and all that, I don't know. There was a version change, from v2 to v2.5. That affected a lot of things in the way they used GHCN Daily, for example. I think it also affects the absolute level, in terms of the convention about assuming present is the standard and everything is adjusted relatively.

But some of the adjustments do vary too. TOBS is a big one and depends on what is known about changes to observing time. There's no set database for that, and I'm sure they keep finding new stuff; correspondence etc.

4. Nick,

Slightly OT but what the heck.

FYI, two papers that provide independent verification of global warming (both published in 2013);

Global warming in an independent record of the past 130 years

http://thisse.1x.biz/docs/Anderson_2013_GlobalWarmingInAnIndependent%20RecordOfThePast130Years_GRL.pdf

Independent confirmation of global land warming without the use of station temperatures

http://www.leif.org/EOS/grl50425-global-temps.pdf

1. Since 1945 is when man-made CO2 really started to climb, it appears the trend from 1910 to 1945 was higher than 1945 to 2013. So much for CO2 as a driver.

2. CO2 was increasing in the earlier time too. But the fact that other things might cause warming doesn't mean that CO2 can't. And while those other things may come and go, CO2 will just increase and increase.

5. One problem to begin with. There are only 50 USHCN stations with 360 non-Estimated values from 1961-1990. So your baseline is compromised. As is your anomaly.

Second. If you go ahead with the infilling, USHCN changes it pretty much every day. I think about 20% of the Final data changed from one day to the next when I looked.

Third. Infilling tends to reinforce trends. Just infilling Feb = (Jan + Mar)/2 would probably not reinforce trends because it is not based on anything but the station's Jan and Mar values. It wouldn't be based on all the other nearby stations and their trends.

1. Anomaly methods generally don't require 100% complete records during the baseline period to effectively remove the long-term seasonal climatology. Generally speaking, as long as you have more than 20 years of reported data in a 30-year baseline period you are probably fine.

Also, infilling has zero effect on the trend by definition, since trends are calculated over regions, and the act of turning single point temperatures into regional estimates requires some sort of spatial interpolation that mimics the effect of infilling.

2. USHCN infilling changes trends.

There is no 30 year climatology. That's a myth. 1979 to 1998 is up while 1998 to 2013 is down. The idea that there is a valid average temperature for infilling is a myth. Every year is different. Some warmer. Some colder. The trend can change from one year to the next.

The ONLY infilling I would trust is the months on either side of a missing month. Unfortunately USHCN has stations with long stretches of missing months so you can't do it.

Gridding barely changes the trend and it doesn't change the ratio of raw/tob/Final.

3. "So your baseline is compromised."
I think you're over complicating it. I haven't used or mentioned a baseline. I've separated into anomaly and climatology (expected value), but any of the various ways you could estimate that will give a big improvement.

My point here is simple. Just doing what seems to be the simple thing (leave it out) introduces a big error. It raises the annual estimate by about 0.7°C. By focussing on the way we treat the known variant part (seasonal climatology) and the new information (anomaly), we can get an order of magnitude improvement in that. Infilling is one way that gets that benefit.

But another thing I'm pushing is that the idea that "if you're uncertain, don't hazard an estimate" is an illusion. Whatever annual average you come up with implies an estimate for the missing, which here is Feb. That implied estimate is the average of the data that you have. And for Feb that's a bad estimate, because of seasonality. Any estimate based on your knowledge of climatology will be better. A local infill (winter) will respect that climatology and be better.

4. Climatology = baseline.

How many Februarys in the climatology for that one station? 30 if you are exceptionally lucky. More likely 29 or 28.

Which would be more accurate ... infilling with (Jan+Mar)/2 or the average Feb from 1961-1990?

And let's say it was 2012 ... why go any further than (Jan+Mar)/2?

But then what happens if Feb and Mar are missing?

Better to just put a * in the Annual Value and Feb. Or -9999.

You can still use the station for the regional monthly values for the other 11 months.

6. Well it's not like Steve Goddard has been outright lying about the USHCN network all that long ...

Oops, never mind.

1. https://sunshinehours.files.wordpress.com/2014/06/ushcn-v2-5-0-20140627-tmax-dec-gridded-1x1-1895-2013.png

2. Everett, I'm not sure what you mean there. But I do see a recurrence of the annoying idea that adjustment is an error and so here should contribute to the error bars.

Suppose you have a thermometer that works well enough, but the scale is wrong. So you recalibrate, and that can be done quite accurately. You still read from the scale, and make an adjustment. The size of the adjustment is not a measure of any kind of error in your result.

3. Nick,

My nickname is Junior as in EFS_Junior.

Before I banned myself from WTFUWT, everyone over there just called me Junior, like it actually meant anything.

I'm sort of doing a Poe of Goddard, which is sort of SOP for him, so in keeping with that meme ...

"But wait, see below, I'm now in the process of infilling dummy years to both datasets, then I can subtract monthly values, find the maximum difference, I'm quite sure that I can show a 90%+ return of at least one month being different by 1 degree F from all 1218 stations."

http://moyhu.blogspot.com/2014/06/ushcn-tempest-over-luling-texas-theres.html?showComment=1404077239604#c1018019251914211430

Well the results are in using these datasets (somewhat dated (21-May-2014); I will do the current (6/29/2014) USHCN from their ftp website shortly):

http://cdiac.ornl.gov/ftp/ushcn_v2.5_monthly/

OMFG, all 1218 USHCN stations have at least one monthly value (+ or - deviation from homo) of at least 1.2 degrees F.

1206 USHCN stations have at least one monthly value (+ or - deviation from homo) of at least 1.8 degrees F (1.0 degrees C).

That's like 99% (or 100% for Goddard's use of say 1.0 degree F) of the current USHCN network.

There's really no "random" about picking/comparing homo vs. raw. Any old USHCN station will do.

So, in fact, any lying denier can pick ANY USHCN station to complain about, like Goddard has been doing for the last 4 years.

Median one single month (-) deviation is -4.4 degrees F (homo below raw).

Median one single month (+) deviation is +2.7 degrees F (homo above raw).

So what are lying deniers like Goddard, et. al. likely to do?

What they are all doing right now, for whatever reasons those fake skeptics don't like homo temperatures and homo anomalies and homo trends and homo infilling and homo climatology and homo ...

:(