I've recently posted a map (one of a series) of monthly temperature anomalies for individual stations. I've been thinking about what kind of anomaly is really appropriate here.
Some skeptics don't like anomalies, and say only real temperatures should be plotted. But then the plot is dominated by the variations in altitude and latitude. In January it's cold in Moscow and warm in Booligal. We knew that. If you hear that it was 15°C in Rome last month, you'll ask "but what is it normally?".
You need the anomaly, because that's the real information in the month's readings. And a plot should show that. The anomaly is the difference between what is observed and what you expect.
But what expectation? More below the jump.
Indices like GISS and HADCrut use a thirty-year period to calculate the averages on which anomalies are based. That's the expectation, so deviation from it includes global warming. It has to be tied to a fixed period.
I used that for the map anomalies, with 1961-1990 as the base period. There is a practical difficulty: a station with October 2012 readings and a substantial history may still not have enough data in 1961-1990. So a reasonable thing to do is to use other information and get a regression estimate for 1975, near the middle of that period. That will avoid bias from a warming trend.
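As a rough illustration of the regression idea, here is a minimal sketch for one station and one calendar month. It assumes a plain least-squares line through whatever years are available, evaluated at 1975. The function name `base_value_1975` and the sample series are made up for the example, and this is not necessarily how the map values were actually computed.

```python
import numpy as np

def base_value_1975(years, temps, ref_year=1975.0):
    """Estimate the expected temperature for one calendar month at ref_year.

    Assumption: an ordinary least-squares line is fitted to the available
    years and evaluated at ref_year. Missing readings are given as NaN.
    """
    years = np.asarray(years, dtype=float)
    temps = np.asarray(temps, dtype=float)
    ok = ~np.isnan(temps)  # skip missing readings
    # Centre on ref_year so the intercept is the fitted value at ref_year.
    slope, intercept = np.polyfit(years[ok] - ref_year, temps[ok], 1)
    return intercept

# Made-up October series for a hypothetical station with no data before 1995.
years = np.arange(1995, 2013)
temps = 10.0 + 0.02 * (years - 1975)   # exactly linear, 0.02 °C per year

base = base_value_1975(years, temps)
anomaly_2012 = temps[-1] - base        # observed minus expected
print(base, anomaly_2012)
```

Even though this station has no readings in 1961-1990, the fitted line gives a base value for 1975, and the 2012 anomaly then includes the trend since then.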
What do we really want?
The idea was that the anomaly would include global warming since 1975. And indeed, recent anomalies are mostly positive. However, this isn't obvious on the map, because my graphing scheme shifts the color map relative to the range. So because of warming, small positive anomalies are shown with bluish colors. That would be the same whatever base period was used.
Looking at a monthly map, global warming isn't news. Even relative warming like that in the Arctic isn't new. Seeing a reddish Arctic month after month may not be what we need. Because it's all pushed into the upper color range, there isn't much new information.
Consistency
One thing that I think is important in these plots is that you get an idea of spatial consistency. Where it's hot, most stations nearby are hot. The colors are fairly smooth. This is only true if the anomaly base is also consistent.
There is a station Nitchequon, in NW Quebec, which shows up with consistent low anomalies relative to neighbors. Otherwise Canada has mostly good consistency. I suspect the anomaly base is wrong. Nitchequon has a fairly long record, including quite a lot in 1961-90, but is missing many years from 1985 to 2005. Temperatures after the break are much lower than before. The adjusted version moves these later numbers way up. I'm using unadjusted GHCN. That's not so important in absolute terms, but, unadjusted, it does produce the marked dip in the plot.
Incidentally, I think the Nitchequon story does show how inhomogeneities can really stand out to be identified.
Expectation?
What does the expected value really mean? I could produce a value that allowed for ENSO, solar forcing etc. This might well be a lower variance estimate. But I think most users would expect to see those effects reflected in the anomalies, not removed from them. So there is a middle ground to be found.
My current thinking.
I think that I should plot anomalies relative to the current mean values (for each month) with adjustment for trend. That would be the expected value. It has the advantage that it would avoid issues with past jumps, as at Nitchequon. And it does show the information that is new with each month.
I think the best way to do it is with a weighted least squares fit to a linear model, as with TempLS. I'd fit a model for each station:
The L's are offsets constant for each month (m) ("monthly averages") and J is a linear progression over years (y). The weighting would be an exponential decay back in time, with a time constant of maybe thirty years. This would give higher weight to recent data. The anomaly would be the residual.
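To make the fit concrete, here is a minimal sketch of one way it could be set up. It assumes the model has the form T(m, y) ≈ L_m + J·(y − y_ref), with y_ref taken as the latest year of data so that the L's are "current" monthly means, and weights exp(−(t_latest − t)/τ) with τ = 30 years. The function name `fit_station`, the choice of y_ref, and the synthetic data are assumptions for illustration only, not the TempLS code.

```python
import numpy as np

def fit_station(years, months, temps, tau=30.0):
    """Weighted least-squares fit of T ~ L[m] + J * (y - y_ref) for one station.

    years, months (1..12) and temps are 1-D arrays of equal length; missing
    temperatures are NaN. Returns (L, J, anomaly), where anomaly is the
    residual, observed minus fitted. y_ref is assumed to be the latest year.
    """
    years = np.asarray(years, dtype=float)
    months = np.asarray(months, dtype=int)
    temps = np.asarray(temps, dtype=float)
    ok = ~np.isnan(temps)

    # Decimal time of each reading; weights decay exponentially into the past.
    t = years + (months - 0.5) / 12.0
    w = np.where(ok, np.exp(-(t[ok].max() - t) / tau), 0.0)

    # Design matrix: 12 month-indicator columns plus one trend column.
    X = np.zeros((len(temps), 13))
    X[np.arange(len(temps)), months - 1] = 1.0
    X[:, 12] = years - years[ok].max()

    # Weighted least squares via ordinary least squares on sqrt(w)-scaled rows.
    sw = np.sqrt(w)
    y = np.where(ok, temps, 0.0)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)

    L, J = coef[:12], coef[12]
    anomaly = np.where(ok, temps - X @ coef, np.nan)
    return L, J, anomaly

# Synthetic station: a seasonal cycle plus a 0.02 °C/year trend, one gap.
yrs = np.repeat(np.arange(1950, 2013), 12)
mons = np.tile(np.arange(1, 13), 2013 - 1950)
clim = 10.0 + 8.0 * np.cos(2 * np.pi * (np.arange(12) - 6) / 12)
obs = clim[mons - 1] + 0.02 * (yrs - 2012)
obs[5] = np.nan  # a missing reading

L, J, anom = fit_station(yrs, mons, obs)
print(J, np.nanmax(np.abs(anom)))
```

Because the weights fall off with a thirty-year time constant, a jump several decades back has relatively little influence on the fitted current means, which is the property wanted for a case like Nitchequon.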
I'll think about it a bit more, but I'll probably redo the data for the previous post.
Update - it has now been done as described here.