Following my report of my integration of the NCEP/NCAR V1 reanalysis during March, commenter Steven D perceptively noticed a discrepancy with the corresponding ESRL PSD numbers. I'll talk more about these, as I hadn't been systematically using them for checking, and they are useful. Anyway, the discrepancy wasn't huge (ESRL's 0.782°C against my 0.755°C), but it wasn't just integration error. It was natural to suspect a leap year problem.
I checked, and found that the program seemed to be doing everything right there. But then Steven D pointed out a more subtle issue: the anomaly calculation. And that was it. I calculate daily anomalies, actually for each grid point, but it would be the same issue for the spatial average. To get a climatology, I average each calendar day over the 20 years from 1994 to 2013. I chose that period because before 1994 there are, or were, gaps in the NCEP/NCAR sig995 temperature data. Since there are only 5 Feb 29's in that period, I omit them, so there is a 365-day climatology, which was fine in 2014 when I started. Then when I computed each daily average, I looked up the climatology for the anomaly calculation by day number in the year.
So this year, I was going to run out of days, but I expected to just use the adjacent day's climatology at the end. But that would put the effective leap day at Dec 31 instead of Feb 29, which creates the discrepancy, and it would affect the rest of 2016: from March 1 on, every day would be compared with the climatology of the following calendar day. So now I use the right climatology for the calendar day. It brings the March average up from 0.755°C to 0.783°C.
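The difference between the two lookups can be sketched in a few lines of Python. This is an illustration only, not the code used for the analysis; the function names are hypothetical, and mapping Feb 29 to the Feb 28 climatology is an assumption (the post says only that Feb 29 is omitted from the 365-day climatology).

```python
from datetime import date


def day_number_index(d: date) -> int:
    """Index by day number in the year, clamped to the last climatology entry.

    In a leap year every day after Feb 29 is shifted by one, so the
    effective leap day ends up at Dec 31.
    """
    return min(d.timetuple().tm_yday - 1, 364)


def climatology_index(d: date) -> int:
    """Index by calendar day into a 365-entry climatology (no Feb 29)."""
    day = d.day
    if d.month == 2 and d.day == 29:
        day = 28  # assumption: reuse the Feb 28 climatology for the leap day
    # 2001 is an arbitrary non-leap year used only to count calendar days.
    return date(2001, d.month, day).timetuple().tm_yday - 1


# In a leap year the two lookups disagree from March 1 onward.
print(day_number_index(date(2016, 3, 1)), climatology_index(date(2016, 3, 1)))
```

In a non-leap year the two functions return the same index for every date, which is why the issue did not show up in 2014 or 2015.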
But the other point of interest is that, except for this, the other data from ESRL PSD matches very well. I can't get sig995 there, which I use because of its greater completeness, but they do give monthly averages of surface temperature here. And the differences between my monthly average anomalies since Jan 2014 and theirs have a mean of 0.0003 and a standard deviation of 0.0012. So they agree to 3 significant figures.
That gives reassurance about not only the daily values, but also the equivalence of the sig995 I use and their surface air temperature.
Wednesday, April 6, 2016
There are actually only 4 Feb 29ths in your baseline. 2000 was not a leap year.
I have wondered about the proper way to deal with the leap-year issue in analysis of global temperature data. Simply omitting these days would seem to invite the introduction of a spurious four-year periodic signal in the time series. A small signal, admittedly, but still. Is there a standard method to deal with this annoyance?
The year 2000 was in fact a leap year. Years evenly divisible by 100 but not 400 are not leap years, but those divisible by 400 are. This closely approximates the true length of a year (365.242199 days vs. 365.2425 days).
You are correct. Sorry. My mistake.
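For reference, the Gregorian rule described in this exchange can be checked with a short Python sketch. The function name is hypothetical, and the year ranges are chosen only to match the 1994-2013 baseline and one full 400-year cycle.

```python
def is_leap(year: int) -> bool:
    """Gregorian rule: divisible by 4, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Number of Feb 29ths in the 1994-2013 baseline.
leap_days_in_baseline = sum(is_leap(y) for y in range(1994, 2014))

# Mean calendar-year length over one 400-year Gregorian cycle.
mean_year_length = 365 + sum(is_leap(y) for y in range(2000, 2400)) / 400

print(leap_days_in_baseline, mean_year_length)
```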
Anon,
There is a periodic signal anyway. Every year it slips back 6 hours in phase, and has to be corrected at some time. Leap year is the conventional remedy, which makes a sawtooth. I could do something smoother, but then I would be out of line with others.
You could do something different, such as interpolate calendar day values to a 365.25 day year with the appropriate centering based on the date and time of day of the June solstice, but that would more than likely just confuse or annoy people.
It's what I would do if I had some requirement to compare annual variations or trends that for some reason needed leap years to be dealt with in a careful, nitpicking sort of way.
I don't know where else to post this, but the NSIDC sea ice data is again showing strange spikes, now even in the NH.
http://www.moyhu.blogspot.de/p/latest-ice-and-temperature-data.html
Yes, I was looking at that. At first I thought they had just one figure wrong (million sq km), but I think it is more. And the SH is worse. I expect they will fix it soon.