But the meme persists, and metrology handbooks get quoted - here a JCGM guide. The theory quoted, though, is for a single measurement, where repeated measurements can't overcome lack of resolution. That isn't what happens in climate; instead, a whole lot of different measurements are averaged.
Of course, averaging does improve accuracy. That's why people incur cost to obtain large samples. In this post, I'll follow my comment at WUWT by taking 13 months of recent daily maxima in Melbourne, given by BoM to 1 decimal place, and showing that if you round off that decimal, emulating a thermometer read to the nearest degree, the difference to the monthly average is only of order 0.05°C, far less than the resolution lost. But first, I'll outline some of the theory.
Law of Large Numbers
This goes back to Bernoulli. There was much confusion at WUWT with the central limit theorem, which is not at all the same. The Law of Large Numbers (LoLN) deals with convergence of a sample mean to a population mean with larger samples (there are many formulations), whereas the CLT makes the more interesting claim that the sample mean, as a random variable itself, tends toward a normal distribution, even though the individual samples may not have been normally distributed. There are of course caveats. The LoLN is what is needed here, and at WUWT a somewhat informal Wiki statement was mentioned: "The average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed." The author (whose comment oddly disappeared) was reproved by Willis, dissing Wiki and preferring: "The law of large numbers formulated in modern mathematical language reads as follows: assume that X1, X2, . . . is a sequence of uncorrelated and identically distributed random variables having finite mean μ …" and emphasising uncorrelated, iid etc.
The general idea of LoLN seems simple nowadays. If you add two independent random variables, the variance of the sum is the sum of the variances (subject to conditions like that they actually do have variances, but not requiring normality or identical distributions). If you have a set of independent random variables εi, consider a weighted average
A = Σ wᵢεᵢ, with Σ wᵢ = 1
Scaling can be absorbed in the weights, so the εᵢ might as well be unit variables. Then the variance of A is Σ wᵢ². If A is a simple mean of N variables, wᵢ = 1/N and the sum is 1/N. But if not, or if the variables have different variances, the convergence of the mean is still just a property of that diminishing sum.
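That variance arithmetic is easy to check numerically. Here is a minimal Python sketch, assuming simple-mean weights w = 1/N and standard normal variables (neither is required by the theory, just convenient):

```python
import random

# Check: for independent unit-variance variables with weights summing to 1,
# the variance of A = sum(w_i * e_i) is sum(w_i^2).  With simple-mean
# weights w_i = 1/N that sum is 1/N, so the spread of A shrinks as N grows.
random.seed(0)
N, trials = 10, 50000
w = [1.0 / N] * N
samples = [sum(wi * random.gauss(0, 1) for wi in w) for _ in range(trials)]
var_A = sum(s * s for s in samples) / trials
print(var_A, sum(wi * wi for wi in w))   # both close to 1/N = 0.1
```

Changing the weights to anything else summing to 1 just replaces 1/N by Σ wᵢ², which is what drives the convergence.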
What about correlation? If the unit variables have a correlation matrix K, then the combined variance is Σ wᵢKᵢⱼwⱼ. Does that converge? Well, it depends on K. If its coefficients do not tend to zero away from the diagonal, it may not. Again, if the w are a uniform 1/N, the sum runs over all N² coefficients. But usually correlation does diminish as the variables become more separated in time or space.
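A sketch of the correlated case, assuming (purely for illustration) an AR(1) structure where the correlation between variables i and j is ρ^|i−j|, a simple example of coefficients decaying away from the diagonal:

```python
import math, random

# Correlated case: AR(1) unit-variance variables, corr(i,j) = rho^|i-j|.
# The variance of the simple mean is sum_ij w_i K_ij w_j with w_i = 1/N.
# It exceeds the independent value 1/N, but still shrinks with N because
# the correlation decays away from the diagonal.
random.seed(2)
N, rho, trials = 30, 0.5, 20000
analytic = sum(rho ** abs(i - j) for i in range(N) for j in range(N)) / N**2
means = []
for _ in range(trials):
    x, xs = random.gauss(0, 1), []
    for _ in range(N):
        xs.append(x)
        x = rho * x + math.sqrt(1 - rho * rho) * random.gauss(0, 1)
    means.append(sum(xs) / N)
sim = sum(m * m for m in means) / trials
print(sim, analytic, 1 / N)   # sim matches analytic; both > 1/N but still small
```

With a correlation that did not decay (say Kᵢⱼ = 0.5 everywhere off the diagonal), the analytic sum would not go to zero as N grows.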
I've included this to show where LoLN comes from, and that lack of iid is not a show stopper.
Resolution
To be specific, suppose we have a thermometer read to a resolution of 1°C, and a succession of temperatures T are coming in with a spread much larger than 1. Suppose we actually know the T values, but they are then read at that resolution - ie rounded. This is equivalent to displacing each reading Tᵢ by an amount εᵢ of up to 0.5°C, taking it to the nearest integer. That JCGM guide puts it thus (via Pat Frank at WUWT):
"If the resolution of the indicating device is δx, the value of the stimulus that produces a given indication X can lie with equal probability anywhere in the interval X − δx/2 to X + δx/2. The stimulus is thus described by a rectangular probability distribution of width δx with variance u² = (δx)²/12, implying a standard uncertainty of u = 0.29δx for any indication."
So the cost to the accuracy of the mean is the mean of those rounding variables εᵢ. It is very reasonable to assume them independent. Although temperatures themselves may be correlated, the fractional parts will be much less so, provided the resolution is well finer than the total temperature range. The distributions are uniform, so the standard error of the mean of N such variables is sqrt(1/(12N)). It tends to zero with large N. That is, the mean discrepancy between rounded and exact, Σ εᵢ/N, behaves like sqrt(1/(12N)).
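The sqrt(1/(12N)) behaviour can be checked with a quick Monte Carlo sketch (the 10-40°C range and N = 31, a month of daily readings, are just illustrative choices):

```python
import math, random

# Monte Carlo check of sqrt(1/(12N)): draw N "exact" temperatures, round
# each to the nearest degree, and record how far the rounded mean lands
# from the exact mean.  Repeat many times and take the sd of those gaps.
random.seed(1)
N, trials = 31, 20000
diffs = []
for _ in range(trials):
    xs = [random.uniform(10, 40) for _ in range(N)]
    diffs.append(sum(round(x) - x for x in xs) / N)
sd = math.sqrt(sum(d * d for d in diffs) / trials)
print(sd, math.sqrt(1 / (12 * N)))   # both about 0.052
```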
You may ask, what if the rounding isn't perfect? What if, say, .4 is sometimes rounded up instead of down? That just changes the uniform distribution to something similar with a slightly different variance.
Example - Melbourne maxima.
On pages like this, BoM shows the daily max for each recent month in Melbourne, to one decimal place. I have placed here a zipfile which contains a RData file (to load in R) called melb12.sav, which has a list of dataframes with full data for those months. There is also a file called melb13.csv, which has just the maximum temperatures that were used in this test. Here is last month (Mar):
33.7 34.7 23.9 33.0 23.7 25.2 24.9 38.9 28.5 22.1 26.1 22.3 23.2 21.3 26.8 31.4 32.5 19.5 18.8 23.3 23.5 24.3 28.8 21.2 20.4 20.2 19.9 19.2 17.9 18.7 22.7
Suppose we had a thermometer reading to only 1°C - so all these were rounded, as in the JCGM description. For the last 13 months, here are the means for the BoM (1 dp) and for that thermometer:
      Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec   Jan   Feb   Mar
1 dp: 22.72 19.24 17.13 14.43 13.29 13.85 17.26 24.33 22.73 27.45 25.98 25.10 24.86
0 dp: 22.77 19.27 17.13 14.37 13.29 13.84 17.33 24.35 22.67 27.48 26.00 25.17 24.84
diff:  0.05  0.03  0.00 -0.06  0.00 -0.01  0.08  0.03 -0.06  0.03  0.02  0.08 -0.02
The middle row, measured by day to 1°C, has a far more accurate mean than that resolution. As a check, the sd of the difference (bottom row) is expected from above to be sqrt(1/(12×31)) (a slight approximation, since months vary in length), which is 0.052. The sd of the diffs shown is 0.045. The monthly average at 1°C resolution is accurate to about 0.05°C.
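The last column can be verified directly from the March data listed above. A Python sketch; note that Python's round(), like R's, rounds halves to even, which appears to be what the 0 dp row assumes:

```python
# Verify the Melbourne March figures: daily maxima to 1 dp (from the post),
# then rounded to the nearest degree (round-half-to-even, as in R/Python).
tmax = [33.7, 34.7, 23.9, 33.0, 23.7, 25.2, 24.9, 38.9, 28.5, 22.1, 26.1,
        22.3, 23.2, 21.3, 26.8, 31.4, 32.5, 19.5, 18.8, 23.3, 23.5, 24.3,
        28.8, 21.2, 20.4, 20.2, 19.9, 19.2, 17.9, 18.7, 22.7]
mean_1dp = sum(tmax) / len(tmax)
mean_0dp = sum(round(t) for t in tmax) / len(tmax)
print(round(mean_1dp, 2), round(mean_0dp, 2), round(mean_0dp - mean_1dp, 2))
# 24.86  24.84  -0.02, matching the table; cf sqrt(1/(12*31)) ≈ 0.052
```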
I think there's another point that some of the WUWTers may be stumbling over. The thing we're trying to determine is the global mean temperature; the "random variables" being discussed here should be the errors on any given temperature measurement. It's these errors that are uncorrelated; these are the random independent variables under discussion.
Temperatures are obviously correlated in both time and space.
The spatial and temporal heterogeneity adds an extra layer of complexity to calculating a global mean temperature that is sure to confuse some.
A good example of this is the technique of oversampling and decimation used to increase the precision of an analog to digital converter.
http://www.atmel.com/images/doc8003.pdf
Might mention it over there. Eli is banned.
where? at WUWT?
Thanks, Eli. I think the thread at WUWT has now expired, but I'll certainly bring it up if there is a recurrence. It's a very interesting technique, although the extra complexities of adding noise etc could meet resistance there.
Hi Nick, the need to add noise is well illustrated by the example over there that somebunny used of what do you get when you have a ruler with only inch marks and a board to measure. If you don't add noise you get the same number all the time and no improvement in precision. If you add enough unbiased noise to jiggle the measurement between marks on the ruler, the precision will improve.
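Eli's ruler can be put in code. A sketch, where the board length of 3.3 and the ±half-mark uniform dither are invented for illustration:

```python
import random

# A board of true length 3.3 read against integer inch marks.  Without
# dither every reading rounds to 3, and averaging gains nothing.  With
# zero-mean noise spanning a full mark, readings land on 3 or 4 with
# the right probabilities and their average converges toward 3.3.
random.seed(3)
true_len, n = 3.3, 20000
plain = sum(round(true_len) for _ in range(n)) / n
dithered = sum(round(true_len + random.uniform(-0.5, 0.5)) for _ in range(n)) / n
print(plain, dithered)   # 3.0 versus roughly 3.3
```

This is the same mechanism as the ADC oversampling in the Atmel note: the noise must span at least one quantisation step for the averaging to recover the fraction.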
Anyhow, Eli tried to post this and they did. Only banned mostly now Eli guesses.
That was a tough slog over at WUWT. The level of aggressive numbskullery there is that site's version of a skunk's odor warning others to keep away. One of the clowns commenting even claimed that "we may even have had cooling over the last 150 years and would not know!"
But it's informative to see the sheer range of misunderstandings and basic errors that so many WUWT contributors and commenters hold just on basic facts on instrumentation and simple statistics. How such individuals can go from that base straight to questioning much more complex analyses is a good example of the Dunning-Kruger effect in action.
The problem is that not understanding the science is, unfortunately but inevitably, a barrier to understanding the science.
As someone once said - "a good man has got to know his limitations"
As a metrologist, I'm surprised anyone would use metrology as an argument against averaging. One must take a series of readings and average if only to know the short-term repeatability to calculate uncertainties. And of course anyone with half a brain, metrologist or not, quickly understands that averaging adds precision.
Perhaps the mental stumbling block is that averaging readings from one device adds precision - not accuracy, but averaging multiple devices adds both precision and accuracy.
I once performed a simple experiment where I showed co-workers that I could get more accurate results from twenty-five 6 1/2 digit voltmeters than from one 8 1/2 digit voltmeter - even though the 8 1/2 digit voltmeter has a presumed accuracy 50 times better than the 6 1/2 digit voltmeters. I did 'cheat' a little by using statistical bootstrapping to increase the effective sample size from 25 to 1000. I would have to go back and find the final results, but the reduction in error was approximately from 85 ppm for a single 6 1/2 digit voltmeter to low single digit ppm error after bootstrapping.
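Kevin's experiment can be sketched in simplified form. All the numbers below are invented for illustration, and the bootstrapping step is omitted; the point is just that 25 instruments with independent errors beat one of them by a factor of sqrt(25):

```python
import math, random

# Simplified sketch (numbers invented, bootstrapping omitted): each of 25
# meters reads a 10 V source with its own independent error of sd 85 ppm.
# Averaging the 25 readings cuts the rms error by a factor of sqrt(25).
random.seed(4)
true_v, trials = 10.0, 5000
errs = []
for _ in range(trials):
    readings = [true_v * (1 + random.gauss(0, 85e-6)) for _ in range(25)]
    errs.append((sum(readings) / 25 - true_v) / true_v)
rms_ppm = math.sqrt(sum(e * e for e in errs) / trials) * 1e6
print(rms_ppm)   # about 85/5 = 17 ppm
```

As Kevin notes below, the crucial assumption is independence: a shared calibration bias would not average away.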
Help me with this Kevin. I can see the improvement in accuracy averaging thermometer readings. I cannot see it with the voltmeters, unless you assume that their errors are evenly distributed about a correct indication. Maybe I don't understand what accuracy means in this context.
j ferguson - you have it in one; we expect any group of independent readings to have a distribution around the 'true' value. The main caution here for me was that I had to find 25 voltmeters that were not correlated. I.e., if they had all been calibrated by the same laboratory, then we might suspect a bias in one direction or another. Similarly, if they had all come off a production line the day before, then we would also suspect a systematic bias.
Having access to large amounts of data on instrument readings I am constantly reminded that "math works" :)
Thanks Kevin. It was easy for me to imagine that all the meters were sent to the same lab for calibration and were all off by the error of the 'standard'.
It is so embarrassing that these people pretend to know science better than scientists.
There is one case, however, where I do worry a little about reading a thermometer to one degree. In the 19th century, the thermometer used to measure the sea surface temperature was typically stored in the warm cabin. It then had to be stirred in a bucket of sea water until the thermometer reached the temperature of the water. I wonder whether the voluntary observers / sailors waited long enough for the warm bias to be within 0.1°C when they read the thermometer to 1°C.
(The abbreviation LOLN is not explained and written as LLON a few times.)
Victor,
Thanks for the LLON warning - all fixed now, I hope. Yes, I said the thread was interesting, but it ended up in sheer nuttiness from Pat Frank.
I tried to stick to just the actual thermometer reading, without getting into whether the reading was of the correct thing, or even whether the thermometer had stabilized. Yes, there are certainly ways in which bias could be introduced, even into reading - for example, if people tend to round down when they should round up.
As Victor says, it is rather amazing that these people really do seem to think that they understand this better than professionals who've worked on this for a very long time. I wonder if it isn't simply different environments. I've been sitting through scientific seminars for a very long time. Something I certainly learned quite early on is that if you think the speaker has made some kind of silly mistake, it is more likely that I was wrong, or misunderstood what was being done, than that the speaker made the kind of mistake that seemed obvious to someone who had only just encountered their work.
I fully agree. Until the moment you are an expert yourself, it is a good idea to practise humility, train your ability to ask questions and to listen. The expert likely knows something you do not (yet).
It's a good idea to go on practising humility even beyond the moment when you believe yourself to be an expert.
:-) Yes. I was thinking that when you are an expert there may also be a case where you think everyone is wrong and then you should also have the courage to say so.
Yup, all these experts such as Richard Lindzen, Judith Curry, Murray Salby, etc. Better listen to them, lol.
...and Then There's Physics said
"Something I certainly learned quite early on is that if you think the speaker has made some kind of silly mistake that it is more likely that I was wrong"
Good that you are speaking for yourself ... when I listen to a speaker make up "just so" stories to explain climate science, I realize that there is so much more left to understand.
“Anti-intellectualism has been a constant thread winding its way through our political and cultural life, nurtured by the false notion that democracy means that 'my ignorance is just as good as your knowledge.'” --Asimov
The problem is: when you have one person who's ignorant of a field and yet certain they're right about it, how do you communicate across that gulf of understanding?
Competence is something that must be earned and then demonstrated. But hell if I can get the 'skeptics' to grok that.
"The problem is: when you have one person who's ignorant of a field and yet certain they're right about it, how do you communicate across that gulf of understanding? "
The converse problem is the one person who claims to be the authority in a field and is certain that they are correct on a topic, and then uses that to their advantage in halting further progress.
The best example of that is Richard Lindzen, who in my opinion has done no favors to the discipline of atmospheric physics. I found this quote recently: " More importantly, he's been wrong about nearly every major climate argument he's made over the past two decades. Lindzen is arguably the climate scientist who's been the wrongest, longest. " I would amend that to the wrongest, longest, and LOUDEST.
Hi Nick,
I hope you're doing well. This is Janet from WUWT.
I'm currently engaged in a debate with Pat Frank regarding uncertainty in global temperature records. My main argument is that if the uncertainty were as large as Pat claims, we would expect to see significant divergence among independent datasets—particularly between satellite and surface records. But in practice, we don’t. Most discrepancies seem to stem from differences in methodology, not from massive underlying uncertainty.
Pat, however, tends to dismiss independent corroboration, asserting that his paper is correct regardless. When I brought up the strong agreement between the USCRN and the adjusted U.S. surface temperature record, he dismissed it as coincidental. He also argues that temperature adjustments have done more harm than good, particularly outside the U.S.
As support for his view, he linked to an older article by Willis from December 2009:
https://wattsupwiththat.com/2009/12/08/the-smoking-gun-at-darwin-zero/
I'm not entirely sure how to interpret this article. Given that this is a WUWT article, I am skeptical. I'm not as familiar with the process of homogenization as you are, which is why I wanted to reach out to you for your perspective on this article.
Our conversation is ongoing and can be found here:
https://wattsupwiththat.com/2025/07/08/climate-oscillations-7-the-pacific-mean-sst/#comment-4091447
Hi Janet,
I'm well, thanks, and very glad to see your contributions at WUWT. Yes, the absurdity of Pat's uncertainties is shown not only by agreement between datasets, but in the datasets themselves. Over time, you'd expect to see them fluctuate near the limits of the uncertainty envelope. But they don't.
I've written quite a lot about Pat Frank's uncertainty stuff; you can search for Pat Frank at Moyhu. Here is an example
https://moyhu.blogspot.com/2019/09/another-round-of-pat-franks-propagation.html
As it happens, responding to Willis was Moyhu's very first blog post
https://moyhu.blogspot.com/2009/12/darwin-and-ghcn-adjustments-willis.html
Unfortunately the diagrams have become invisible; I'll try to restore them. But the essence is that you can look at a histogram of trends. There is, for the US, a slight bias towards positive, caused mainly in the USHCN data of the day by TOBS adjustment (there are reasons). But it's certainly not all one way. I showed Coonabarabran, in NSW, where adjustments had an equivalent cooling effect. No-one cared.
But I think the best counter is just to calculate the average with unadjusted data. That is what I do with TempLS. I'd prefer to use adjusted, but it actually makes little difference, as I showed here:
https://moyhu.blogspot.com/2015/02/homogenisation-makes-little-difference.html
with breakdowns here:
https://moyhu.blogspot.com/2015/02/breakdown-of-effects-of-ghcn-adjustments.html
Hi Janet,
I restored the images at the old Darwin post
Nick
Thanks for your help, Nick. This site is an excellent resource. Please keep up the great work. I just submitted a reply, but it seems to have gone into moderation. Any idea why?
WUWT may be many things, but it's never struck me as a place that censors discussion. I also doubt it's anything to do with my behavior. After all, even Michael Flynn, who regularly insults people for endorsing the greenhouse effect, seems to post without issue. I do use the dreaded 'd-word,' but that's about the extent of it.
Hi Janet,
Thanks. WUWT is, in recent years, tolerant of dissent, though they do get upset about deniers. A common reason for moderation is having more than three links.