## Monday, October 31, 2011

### ## GWPF is wrong, warming has not stopped.

The Daily Mail says that BEST results show that global warming has stopped, citing this graph from the GWPF. And there's an interview with Judith Curry talking about someone "hiding the decline".

Tamino has quite correctly taken this apart. This is his version of the GWPF graph:

As he points out, the claim of no warming is dependent on the dip in April/May 2010. Without it, there is warming, pretty much as expected.

So I checked the BEST data.txt to see why the data for these months had such large error bars, and were so out of line. It turns out that all the data they have for those months is from 47 Antarctic stations. By contrast, in March 2010 they have 14488 stations.
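A check of this kind can be sketched as follows. This is only an illustration: the `(station_id, year, month)` record layout and the 1% threshold are assumptions, not the actual data.txt format or any BEST procedure. The demo uses the two station counts quoted above.

```python
from collections import defaultdict

def stations_per_month(records):
    """Count distinct stations reporting in each (year, month).

    `records` is an iterable of (station_id, year, month) tuples. This
    layout is hypothetical, chosen only to keep the example short.
    """
    seen = defaultdict(set)
    for station_id, year, month in records:
        seen[(year, month)].add(station_id)
    return {key: len(ids) for key, ids in seen.items()}

def sparse_months(counts, fraction=0.01):
    """Return months whose station count is below `fraction` of the largest count."""
    largest = max(counts.values())
    return sorted(key for key, n in counts.items() if n < fraction * largest)

# Synthetic records using the counts quoted in the post:
# 14488 stations in March 2010, 47 in each of April and May 2010.
records = [(s, 2010, 3) for s in range(14488)]
records += [(s, 2010, m) for m in (4, 5) for s in range(47)]

counts = stations_per_month(records)
print(sparse_months(counts))  # [(2010, 4), (2010, 5)]
```

A month flagged this way is a candidate for exclusion from a trend fit, or at least for a closer look at which stations it contains.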

How this could have happened, I don't know. Anyway, a list of those 47 stations below the jump.

| Name | Lat |
|---|---|
| MIRNYJ | -66.5400 |
| SYOW | -69.0000 |
| Elizabeth | -82.6000 |
| AMUNDSEN-SCOTT | -90.0000 |
| NICO | -89.0000 |
| LETTAU | -82.5180 |
| GILL | -80.0090 |
| BYRD | -80.0000 |
| MARILYN | -79.9500 |
| SCHWERDTFEGER | -79.8750 |
| MINNA BLUFF | -78.5500 |
| VOSTOK | -78.4800 |
| LINDA | -78.4755 |
| PEGASUS NORTH | -77.9250 |
| FERRELL | -77.8840 |
| MCMURDO SOUND NAF | -77.8734 |
| NAVY OPERATED(AMOS) | -77.5000 |
| CAPE ROSS | -76.7170 |
| HALLEY BAY | -75.5000 |
| DOME C II | -75.1210 |
| MANUELA | -74.9460 |
| BUTLER ISLAND | -72.2060 |
| POSSESSION ISLAND | -71.8910 |
| NOVOLAZAREVSKAJA/LAZREV-1/61 | -70.7670 |
| GEORG VON NEUMAYER, FRG ANT.. | -70.6608 |
| ZHONGSHAN WX OFFICECI | -69.3670 |
| DAVIS | -68.5777 |
| ROTHERA POINT | -67.6062 |
| MAWSON | -67.6000 |
| LAW DOME SUMMIT | -66.7330 |
| DUMONT D'URVILLE, FRANCE ANT. | -66.6670 |
| CASEY | -66.3320 |
| B. A. VICECOMODORO MARAMBIO | -64.2330 |
| BASE ESPERANZA | -63.4000 |
| ISLAS ORCADAS B.N. | -60.7360 |
| MACQUARIE ISLAND | -54.4941 |
| MARION ISLAND | -46.8738 |
| GOUGH ISLAND | -40.3500 |
| Henry | -89.0000 |
| Limbert | -75.4000 |
| Mario Zucchelli | -74.7000 |
| GENERAL SAN MARTIN B.E. | -68.1300 |
| Larsen Ice Shelf | -66.9000 |
| BELLINGSHAUSEN | -62.2000 |
| TENIENTE JUBANY E. C., ARG. | -62.2000 |
| GRYTVIKEN S. GEORGIA IS. | -54.2700 |
| Campbell | -52.0000 |

## Thursday, October 27, 2011

### ## A Javascript index for Moyhu

This is the second month of running a TempLS analysis ahead of the major indices, and coincidentally with the same result. August's mean land/sea temperature anomaly came out the same as July - 0.45°C, relative to the 1961-1990 base period. The data and plots are at the latest ice and temperature data page.

Below is the graph (lat/lon) of temperature distribution for August, using GISS colors, levels and base period.

And here, from that data page, is the plot of the last four months:

## Tuesday, October 25, 2011

### ## BEST has same data as GHCN pre-1850

BEST extended its temperature series back a lot further than its predecessors, to about 1800. Many have assumed that this was because they have more early data, but this is not true. Their data set in this period is pretty much identical with GHCN.

You can see this from the KMZ file or the interactive Javascript plot. But I thought I should check it out in detail, so I have below a table of the actual stations, in one case from GHCNV2 (V3 is the same) and in the other BEST, taken from their data.txt file.

There are very few discrepancies. Out of about 275 pre-1850 stations, I count only 6 which BEST has and GHCN doesn't. There seem to be 9 that GHCN has and BEST doesn't, and 8 that BEST has apparently included twice. Details below the jump.

Here is a complete table of stations with pre-1850 data. The color scheme is:

| Color | Meaning |
|---|---|
| Black | BEST |
| Red | GHCN |
| Purple | BEST unique |
| Blue | GHCN unique |
| Green | BEST duplicate |

The first table is just of the exceptions. I have colored STYKKISHOLMU but not counted it, because it is in both, though BEST seems to have about 20 more years of data. Generally I did not color a station if the discrepancy in start date is less than three years. I have shortened names to 12 characters.

| Name | Lat | Lon | Alt | Start | End |
|---|---|---|---|---|---|
| BERLIN-DAHLE | 52.47 | 13.3 | 58 | 1769 | 2009 |
| Wien - Hohe | 48.2 | 16.4 | 203 | 1775 | 2009 |
| UCCLE BELGIU | 50.8 | 4.4 | 100 | 1794 | 2009 |
| SHIP V | 34 | 164 | -999 | 1808 | 1829 |
| EAST MILTON | 42.22 | -71.12 | 193 | 1811 | 2010 |
| KIEV GMO | 50.4 | 30.53 | 170 | 1812 | 2010 |
| SAMSUN | 41.28 | 36.33 | 4 | 1819 | 2010 |
| FORT SNILLIN | 44.9 | -93.2 | 245 | 1820 | 1982 |
| NEW YORK AVE | 40.63 | -73.96 | 7 | 1822 | 2009 |
| CHARLESTON I | 32.83 | -79.97 | 8 | 1823 | 2010 |
| STYKKISHOLMU | 65.08 | -22.72 | 10 | 1823 | 2010 |
| MACDILL AFB/ | 27.85 | -82.52 | 6 | 1825 | 2010 |
| ITHACA CORNE | 42.46 | -76.46 | 292 | 1827 | 2010 |
| KIRKWALL | 59 | -2.9 | 26 | 1827 | 2009 |
| Helsinki/Kai | 60.2 | 25 | 4 | 1829 | 2000 |
| BLUE HILL OB | 42.22 | -71.12 | 195 | 1831 | 2006 |
| STENNIS INTE | 30.33 | -89.44 | 6 | 1833 | 2010 |
| WAVELAND | 30.3 | -89.38 | 2 | 1833 | 2006 |
| AMHERST | 42.38 | -72.53 | 45 | 1836 | 2010 |
| Graz-Univers | 47.1 | 15.5 | 366 | 1837 | 2001 |
| SALZBURG-FLU | 47.8 | 13 | 439 | 1842 | 2010 |
| WEST CHESTER | 39.97 | -75.63 | 137 | 1843 | 2006 |
| STYKKISHOLMU | 65.08 | -22.73 | 8 | 1846 | 2010 |
| SAN FRANCISC | 37.62 | -122.38 | 5 | 1847 | 2010 |
| SATA FE COUN | 35.64 | -106.03 | 2039 | 1849 | |

And here is the full table, including exceptions, ordered by start date:

## Monday, October 24, 2011

### ## World coverage by decade of BEST, GHCN, GSOD and CRUTEM3

Update: I see that the plot does not show in Internet Explorer - I'm trying to find out why. It works in Firefox, Chrome and Safari.

In the previous post, I talked about a KMZ file which would display the stations of the four temperature databases, BEST, GHCN, GSOD and CRUTEM3. They were set in folders so that different starting dates could be displayed.

This post provides a JavaScript interactive display with the same general intent. There are a total of 52 images, each showing a single database in a single decade (approx). It shows the stations that returned any data in that decade.

Here you see just a single image. There are two legends, one with decade and one with database. You can click on the legend to bring up any combination. There is a control with a square and four triangles. The triangles just navigate up and down the menus in the way they point. The square enables you to toggle rapidly between the last two images shown. The intent is that you can set up pairings and compare.

The images are quite high resolution (1600x960 pixels) so you can use the Ctrl+ and Ctrl- controls to enlarge and see more detail. When you click on an image for the first time, there is a short pause while it downloads.

As you'll see, the daily data, GSOD, has good coverage at present, but doesn't go back far. GHCN goes a long way back, and the BEST coverage seems to use much the same data, with just a few more stations. But try it out - there is a lot you can test.

## Saturday, October 22, 2011

### ## A combined KMZ file for BEST, GHCN, GSOD and CRUTEM3

Update: I have modified the ALL4.kmz file, which you can download here. I updated the data.txt file, where BEST had mistakenly posted the MAX file. And I found the problem which had led to the previous version showing very little early BEST data. BEST had a column saying how many days in each month had readings, and I set a filter to require at least 10 days. However, much of the early period had this set to -99, which meant the data was rejected. I have removed the filter, and now there are a lot of pre-1850 sites.
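The effect of that filter is easy to reproduce. The sketch below uses made-up rows and hypothetical function names. `keep_strict` mimics the 10-day filter described above, while `keep_lenient` is one possible alternative that treats -99 as "unknown" instead of dropping the filter altogether, which is what was actually done for the KMZ.

```python
MISSING = -99  # sentinel reported in the days-per-month column

def keep_strict(days_reported, min_days=10):
    """The 10-day filter: rows carrying the -99 sentinel are silently rejected."""
    return days_reported >= min_days

def keep_lenient(days_reported, min_days=10):
    """Alternative (an assumption, not what the post did): keep rows with unknown counts."""
    return days_reported == MISSING or days_reported >= min_days

# Illustrative rows: (month label, days with readings). Values are invented.
rows = [("1820-01", -99), ("1820-02", -99), ("1990-01", 28), ("1990-02", 4)]

strict = [label for label, days in rows if keep_strict(days)]
lenient = [label for label, days in rows if keep_lenient(days)]

print(strict)   # ['1990-01']
print(lenient)  # ['1820-01', '1820-02', '1990-01']
```

The strict version loses every early row, which matches the nearly empty pre-1850 folder seen in the previous version of the file.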

This is a development foreshadowed in the previous post. I have put a combined .kmz file with data from 4 land station datasets. GHCN is actually v2, but there is very little difference at this level between v2 and v3. CRUTEM3 is the version released in July, discussed here. GSOD was discussed here.

There is now much more information - in fact, when you open the file in Google Earth, it looks colorful but cluttered. The pushpins are colored according to dataset - yellow for BEST, green for GHCN, red for GSOD, and a sort of dull green for CRUTEM3. They also vary in size - the smallest has 0-30 years of data, next has 31-60, and the largest has more than 60.

But the key to looking at it is that the data is stored in folders. At the top level, there is a folder for each dataset. At the next level down, they are classified according to start year of data. The ranges are 0-1850, 1851-80, 1881-1900, 1901-1920, 1921-30, 1931-40, 1941-50, 1951-60, 1961-70, 1971-80, 1981-90, 1991-2000, and 2001-2011. As I'll show in the next picture, in Google Earth you can toggle on/off at any level. If a dataset is on, you can toggle the year folders. If you want to see in any set the years before 1921, just toggle off the later folders.
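The folder and pushpin classification can be sketched as below. The function names are hypothetical and this is not the script that built ALL4.kmz. The boundaries follow the ranges listed above (with the decade after 1931-40 taken as 1941-50) and the size classes of 0-30, 31-60 and more than 60 years.

```python
from bisect import bisect_left

# Inclusive upper bound of each start-year folder.
FOLDER_UPPER = [1850, 1880, 1900, 1920, 1930, 1940, 1950,
                1960, 1970, 1980, 1990, 2000, 2011]

def folder_for(start_year):
    """Return a label such as '1851-1880' for the folder holding start_year."""
    i = bisect_left(FOLDER_UPPER, start_year)
    if i == len(FOLDER_UPPER):
        raise ValueError(f"start year {start_year} is after the last folder")
    lower = 0 if i == 0 else FOLDER_UPPER[i - 1] + 1
    return f"{lower}-{FOLDER_UPPER[i]}"

def pin_size(duration_years):
    """Pushpin size class from the length of record in years."""
    if duration_years <= 30:
        return "small"
    if duration_years <= 60:
        return "medium"
    return "large"

print(folder_for(1827), pin_size(2009 - 1827))  # 1851... no: see note below
```

Note that anything at or below 1850 lands in the first folder, so the `print` line shows `0-1850 large` for a station starting in 1827. The same rule means a placeholder start year such as -9.9999 also ends up in the earliest folder unless it is screened out first.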

Update - I had a warning briefly that there were spurious sites in the BEST folder. These had start years of -9.9999 and so went into the pre-1850 folder. That is fixed, but there's a new problem that the pre-1850 folder is almost empty. That could be real, but the BEST Team have done analyses for this period. See below for a discussion of the data. Fixed.

The Start year, End year and Duration have been added to the pop-up balloon that you get by clicking on a station. The file is called ALL4.kmz, and can be found here.

 Here's a GE snapshot of the toggle facility. You have to click on a few +s to see this. GSOD and CRUTEM3 are not visible. BEST shows only stations with data before 1971. GHCN is visible, but you'd have to open that menu to see which years. I had it matching BEST.

Added: To get the data years for BEST, I used the data.txt file in their PreliminaryTextDataset folder. That looks right, but I need to investigate to see if it includes everything. There seem to be early stations missing.

### ## A KMZ file for the BEST stations

Update - the latest post points to a more comprehensive KMZ file.

In the BEST text data zip (warning - 200 Mb), there is a listing of 36736 stations in the file site_detail.txt. I've made a KMZ file (1300 Kb), which is in the file repository under the name "best.kmz". If you download it and click on it, it will bring up Google Earth (if you have it installed) with all the stations marked with yellow pushpins.
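For readers who want to build something similar, here is a minimal sketch of writing a KMZ. It is not the script used for best.kmz, and the function names and balloon text are assumptions. It relies only on two facts about the format: a KMZ is a zip archive containing a KML file (conventionally `doc.kml`), and KML coordinates are written longitude first. The sample station is KIRKWALL, with the coordinates listed in the pre-1850 table.

```python
import io
import zipfile
from xml.sax.saxutils import escape

def placemark(name, lat, lon, alt):
    """One pushpin. KML coordinates are ordered longitude,latitude,altitude."""
    return (
        "<Placemark>"
        f"<name>{escape(name)}</name>"
        f"<description>Alt {alt} m, Lat {lat}, Lon {lon}</description>"
        f"<Point><coordinates>{lon},{lat},{alt}</coordinates></Point>"
        "</Placemark>"
    )

def build_kml(stations):
    """Wrap all placemarks in a single KML document string."""
    body = "".join(placemark(*s) for s in stations)
    return ('<?xml version="1.0" encoding="UTF-8"?>'
            '<kml xmlns="http://www.opengis.net/kml/2.2">'
            f"<Document>{body}</Document></kml>")

def write_kmz(stations, target):
    """Write a KMZ (zip archive holding doc.kml) to a path or file object."""
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as kmz:
        kmz.writestr("doc.kml", build_kml(stations))

# Write to memory here; pass a filename such as "stations.kmz" to save to disk.
buffer = io.BytesIO()
write_kmz([("KIRKWALL", 59, -2.9, 26)], buffer)
```

Folders, colors and pushpin sizes would need `<Folder>` and `<Style>` elements on top of this, which the sketch leaves out.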

If you click on a pushpin, a balloon will pop up with some minimal data (Name, IDs, Altitude, Lat/Lon). Later, if I get some analysis done, I'll produce versions with folders, colors and more info. Here's a GE snapshot:

## Friday, October 21, 2011

### ## The Berkeley Surface Temperature (BEST) analysis

I woke up this morning and saw that the BEST analysis papers and data were online. And there were already posts at Judith Curry's, WUWT, Tamino's and Stoat, and soon one by Zeke at the Blackboard. The dynamic blogroll here put Zeke's upbeat "Some interesting results from BEST" directly above Stoat's pithier "BEST is boring".

My expectations had been somewhat more in line with Stoat's, and indeed the consensus seems to be that it confirms what had been known. But I was interested in the analysis, mainly because the claimed novelty was the least squares method that TempLS uses, which I thought David Brillinger would have improved considerably.

One of the minor surprises was that DB was not on the list of authors of the main analysis paper, despite being one of the big names on the Team. However, he is acknowledged handsomely, and the sophistication of the statistics does indicate his contribution.

There have been a number of other analyses in the last two years which, like BEST, confirm the major indices. Some were land-only, others included sea surface temperatures.

So, with that preamble, here is a very preliminary discussion, mainly of the paper Berkeley Earth Temperature Averaging Process.

The main thing that has puzzled me about this project is that they have restricted themselves to land-only. I heard that the funding ran out before they could include oceans, but I thought the priority was odd. There isn't much point in fancy stats when 2/3 of the globe is missing. They go into a discussion of the different ways the land-only indices handle this problem. They, like NOAA, treat the stations as representing land only. GISS weights the stations in such a way that they attempt to represent the whole globe. BEST says the issue has had insufficient discussion. I think that the BEST discussion fills a much-needed gap in the literature. The land-only indices should really be of only historical interest.

Anyway, they have done some things in the modelling I had been putting off, specifically kriging. TempLS analysis is based on OLS, which effectively assumes that the residuals are independent, identically distributed. Of course that isn't true, and it is possible to improve. I had that on the back burner, mainly because I suspected it would make no real difference to the mean estimate, though it gives a better handle on the error. I think that BEST have affirmed that, which is valuable.

Their model (eq (4)) is
$$d_i(t_j) = \theta(t_j)+b_i+W(\vec{x}_i,t_j) + \epsilon_{i,j}$$
for the temperature at the i-th station at the j-th timestep, where $b_i$ is the baseline temp at station i and $\theta$ is the global variation. The TempLS equivalent (omitting the noise $\epsilon$) is:
$$d_{smy} \sim b_{sm} + \theta_y$$

TempLS decomposes the timestep into years y and months m, and s corresponds to i above. A big apparent difference is that TempLS includes seasonality in the baseline. It's not currently clear to me what BEST does here. The online summary says that they allow for it at a later stage. The TempLS global function can be assumed to vary with year only or with year and month. For trends it makes little difference to the result.
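To make the TempLS-style model concrete, here is a small weighted least squares sketch with station-month offsets b and a yearly global term θ. It is an illustration under stated assumptions, not the TempLS code: the function name, the dense design matrix and the zero-mean convention for θ are choices made for brevity. A zero weight marks a missing reading.

```python
import numpy as np

def fit_offsets_and_global(d, w):
    """Weighted least squares fit of d[s, m, y] ~ b[s, m] + theta[y].

    d: array of shape (stations, months, years); entries with zero weight are ignored.
    w: nonnegative weights of the same shape; zero marks a missing reading.
    Returns (b, theta) with theta shifted to zero mean, because the model
    determines b + theta only up to an additive constant.
    """
    S, M, Y = d.shape
    s, m, y = np.nonzero(w > 0)
    n = s.size
    A = np.zeros((n, S * M + Y))
    A[np.arange(n), s * M + m] = 1.0   # column selecting b[s, m]
    A[np.arange(n), S * M + y] = 1.0   # column selecting theta[y]
    sw = np.sqrt(w[s, m, y])           # scale rows by sqrt(weight)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], d[s, m, y] * sw, rcond=None)
    b = coef[:S * M].reshape(S, M)
    theta = coef[S * M:]
    shift = theta.mean()
    return b + shift, theta - shift
```

With thousands of stations a dense matrix like this would be impractical, so a real implementation would use sparse or iterative methods. The sketch only shows which unknowns are being fitted and how the weights enter.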

BEST's weather term W is estimated from correlations. I'm not sure why it is done this way rather than including the correlation in the weight function.

TempLS does a weighted fit in which the weight is by area density (and zero for no record). BEST put a lot more into their weighting. They use a kind of kriging which adds spatial structure, and they weight by the confidence they have in the reading. Naturally this is only very approximately estimated.

An interesting sub-discussion is on the number of degrees of freedom. They say that 180 well-placed stations should be enough, echoing a figure of Phil Jones. Others  say even sixty would do, and TempLS did this analysis, fairly successfully.

TempLS avoided homogenization (with little apparent effect); BEST introduce a new technique - the scalpel. Or fairly new - they break the series into two when they suspect a station discontinuity, and reduce the weighting nearby to compensate for the greater uncertainty. I suspect the loss of information here is rather great - with the lower weighting, of course, but also the much lower confidence in the baseline means of the fragments (assumed independent) rather than the higher confidence you would have in a longer period.
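The information loss can be put in rough numbers. The sketch below is a toy, not BEST's scalpel: the break position is simply given, the function names are hypothetical, and readings are assumed independent with unit standard deviation. It shows how the standard error of a fragment's baseline mean grows once a record is cut.

```python
import math

def scalpel(series, break_index):
    """Split one station record into two independent fragments at break_index."""
    if not 0 < break_index < len(series):
        raise ValueError("break_index must fall strictly inside the series")
    return series[:break_index], series[break_index:]

def baseline_standard_error(n_readings, sigma=1.0):
    """Standard error of a fragment's mean, assuming independent readings."""
    return sigma / math.sqrt(n_readings)

record = list(range(100))            # stand-in for 100 annual readings
early, late = scalpel(record, 25)

print(baseline_standard_error(len(record)))  # 0.1
print(baseline_standard_error(len(early)))   # 0.2
```

Cutting a 100-reading record at reading 25 doubles the uncertainty of the shorter fragment's baseline relative to the uncut record, before any reduction in weighting is applied.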

They also intervene to identify outliers - not by removing them but by reducing the weighting (with similar effect). There are some dangers in this, and one needs to think with a dataset like GHCN how it interacts with the efforts that have already been made to remove outliers. "Outliers" in a preprocessed data set are much more likely to represent real data.

They do a much more sophisticated uncertainty analysis than TempLS, and I like their jackknife approach, and will probably use it. On the other hand, I think their spatial uncertainty analysis is rather pointless when they cover land only.
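For reference, the basic delete-one jackknife looks like this. It is a generic textbook sketch, not the scheme in the BEST paper, and the sample values are invented. Each value is left out in turn, the estimate is recomputed, and the spread of those estimates gives the standard error.

```python
import numpy as np

def jackknife_se(values, estimator=np.mean):
    """Delete-one jackknife standard error of `estimator` applied to `values`."""
    values = np.asarray(values, dtype=float)
    n = values.size
    leave_one_out = np.array(
        [estimator(np.delete(values, i)) for i in range(n)]
    )
    spread = np.sum((leave_one_out - leave_one_out.mean()) ** 2)
    return np.sqrt((n - 1) / n * spread)

# Invented anomaly values, for illustration only.
sample = [0.10, 0.40, 0.35, 0.80, 0.20]
print(jackknife_se(sample))
```

For the plain mean this reproduces the usual s/√n, which is a handy sanity check. The appeal of the method is that the same recipe works for estimators with no simple error formula, such as a fitted global average.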

They break the station baseline temperatures (climatologies) down by latitude and altitude. I'm not sure why, as it isn't helpful for an analysis of global temperature. It is of independent interest, but the analysis could have been done separately on the derived baselines.

There will be several posts to come. I've only just started to look at the Matlab code. An ambition is to port it to R, but there is quite a lot of it. Currently the zip file of text data is somewhat corrupted - I've managed to get all but the flag.txt file, which probably is not needed to get started. Of course, the paper discussed here uses GHCN data only. Maybe I'll be able to compare BEST and GISS on the same data.

## Monday, October 17, 2011

### ## GISS Sep 11 - down 0.13°C

GISS data is out, so, as I have done recently, I'll compare with the TempLS calculation. The monthly global mean anomaly average was down from 0.61°C to 0.48°C (1951-1980 base). TempLS had a small decrease (0.025°C). NOAA had an even smaller one, UAH slightly larger, and RSS had a minute rise. Numbers and plots are here.

Below the jump are the GISS global plot and TempLS. Similar features, but GISS showed larger excursions. This is to be expected, as the TempLS fitting process has a smoothing effect.

Here is the GISS Plot:

And here is the TempLS plot. It comes from this post, which also showed an interactive spherical projection.

## Saturday, October 8, 2011

### ## September GMST - TempLS down from 0.444 to 0.42

The TempLS analysis, based on GHCNV3 land temperatures and the ERSST sea temps, showed a slight cooling from August. The August temp itself came down from 0.45°C to 0.444°C with inclusion of late data, and September's mean land/sea temperature anomaly came out as 0.42°C, relative to the 1961-1990 base period. The data and plots are at the latest ice and temperature data page.

Below is the graph (lat/lon) of temperature distribution for September. There is also an interactive world map.

This is done with the GISS colors and temperature intervals, and I'll post a comparison when GISS comes out.

And here, from the data page, is the plot of the last four months:

Finally, here is the interactive worldview of Sept surface temperatures. Just click on the yellow squares to see different views of the Earth. The bottom square is the S pole view. The first click takes a second or two for loading.