tag:blogger.com,1999:blog-7729093380675162051.post2670589452930230548..comments2024-03-28T13:56:47.604+11:00Comments on moyhu: Ryan's code - S09 with more PCsNick Stokeshttp://www.blogger.com/profile/06377413236983002873noreply@blogger.comBlogger43125tag:blogger.com,1999:blog-7729093380675162051.post-29502123571373029862011-02-22T14:49:34.735+11:002011-02-22T14:49:34.735+11:00Nic and cce
On your OIv2 discussion look in the R...Nic and cce<br /><br />On your OIv2 discussion, look at the Reynolds et al. 2002 paper. It appears to describe the OIv2 method as it was first developed, and might answer your questions:<br />ftp://ftp.emc.ncep.noaa.gov/cmb/sst/papers/oiv2.pdfAnonymousnoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-76803031715466645002011-02-22T03:37:16.073+11:002011-02-22T03:37:16.073+11:00Nic,
I think the in situ data is used to correct ...Nic,<br /><br />I think the in situ data is used to correct the AVHRR and AMSR measurements. I may be wrong as the Reynolds paper isn't completely clear, but the only time the in situ data is described, it is with regard to correcting biases in the satellite data.ccehttps://www.blogger.com/profile/03646816472336349526noreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-12579097462389942552011-02-22T01:31:14.272+11:002011-02-22T01:31:14.272+11:00cce,
The reference in the readme file at the URL ...cce,<br /><br />The reference in the readme file at the URL you gave is to a 2006/7 paper by Reynolds, which says in-situ data is also used. AMSR data is used as well as AVHRR in recent years. But the dataset may indeed provide a better spatial correlation matrix than AVHRR data alone.<br /><br />I had misunderstood what you were suggesting about the GSOD data. I will look at this when I return to working on this area.NicLnoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-34516463789587954822011-02-21T14:00:48.090+11:002011-02-21T14:00:48.090+11:00Nic,
As far as I know, the Reynolds OIv2 data is ...Nic,<br /><br />As far as I know, the Reynolds OIv2 data is a pure satellite product (1982+). This is in contrast to ERSST (also Reynolds), which is a traditional SST analysis.<br /><br />I was offering the GSOD data as an alternative to using satellite data for the land. There are 3+ decades of relatively dense land measurements. Applied to a grid, you'd get a pretty good representation of almost all of the land surface. In other words, establish the spatial relationships with the GSOD data, and then use a "really rural" subset of GHCN to extend it backwards in time.ccehttps://www.blogger.com/profile/03646816472336349526noreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-66267207375101118442011-02-21T09:38:01.584+11:002011-02-21T09:38:01.584+11:00cce,
At the time I was doing this work I had a re...cce,<br /><br />At the time I was doing this work I had a relatively limited knowledge of the available datasets (I still do, but less so!), and was really just experimenting. I have since concluded that using AVHRR rather than TLT satellite data would make much better sense for SST, but I haven't got around to doing so. <br /><br />The Reynolds SST data is already a composite of satellite and in-situ readings, made using a different method from mine, so it wouldn't really make sense to use it, IMO.<br /><br />I agree that the MODIS land temperature data may be preferable to the TLT data, but I doubt the length of data available is long enough yet.<br /><br />I agree that processing data regionally would improve results. <br /><br />I did try to pick truly rural stations in North America, insofar as I could. Elsewhere, in most regions I couldn't find very many long record truly rural stations in the GHCN database. The GSOD data at The Whiteboard you refer to (posted some time after my work) seems to have very limited global coverage pre 1950, and therefore may not be of much help.NicLnoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-28198901501702304432011-02-21T04:15:26.302+11:002011-02-21T04:15:26.302+11:00Nic,
That's interesting. I'm curious as ...Nic,<br /><br />That's interesting. I'm curious as to why you didn't use the Reynolds SST, which would give you a much better correlation than TLT.<br /><br />http://www.ncdc.noaa.gov/oa/climate/research/sst/griddata.php<br /><br />For land, there is Ron Broberg's GSOD data, which isn't 100% spatially complete, but a lot denser than GHCN after the 1970s. e.g.<br /><br />http://rhinohide.wordpress.com/2010/06/27/gistemp-filtered-gsod-stations-a-pretty-chart/<br /><br />There is also the MODIS/Terra LST since March 2000 (not sure if that's long enough). It has its own various problems due to changing land surface, but it might also be an option.<br /><br />https://lpdaac.usgs.gov/lpdaac/products/modis_products_table/land_surface_temperature_emissivity/monthly_l3_global_0_05deg_cmg/mod11c3<br /><br />For your long temperature series, you could choose stations that are "really rural" -- ones that don't show up in or near the union of all the various Urbanity proxies (such as those that Ron Broberg or Steve Mosher have been working on).<br /><br />You could process various regions separately to avoid some of the problems you encountered. You might also choose the datasource that works best for a particular area.ccehttps://www.blogger.com/profile/03646816472336349526noreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-76328078824829550452011-02-21T02:39:22.162+11:002011-02-21T02:39:22.162+11:00CCE,
You might like to look at a post I did at tA...CCE, <br />You might like to look at a post I did at tAV a year ago: <br />http://noconsensus.wordpress.com/2010/02/13/7686/<br /><br />I combined long record GHCN station data with satellite TLT data (available worldwide except near the poles, unlike AVHRR data) to generate a spatially complete temperature history using RegEM for station infilling and RLS for spatial reconstruction.NicLnoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-68057236609267279182011-02-20T10:39:26.222+11:002011-02-20T10:39:26.222+11:00Every region becomes sparse the further back you g...Every region becomes sparse the further back you go, so I think it would be useful just about everywhere. You could divide the world into logical regions and then assemble the results.<br /><br />Speaking of the Arctic, how would it work with sea ice? The AVHRR data would only work for areas and times of the year where it is guaranteed to be sea ice or guaranteed to be open water. I would think that (and the sea-ice covered areas down south) would be the trickiest of all.ccehttps://www.blogger.com/profile/03646816472336349526noreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-40277489232763759502011-02-20T09:24:59.807+11:002011-02-20T09:24:59.807+11:00CCE,
Yes, I think it might be useful for the Arcti...CCE,<br />Yes, I think it might be useful for the Arctic, and maybe places like Central Africa.<br /><br />The key idea is the use of the satellite eigenvectors (EOFs) as fitting functions for sparse station regions. That is diluted when those EOFs get to look like any other set of smooth functions that you might choose. I suspect that would happen if it was tried on a global scale. But it would be good for other specific regions.Nick Stokeshttps://www.blogger.com/profile/06377413236983002873noreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-75695312850357250842011-02-20T07:54:35.256+11:002011-02-20T07:54:35.256+11:00This may be obvious for the smart people among us,...This may be obvious for the smart people among us, but can't this method be applied to other parts of the world? We have AVHRR SST since 1982. Ron Broberg's GSOD data (and presumably the upcoming GHCN updates) are very dense for certain decades. Shouldn't we be able to construct a Lukewarmer approved method for interpolating the globe? I know that the NCDC analysis does something similar, but would there be benefits to this method?<br /><br />Sorry if this is somewhat off topic.ccehttps://www.blogger.com/profile/03646816472336349526noreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-74034674541735914492011-02-19T10:57:07.982+11:002011-02-19T10:57:07.982+11:00Thanks, Ryan, I'll do that.Thanks, Ryan, I'll do that.Nick Stokeshttps://www.blogger.com/profile/06377413236983002873noreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-8074473543629929662011-02-19T09:52:45.443+11:002011-02-19T09:52:45.443+11:00Nick,
Exactly. This is why, for the test I did, ...Nick,<br /><br />Exactly. This is why, for the test I did, I spliced the manned and AWS Byrd together (both for our reconstruction and S09). Otherwise, it's almost Q.E.D. that there is no difference.<br /><br />If you splice them together, then you can add trends and expect a response. But you have to splice first.Ryan Ohttp://www.climateaudit.orgnoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-50747288470994308422011-02-19T09:50:31.045+11:002011-02-19T09:50:31.045+11:00NicL,
Thanks for that. I'll look at that test....NicL,<br />Thanks for that. I'll look at that test. I've been looking at the test currently done for S09 with Byrd and Russkaya, and I see that Byrd AWS is not in the set usually used for S09. And Byrd manned seems to cover only about the first 15 years from 1957, and Russkaya a similar period. So just adding trend to the existing (manned) data wouldn't be expected to make a big perturbation overall (which is what I'm finding). Then it seems the test must be done including AWS somehow.Nick Stokeshttps://www.blogger.com/profile/06377413236983002873noreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-89027294008227682512011-02-19T09:46:06.878+11:002011-02-19T09:46:06.878+11:00KevinC, that is a very good question. It's so...KevinC, that is a very good question. It's something I would have to look at, to be sure. The short answer (really a guess) is "yes, I think they would", but it would need to be examined more carefully.<br /><br />Cool to hear that Ryan and Nic have already done this. It's the starting place for any sort of realistic evaluation of a method, its strengths and weaknesses.<br /><br />The next stage of a Monte Carlo analysis is to add realistic signal distortions, like loss of data for part of the year, instrument moves, change of instruments, etc. Ideally, you'd like to put bounds on the error that these introduce to your signal.<br /><br />Finally, you'd probably want your method to enforce causality in your data. AVHRR almost certainly is going to have an acausal component (I mean in the signal processing sense, not the physics faster than light sense). I achieve this, when needed, by "wavenumber filtering" my signal, and tossing out unphysical/unrealizable portions of the Fourier domain. 
<br /><br />How well this sort of methodology would work when applied to these sorts of met data would be a research project.Carrickhttps://www.blogger.com/profile/03476050886656768837noreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-45293123120152126612011-02-19T07:00:45.125+11:002011-02-19T07:00:45.125+11:00Kevin,
I haven't found it to make a differenc...Kevin,<br /><br />I haven't found it to make a difference when infilling other temperature series, be it highly spatially coherent (like Japan) or incoherent (all of North America). It seems to work quite well.Ryan Ohttp://www.climateaudit.orgnoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-75461270093159678792011-02-19T06:50:11.929+11:002011-02-19T06:50:11.929+11:00Carrick, Ryan - oh, yes; that sounds like a good i...Carrick, Ryan - oh, yes; that sounds like a good idea. Keeping the temporal amplitudes gives us the right sort of time variation.<br /><br />Do we maintain the spatial correlations too? Yes, I think so, because the range of amplitudes as a function of spatial frequency is also preserved.<br /><br />The only thing you might lose is any implicit phase relationships in the spatial FT corresponding to systematic features in the original spatial data. For example large flat areas (e.g. sea?) lead to relationships between nearby phases in Fourier space. (This comes from the convolution theorem - if you can multiply your 2d dataset by a mask function without changing it, then you can convolve the FT with the FT of the mask. If the mask is low resolution, its FT is concentrated around the origin, so the FT makes relationships between nearby phases.) These phase relationships will be lost. My gut feeling is that it won't make a significant difference though.<br /><br />Kevin CAnonymousnoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-23600673207490785392011-02-19T05:55:41.327+11:002011-02-19T05:55:41.327+11:00Nick, Congratulations on an excellent thread. It ...Nick, Congratulations on an excellent thread. It is good to have a place for dispassionate technical discussions about the math & stats involved. 
And I like your hyper-linked version of Ryan's code - very user friendly!<br /><br />I have another suggested test that you might like to use for comparing methods for infilling the Antarctic ground station data. You need a station set that includes both Byrd manned station and Byrd AWS. So you could either use the OLMC10 63 station set, or just add Byrd AWS to the S09 42 station set. Then compare the trends of the infilled Byrd manned station and AWS. They should in principle be the same (although their absolute temperatures need not be: the two station locations are 2.5km apart and they may have different microclimates, sensor heights, etc). <br /><br />I think that you will find the relationship between the two Byrd station trends at different RegEM TTLS truncation levels quite interesting. Then compare those differences with the difference using iridge.NicLnoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-47391671052660113462011-02-19T05:38:24.227+11:002011-02-19T05:38:24.227+11:00Carrick,
Nic Lewis and I have done that with much...Carrick,<br /><br />Nic Lewis and I have done that with much success when evaluating different regression methods for infilling instrumental data. Slick, fast, and preserves both the covariance and autocorrelation structure of the original data.Ryan Ohttp://www.climateaudit.orgnoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-68633149136609190892011-02-19T05:05:00.696+11:002011-02-19T05:05:00.696+11:00Kevin C, I'd probably use the AVHRR to constru...Kevin C, I'd probably use the AVHRR to construct say a 2-d spatial Fourier expansion, on a month-by-month basis. Then, compute the Fourier transform from 1982-2006 on a coefficient by coefficient basis.<br /><br />Once you have that, throw away the phase info (keep the same Fourier amplitude) and replace the phase with uniform white noise (bounded by +/- pi). Inverse Fourier transform each coefficient (this will give rise to hopefully realistic time-varying coefficients of the 2-d spatial field), which can then be used to construct a synthetic temperature field.<br /><br />It sounds complicated, but it could probably be done with a reasonably small number of lines of matlab code.Carrickhttps://www.blogger.com/profile/03476050886656768837noreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-54856302910444976002011-02-19T04:23:33.967+11:002011-02-19T04:23:33.967+11:00Nick, sent an email to what was (I think) your ema...Nick, sent an email to what was (I think) your email address. If you didn't get it, let me know and I'll re-try.Ryan Ohttp://www.climateaudit.orgnoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-40159336449638905342011-02-19T02:13:03.071+11:002011-02-19T02:13:03.071+11:00Nick,
You are correct that I am hinting something...Nick,<br /><br />You are correct that I am hinting something else is involved. If you want more than hints, I'm glad to help . . . but if you prefer to arrive at your own answer, I will be quiet.<br /><br />I will only provide one additional hint at this point: kgnd and the number of AVHRR PCs to retain are two different things. For our reconstructions, we retain 135 PCs. The "7" number that has been tossed around all over the blogosphere is not the number of AVHRR PCs, but rather the "n.eof" setting in emFn().Ryan Ohttp://www.climateaudit.orgnoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-33715197231755060012011-02-19T01:26:12.525+11:002011-02-19T01:26:12.525+11:00Oh, maybe I get it. Here's a thought experiment...Oh, maybe I get it. Here's a thought experiment:<br /><br />Suppose we have a planet warming at a uniform rate, with a tilted axis giving rise to seasons. We have a load of weather stations. There is lots of noise due to weather.<br /><br />If we calculate PCs, then hopefully we will see the first two PCs containing some combination of linear increase and annual cycle, and the rest will account for the noise.<br /><br />If we add noise to any station (or to multiple stations), then the first two PCs will be unaffected, but all the noise PCs will get completely shuffled.<br /><br />Is that how it works?<br />Kevin CAnonymousnoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-71211099947440252992011-02-19T01:19:14.142+11:002011-02-19T01:19:14.142+11:00Carrick:
I'm interested in your suggestion of...Carrick:<br /><br />I'm interested in your suggestion of adding noise to a dataset. I might have a use for that. Can you suggest a source (preferably general and not too advanced) where I could read about this approach?<br /><br />Thanks,<br />Kevin CAnonymousnoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-65889479778018126552011-02-19T00:51:46.268+11:002011-02-19T00:51:46.268+11:00I don't understand the algorithms well enough ...I don't understand the algorithms well enough to know if this is relevant - you both almost certainly know this already, but I'll say it just in case...<br /><br />The other main way of objectively establishing how many PCs to retain is cross validation - you leave out some of your data when fitting the PCs to the data, and then see how well the resulting reconstruction matches the omitted data. Vary the number of PCs, and pick the number of PCs which gives the best prediction of the omitted data.<br /><br />You could leave out some stations completely, or you could leave out part of a station history.<br /><br />It may well be that the result is too noisy to make a reliable determination. Then you need to do it multiple times omitting different stations to build up a more accurate estimate of predictive power vs PCs.<br /><br />The limitation of this method is when the data are correlated. e.g. if there are two stations very close together and so highly correlated, leaving out one is uninformative. In this case you may need to sort stations into correlated batches and leave out a batch at a time.<br /><br />That's what we do in our field at least. Whether it applies to yours I leave to you to judge.<br /><br />Kevin C<br />Kevin CAnonymousnoreply@blogger.comtag:blogger.com,1999:blog-7729093380675162051.post-22543708714239243132011-02-18T12:38:57.270+11:002011-02-18T12:38:57.270+11:00Nick,
"My next plan (SteveF) is to do the sen...Nick,<br />"My next plan (SteveF) is to do the sensitivity tests you have been describing."<br />That will be very interesting to see; I look forward to it.<br /><br />SteveFAnonymousnoreply@blogger.com