Comments on moyhu: GHCN processing algorithm
Feed author: Nick Stokes (http://www.blogger.com/profile/06377413236983002873)
Updated: 2024-03-28T13:56:47.604+11:00
10 comments, newest first


Nick Stokes (https://www.blogger.com/profile/06377413236983002873), 2010-04-01T08:13:06.608+11:00

Roman, thanks for your comment there. I saw the restriction of G(y) to annual as reducing the number of equations, but I can now see that if I don't restrict it, the larger set could be partitioned into 12 subsets, which would likely be easier overall. I'll try it.


RomanM (http://statpad.wordpress.com/), 2010-04-01T06:59:37.648+11:00

Nick, I have been thinking about your model. It is basically the one that I have used, except that you have chosen a single "global temperature" for each year rather than for each month as I have done. I am not sure I understand why.

It will certainly not reduce the complexity of the calculations. The number of values you are estimating at one time is enormous.

If there are S stations and Y years, then you have 12S + Y parameters. With 3000 stations and 100 years, this comes out as 36100 parameters, with matrix operations on a square matrix with that many rows and the same number of columns.

On the other hand, if the "global" value is monthly, the sum of squares splits apart into twelve separate SS's, each of which is minimized separately with only S + Y parameters for each month (3100 in the example) but repeated 12 times - a process considerably less complicated and much quicker than that above.

I haven't done the math yet, but I suspect that your yearly value would also be a weighted average of the twelve monthly values from my procedure.

With numbers of parameters of the order involved here, using the R lm procedure would not be efficient and would require a great deal of computer memory. However, it is possible to write reasonably simple scripts for this purpose (as you seem to have done here).

Good on ya for working on the problem with the rest of us sloggers!


carrot eater, 2010-04-01T06:22:49.085+11:00

I use the terms interchangeably.

Gridding accomplishes area weighting; area weighting is hard to do without drawing a grid of some sort, at some point.


Carrick (https://www.blogger.com/profile/03476050886656768837), 2010-04-01T05:20:42.476+11:00

Carrot Eater, I would say gridding is a form of area weighting, not the other way around.


carrot eater, 2010-04-01T00:45:11.206+11:00

I wouldn't say gridding is now minor; it's just being done all at once through the weights, instead of going box-to-box and then combining boxes. But it's still being done. If rerunning the calc isn't cumbersome, I guess having to repeat it for hemisphere/regional calcs isn't so bad.

Ah, I didn't see that you addressed my exact concern under 'solving'. Sorry.


Nick Stokes (https://www.blogger.com/profile/06377413236983002873), 2010-03-31T23:43:16.998+11:00

CE
Gridding is now a minor part of the process - it just determines weights. The weights are valid for subsets too.

I'll write a hemisphere restriction into the next version. There are many ways it could be done. It's no big deal to rerun the calc, because most time is spent in solving the design matrix system. More specifically, in computing the explicit Schur complement (which I am hoping to avoid in a speedup).

Yes, as I mentioned under solving, the matrix has a rank deficiency of 1, which I solve by setting G(1) to zero.


carrot eater, 2010-03-31T22:24:39.631+11:00

So the gridding is basically implicitly embedded within the calculation. Which is fine if all you want is some sort of 'global' record, but what if you want hemispheric records or regional records? Seems to me that you'd have to filter the source data and run the script again.

I put 'global' in scare quotes as there will be empty or undersampled grid boxes. It's important to remember that the G(y) has this limitation.

I haven't thought through the degrees of freedom carefully yet, but wouldn't there be multiple solutions here? Meaning, G(y) could be shifted up or down, and the offsets L would merely shift with them. Seems to me that you'd have to specify the offsets L at one station somewhere, as GISS and Tamino do. What am I missing?


Nick Stokes (https://www.blogger.com/profile/06377413236983002873), 2010-03-31T17:07:50.769+11:00

Thanks, Carrick and Steve

If anyone tried to download the code, I had left in an if(F){ ..} ##F which actually blocked out all the calcs - I was scrubbing up the graphics. It's OK now.

I've created a drop.io doc repository (http://drop.io/erfitad#), which is what Zeke uses. Currently there's only the numbers for the G output, but I'll put up a zipfile of code and my math description of the algorithm.


steven (https://www.blogger.com/profile/06920897530071011399), 2010-03-31T16:45:49.630+11:00

Seconded. Nice work Nick


Carrick (https://www.blogger.com/profile/03476050886656768837), 2010-03-31T16:27:01.194+11:00

Hi Nick,

Looks very interesting. Could you put together a tarball of your source files?

And maybe a version number (and name) for the project.

Great work!
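

For anyone following RomanM's parameter count and the exchange about G(y) being free to shift up or down, here is a minimal Python sketch of both points. It is not Nick's R code and not his Schur-complement solve. It assumes, from the thread, a model x[s][y] ≈ L[s] + G[y] for a single calendar month, leaves out the area weights, and swaps in simple alternating averages (backfitting) as the solver. The names `parameter_counts` and `fit_month` are made up for illustration.

```python
def parameter_counts(stations, years):
    """Unknowns in the joint fit (annual G) vs. one monthly fit (monthly G)."""
    joint = 12 * stations + years   # 12 monthly offsets per station, one G per year
    per_month = stations + years    # one offset per station, one G per year
    return joint, per_month


def fit_month(x, sweeps=200):
    """Fit x[s][y] ~ L[s] + G[y] for one calendar month by alternating averages.

    x is a list of per-station lists of readings; None marks a missing value.
    Assumes every station and every year has at least one reading, and that
    the stations are linked through overlapping years.
    Unweighted: the area weights discussed in the thread are left out.
    """
    n_s, n_y = len(x), len(x[0])
    L = [0.0] * n_s   # station offsets
    G = [0.0] * n_y   # shared value for each year
    for _ in range(sweeps):
        # Update each station offset from its residuals against the current G.
        for s in range(n_s):
            r = [x[s][y] - G[y] for y in range(n_y) if x[s][y] is not None]
            L[s] = sum(r) / len(r)
        # Update each yearly value from its residuals against the current L.
        for y in range(n_y):
            r = [x[s][y] - L[s] for s in range(n_s) if x[s][y] is not None]
            G[y] = sum(r) / len(r)
        # Adding c to every G and subtracting c from every L leaves L[s] + G[y]
        # unchanged (the rank deficiency of 1), so pin the first year to zero.
        shift = G[0]
        G = [g - shift for g in G]
        L = [v + shift for v in L]
    return L, G
```

`parameter_counts(3000, 100)` returns `(36100, 3100)`, the two figures RomanM quotes. Running `fit_month` once per calendar month gives the twelve separate fits he describes. Pinning the first year plays the role of setting G(1) to zero; fixing one station's offset instead would do the same job.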