Wednesday, March 30, 2011

What's it like to run out of data?

In my previous post, I described a controversy which had been created about Keith Briffa's apparent decision to start a plot of paleo reconstructed NH temperature at 1550, when some earlier data was available. Plotting the data earlier than 1550 showed increasing variations, very likely due to a rapid diminution in the supporting data. I showed plots of the number of sites available, and indeed they did reduce rapidly between 1550 and 1400, when there were only 8 left.

There are at least two reasons why the reduction in data might lead to spurious oscillations. One is simple averaging - fewer sites means greater variance in the mean. The other is geographical spread. I might be able to say something later about spread, but for this post I'm looking at the averaging effect.
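
The averaging effect can be checked with a few lines of R. This is only a sketch, assuming independent unit-variance white noise at each site (the simulation in this post uses red noise instead); the function name sd_of_mean and the series length are my own choices for illustration.

```r
set.seed(42)
n_years <- 2000   # long series so the sample sd is stable (illustrative only)

# sd of the across-site mean of n_sites independent unit-variance noise series
sd_of_mean <- function(n_sites) {
  noise <- matrix(rnorm(n_years * n_sites), nrow = n_years)
  sd(rowMeans(noise))
}

sd_full  <- sd_of_mean(383)   # all sites reporting
sd_eight <- sd_of_mean(8)     # only 8 sites left, as at 1400

# For independent series the sd of the mean scales as 1/sqrt(N)
ratio <- sd_eight / sd_full
c(simulated = ratio, theory = sqrt(383 / 8))
```

If the sites were correlated, the gain from averaging would be smaller than 1/sqrt(N), so this is best read as an upper bound on how much the full network helps.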

I simulated with some artificial data on the actual site histories (from Briffa's 1998 paper). I took a 40-year sinusoid and added AR1 noise (with a 5-year time constant). And I plotted the results below the jump.
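
As a side check on the noise model: convolving white noise with the kernel 3*exp(-0.2*i), as the appendix code does, should be the same thing as running an AR(1) recursion with coefficient exp(-0.2) and rescaling. The sketch below compares the two, assuming a plain (non-circular) convolution; the variable names are mine.

```r
set.seed(1)
n   <- 600
p   <- 550
phi <- exp(-0.2)                 # AR(1) coefficient for a 5-year time constant
x   <- rnorm(n)

# Convolution with the exponential kernel (first p-1 values are NA)
conv <- as.numeric(stats::filter(x, 3 * exp(-0.2 * (1:p)),
                                 method = "convolution", sides = 1))

# AR(1) recursion: rec[t] = x[t] + phi * rec[t-1]
rec <- as.numeric(stats::filter(x, phi, method = "recursive"))

ok <- p:n                        # years where the convolution is defined
same <- all.equal(conv[ok], 3 * phi * rec[ok])
same

# Stationary sd of the per-site noise, relative to the unit-amplitude sinusoid
noise_sd <- 3 * phi / sqrt(1 - phi^2)
```

On these numbers the per-site noise sd is roughly 4.3 times the sinusoid's amplitude, which is why the signal only emerges once many sites are averaged.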

Update - I should add that another effect, maybe as important, is the loss of replication within sites. This analysis only covers reduction in sites.

The sinusoid had unit amplitude. For each site I convolved unit normal variates (the default rnorm) with the function 3*exp(-0.2*i). In choosing these numbers I was trying to best illustrate the effect. I did this 383 times, assigning to each sequence the starting point of one of the Briffa sites (in sequence). The first run produced this plot:

The site starting years are shown in red, and the true sinusoid in green. The common signal, the sinusoid, comes through fairly well after 1600, but gets ragged going back from there. But the raggedness is random - doing it again gives

and again

So that is what is bad about drawing plots when the data runs low. You produce spurious deviations which are not reproducible in artificial data, and are misleading. That's why Briffa didn't do it, as most scientists wouldn't.

Update. RuhRoh in comments has suggested that I should have shown smoothed noise comparable to that in the Briffa/Steve plots. I think that is a good idea. I originally used AR1 noise as a well-recognised kind of red noise, but I'll now use the same filter that Steve did. The formula is
 e1=truncated.gauss.weights(50)*20
The 20 is the factor that regulates the size of the noise relative to the sinusoid. Here are the corresponding 3 plots (random repeats):

Appendix.
Here is my R code. yy is the vector of starting years (minus 1400) (see the previous post) and u1 is the ordered version of the years (I could have used u1-1400 instead of yy).
# Add AR1 noise to a sinusoid
# yy = site starting years minus 1400, u1 = the starting years, sorted (see prev post)
i1=1:550; h=pi/40; s1=sin(i1*h);   # note: sin(i1*pi/40) has a period of 80 years
e1=3*exp(-i1*0.2);                 # exponential kernel, 5-year time constant
for (k in 1:3){   # Do it 3 times
  M=z=i1*0;   N=383;   # M = site count per year, z = running mean of the noise
  for(i in 1:N){   # Looping over sites
    z1=rnorm(600);
    z2=filter(z1,e1,method="conv",sides=1,circular=TRUE)
    # z2 is the AR1 noise
    j1=max(yy[i],1):550;   # years in which site i has data
    K=M[j1]=M[j1]+1;
    z[j1]=z[j1]+(z2[j1]-z[j1])/K;  # Averaging
  }
  jpeg(paste(c("sinew",k,".jpg"),collapse=""));
  plot(1400+i1,z+s1,type="l");    # Output
  lines(u1,1:383/100-2,col="red");    # Site numbers
  lines(c(1400,1950),c(-2,-2),col="red");
  lines(1400+i1,s1,col="green");   #  Sinusoid
  dev.off();
}
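
The code above needs yy and u1 from the previous post. For anyone who wants to try the idea without those, here is a self-contained sketch. The start years are HYPOTHETICAL (8 sites from the start, the rest joining at random over 300 years), not Briffa's, and the noise is generated with the recursive form of the AR1 filter rather than the convolution, so the numbers will not match my plots.

```r
set.seed(1)
n_years <- 550                        # offsets 1..550, i.e. 1401..1950
n_sites <- 383
phi     <- exp(-0.2)                  # AR(1) coefficient, 5-year time constant

# HYPOTHETICAL start offsets (years after 1400). Not Briffa's site data.
start <- c(rep(1, 8), sample(1:300, n_sites - 8, replace = TRUE))

noise_sum <- numeric(n_years)         # sum of site noise in each year
count     <- numeric(n_years)         # number of sites reporting in each year
for (s in start) {
  # 100-year burn-in so each site's noise starts out stationary
  ar1 <- as.numeric(stats::filter(rnorm(n_years + 100), phi, method = "recursive"))
  x   <- 3 * phi * ar1[-(1:100)]
  idx <- s:n_years                    # years in which this site has data
  noise_sum[idx] <- noise_sum[idx] + x[idx]
  count[idx]     <- count[idx] + 1
}
avg_noise <- noise_sum / count        # error in the site average, year by year

rms <- function(v) sqrt(mean(v^2))
rms_early <- rms(avg_noise[1:100])    # few sites reporting
rms_late  <- rms(avg_noise[301:550])  # all 383 sites reporting
c(early = rms_early, late = rms_late)
```

Adding sin((1:n_years)*pi/40) to avg_noise and plotting against 1400+(1:n_years) gives a picture of the same kind as the ones in the post.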

37 comments:

  1. Here Nick, Maps for the following:
    Full Network
    http://www.skepticalscience.com/pics/FullNetwork.png

    Pre-1450
    http://www.skepticalscience.com/pics/Pre-1450.png

    Pre-1551
    http://www.skepticalscience.com/pics/1_Pre-1550.png

    Pre-1600
    http://www.skepticalscience.com/pics/Pre-1600.png

  2. Nick, I counted a factor of two difference in the number of tree-ring proxies between 1450 and 1550 (20 versus 40). That's a square-root of two difference in the noise floor, assuming uncorrelated proxies.

    > So that is what is bad about drawing plots when the data runs low. You produce spurious deviations which are not reproducible in artificial data, and are misleading. That's why Briffa didn't do it, as most scientists wouldn't.

    Most scientists would tell you they left the data out and why, too.

  3. Here's a curious case of data redaction.

    http://www.digitaljournal.com/article/305096#ixzz1HsjsqUUn

  4. Gee, why are Briffa's plots so smooth compared to your tests?



    Also, are you plotting data here or a model?

    RuhRoh, too lazy to sign up.

  5. RuhRoh,
    The data, as plotted (by Briffa and Steve), have been through a 50-year filter. The data on the file that Steve found are much jumpier even than here.

    I thought later that maybe I should have tried for comparable smoothness. This version emphasises the randomness.

    I guess you'd call it synthetic data. I wanted to emphasise the gradual loss of quality as data diminishes. You can't be absolute about where to stop plotting, but it's clear that if you don't stop somewhere, the results can be quite misleading in terms of the (here known) underlying process.

  6. Right, you thought about matching the smooth of the McI graphic that you were criticizing, but you preferred to emphasize a noisy intermediate result, plotted on similar axes.

    Why not match the 'jumpiness' and then do the wet blanket smooth like Briffa and McI?
    You went to the trouble to plot on the same time-axis and used the Briffa station count.
    Clearly you intend that these graphs be viewed in light of your prior post showing McI recent graphs.

    I think the answer is that you liked the graph you saw and decided that the visual message was more important than apples-to-apples.

    No surprise that as N goes to zero, noise increases. Were you surprised? Why bother with this qualitative exercise? Is it fulfilling to shield believers from doubt?

    RR

  7. Out of curiosity. If you normalized these tree ring series and called them temperature, what kind of output would TempLS create?

  8. Nick, your excuses are opportunistic and inconsistent. Briffa et al 2001 reported a chronology prior to 1550 with the same network. Explain that.

    Jones, Briffa et al 1998 produced a reconstruction in the 11th century with only 3 sites - 2 of which (Tornetrask, Polar Urals) were in the Briffa network. How is that justified if there are too few sites in 1550 to produce a reconstruction?

  9. CCE,
    I had been wondering that myself. I'll try. V2 could potentially give an eigenvalue analysis indicating spatial instability. Unfortunately it is trend oriented at the moment, in the spatial version, and only does one year at a time for the spatial temp distribution.

    The recent area-weighting might also help.

  10. #8 Steve,
    I wrote this post to try to show that, while it is necessary to end the plot at some point, there's no obvious rule to make about where. It depends on what the series is used for. For example, for identifying eruptions as in the Nature paper higher levels of variance can be accepted.

    In the case of the JGR paper that goes back to 1400, they show not just one version, but for comparison eight different versions. They show (Fig 4) clearly divergent behaviour going backward in time, starting at about 1700, and increasing markedly toward 1400. This divergence is discussed in the text.

    In the case of the Jones et al 1998 paper, they do not seem to claim to be compiling a representative NH temperature measure.

    "In this analysis 17 series have been brought together, but, even if large spheres of influence are assumed for each proxy, large areas, particularly in ‘tropical’ regions (between ±40° of latitude), are not represented."

    The reconstruction was not included in Briffa's Science article. The average they compile, NH10, is so named because it includes only 10 series in total. And they make caveats about the reduced resolution before 1500:

    "it must be remembered that for much of the time before 1500 the NH average is composed of only four series and the SH series only three."

    "Emphasis in the discussion of Table 6 has been placed on years after 1500 when more of the proxies are available. It must again be stated that the NH average is based on only four series before this time."

    "Of greater importance to the use of an array of palaeoclimatic reconstructions is that each is probably limited in its ability to reproduce past temperature variations faithfully on the longest of timescales."

  11. Nick, I tend to agree with Steve's comment that your excuse making is somewhat opportunistic.

    More to the point I think you are avoiding any of the ethical issues by focusing on whether one could make a case rather than the ethical issue (associated with responsible conduct in research) of the extent to which one should make a case when the post hoc redaction of data has occurred.

    The way you actually build a case, by the way, is by generating uncertainty bounds and showing (perhaps) how the series goes to rot before 1550. It's my personal speculation it does not (and that the problems are in the proxies themselves rather than in the number of series) but I'm willing to say what I think to be the case and proven wrong.

    A criticism can be raised that Steve hasn't shown similar error bounds on the extensions to his curve, but I would say since the original authors left them out themselves, and I believe the purpose of his exercise was to recreate their figure including the redacted data, I think he can be excused for not doing so here. (A figure with a proper estimate of the statistical error bars would be interesting IMO.)

  12. Carrick,
    A problem with this ethics talk is that you and others insist on describing the events in the most prejudicial terms that you can, to the extent that what you say is quite untrue (is that ethical?). And then you say "can't you see that's unethical?".

    At CA, people can only seem to describe the simple act of terminating a time series as deletion of data. You've used the even more bizarre term, redaction. Now redaction, in older times, meant going through a document and blotting things out with heavy black ink. More recently, it may just mean leaving gaps. But the essential features are that you tell people that you've done it, and the information is rendered invisible.

    I can't imagine anywhere in science where data is redacted, so I can't even begin to talk about the ethics of it. Deleted, yes, and that could certainly be unethical (or not). But the elementary fact here is that data was neither deleted nor redacted. It's still there.

    The right way to talk about it is that they didn't present results pre-1550. So they didn't use pre-1550 data. That could be the basis of a reality-grounded ethics discussion.

    As to why the data goes bad, yes, I think numbers of series is only part of the story, as I indicated. Another is internal replication - you just can't get as many cores of good quality etc. But that's all part of being before (eg) 1550 - it's a reason for stopping.

  13. Nick, let's not get distracted with CA here, please, or with what other people "insist" on. How I could be responsible for that, or how that could be relevant to any ethical questions relating to the authors behavior, is beyond me. There are basic questions of scientific conduct, and they are (to me) pretty straightforward. You can disagree with my conclusions, but I think the underlying issues are pretty straightforward.

    Let me start by pointing out that redaction means something completely different than what you apparently think it means, and it is generally a neutral term, it simply means "reduced or edited for publication". That was unquestionably done here in the figure in question.

    Now, is the following true or false?

    A data analysis was performed, part of the data from the analysis were redacted from the figure in question, and no mention of this redaction was made in the accompanying text, let alone a credible explanation offered as to why this redaction of data from their analysis was performed.


    I think the correct answer is true. I believe this is a fair and accurate account of what occurred. They did in fact "hide" part of their data analysis, and they chose not to mention that they hid it.

    If it makes you feel warm and fuzzy you can describe "data from the analysis" as "results", but as the words get used in experimental science, they are interchangeable. The terms "experimental data" or "instrumental data" or even "raw data" are used to describe the specific types of data that you were discussing above that weren't deleted (in this instance; in others they were).

    I am using the word data in the technical sense of "information in numerical form that can be digitally transmitted or processed". If this information comes from human sensory data (perhaps converted to a Likert scale), I'd call it "subjective" data. Model output would be "model data" and so forth.

    I think you'll find this common usage in science.

    The ethical question is whether this represents responsible conduct on the part of the researchers.

    Your argument for whether a case could be made for redacting the data from the analysis doesn't address the question of whether the authors should have alerted the reader that the data were redacted, or address the question of whether they should have provided an explanation of why the redaction occurred.

    This is science ethics 101. For me the obvious answer is hells ya.

    That said, there is absolutely nothing wrong with redacting data from a figure, you just have to be specific about what you've done and why, especially if it is central to the theme of the argument you are putting forward at that point in your manuscript.

  14. Carrick,
    I think my view of the matter is the common one - data are what you are given to work with - results are what you produce. And I think stretching the term to describe a calculated average is just intended to be prejudicial. As if they'd been rubbing out experimental data.

    But even that stretched usage isn't correct. As PaulM noted at CA, their plotting program, as is common, had a parameter which just set the start of the region plotted. They set it to 1550. How can that action sensibly be described as the redaction of data?

  15. ps - sorry about the spam filter there - I actually responded before I noticed that only I could see your comment.

  16. Nick, I'd suggest using google to explore the use of the word "data" then. It quite obviously is not limited to experimental data, or to inputs to a process.

    Nor is using the word particularly prejudicial (I'm wondering where you are coming up with this stuff now): "redaction of the results of their analysis" has the same English meaning.

    Redaction can happen in many ways. Setting a parameter in a program to not present data prior to a date is certainly one of them. I could do it by setting xmin on a graph to 1550 also.

    Bottom line:

    They did remove the part of the series that resulted from their analysis on the graph that they presented, they never stated that they removed that part of the series that resulted from their analysis on the graph that they presented, but from my viewpoint they are required to do so and to explain why, and I would have fired a student had he pulled the same stunt.

  17. > In the study of literature, redaction is a form of editing in which multiple source texts are combined (redacted) and subjected to minor alteration to make them into a single work. Often this is a method of collecting a series of writings on a similar theme and creating a definitive and coherent work.

    http://en.wikipedia.org/wiki/Redaction

    > Redaction generally refers to the editing or blacking out of text in a document, or to the result of such an effort. It is intended to allow the selective disclosure of information in a document while keeping other parts of the document secret. Typically the result is a document that is suitable for publication, or for dissemination to others than the intended audience of the original document. For example, when a document is subpoenaed in a court case, information not specifically relevant to the case at hand is often redacted.

    http://en.wikipedia.org/wiki/Sanitization_(classified_information)

  18. No Google hits for "redaction of the results of their analysis".

    Four hits for "redaction of the results"

    http://www.google.com/search?q=%22redaction%20of%20the%20results%20of%20their%20analysis%22

    If I have a student who says, on the basis of this Googling, that this expression is not prejudicial at all, I'd consider firing him.

  19. The above comment makes little sense without this other comment that got caught in the spam filter:

    > In the study of literature, redaction is a form of editing in which multiple source texts are combined (redacted) and subjected to minor alteration to make them into a single work. Often this is a method of collecting a series of writings on a similar theme and creating a definitive and coherent work.

    http://en.wikipedia.org/wiki/Redaction

    > Redaction generally refers to the editing or blacking out of text in a document, or to the result of such an effort. It is intended to allow the selective disclosure of information in a document while keeping other parts of the document secret. Typically the result is a document that is suitable for publication, or for dissemination to others than the intended audience of the original document. For example, when a document is subpoenaed in a court case, information not specifically relevant to the case at hand is often redacted.

    http://en.wikipedia.org/wiki/Sanitization_(classified_information)

  20. Nick:

    A couple of Feynman quotes come to mind
    (via http://en.wikiquote.org/wiki/Richard_Feynman )

    * The first principle is that you must not fool yourself, and you are the easiest person to fool.
    -- From lecture "What is and What Should be the Role of Scientific Culture in Modern Society", given at the Galileo Symposium in Italy (1964).
    Variant: Science is a way of trying not to fool yourself. The first principle is that you must not fool yourself, and you are the easiest person to fool.

    *There is one feature I notice that is generally missing in "cargo cult science." It's a kind of scientific integrity, a principle of scientific thought that corresponds to a kind of utter honesty — a kind of leaning over backwards. For example, if you're doing an experiment, you should report everything that you think might make it invalid — not only what you think is right about it; other causes that could possibly explain your results; and things you thought of that you've eliminated by some other experiment, and how they worked — to make sure the other fellow can tell they have been eliminated.
    Details that could throw doubt on your interpretation must be given, if you know them. You must do the best you can — if you know anything at all wrong, or possibly wrong — to explain it. If you make a theory, for example, and advertise it, or put it out, then you must also put down all the facts that disagree with it, as well as those that agree with it. There is also a more subtle problem. When you have put a lot of ideas together to make an elaborate theory, you want to make sure, when explaining what it fits, that those things it fits are not just the things that gave you the idea for the theory; but that the finished theory makes something else come out right, in addition.
    In summary, the idea is to try to give all of the information to help others to judge the value of your contribution; not just the information that leads to judgment in one particular direction or another.

    * "Cargo Cult Science", adapted from a commencement address given at Caltech (1974)

    In general, the problems Feynman identifies come up a LOT in Climate Science, in my view.

    Cordially,
    Peter D. Tillman
    Consulting Geologist, Arizona and New Mexico (USA)

  21. Sorry about the spam filter again, folks - I'll post on this, and am strongly considering giving up on Blogger.

  22. Peter,
    It's easy to pontificate on how everything should be spelt out. But you and Carrick don't acknowledge the practicalities of how to convey the essential science readably. Briffa had one page to work with here. It was a survey note in Science's perspectives section. It's just not possible to divert into a discussion of exactly why the endpoints of the various plotted segments were chosen.

    But there are limitations even in less compressed forums. As I keep wearily saying, every time series has a beginning and an end. Sometimes it is ended because the region of interest has a natural limit - sometimes by lack of data, as here. How often do you actually see that spelt out? Examples?

  23. Willard - I agree with you of course. Sorry your argument was messed about with the spam filter. Something will be done about that real soon.

  24. Willard, I found about 11,000,000 hits for redaction of data (that was my original phrase that Nick objected to; I'm not using the quoted version, because you would have to search for "redaction of * data" to match the types of data redacted, good luck doing that in Google), involving various types of data, and about 8,000,000 for redaction of results. If you insist on only perfectly matched expressions, "data redaction" for example had 66,000+ hits and has the same meaning as "redaction of data". As do "information redaction" and "redaction of information".

    The terms are mostly used in the context of "hiding" information you don't want the reader to see. Nick suggests it is appropriate to do so in this case, because showing the results would mislead the reader. I'm suggesting under certain circumstances it is appropriate to "redact" part of that figure, but that "certain circumstances" includes an explanation of why the data were redacted.

    If you don't like the word redacted (it's a perfectly good English word and exactly describes what happened with respect to that figure), maybe you could suggest an alternative choice of phrase, unless avoiding a discussion of the responsible conduct of research is your underlying goal.

    Nick, I've written review articles (including one appearing in Nature) and I know what's involved, so you can spare the lecture on how "difficult" it is to meet your obligations to the reader when presenting your own original research results within the review article, as Briffa did in that research article.

  25. Carrick,

    My G box gives me a bit more than 5k hits for "redaction of data". I'm not sure how you got your 11k, but the use of the quotes matters. The fourth one is this very blog post.

    Whether I like the word "redacted" or not is immaterial here. The meaning of "redaction of data" is quite clear. Most of the hits on the first pages relate to the removal of personal information or redundancies or what not. So the connotations are quite obvious for anyone with experience of redacted documents.

    Neither can you [REDACTED] your way out of this by shifting your burden on my shoulders. Perhaps since I'm just being skeptical, I don't need to propose anything constructive. Or perhaps if your criticism can't be worded with less loaded words, perhaps we should just drop the case. Et cetera.

  26. Willard, I'm not posting any burden on your shoulders because there's nothing wrong with the word redacted here. All I said was IF you thought that was an inappropriate word choice (I don't think it is, or that a BETTER word choice was available), you COULD suggest another word.

    Too difficult?

    Regarding word counts:
    65k hits for "data redaction." Certainly more than "4". Can we get a statement from Willard that if a student of his came and said the total count was 4, he'd fire him? ;-)

    Once again (and slightly tidied up):

    The term "redaction of data" is mostly used in the context of "hiding" information you don't want the reader to see. Nick suggests it is appropriate to do so in this case, because showing the results would mislead the reader. I'm suggesting under certain circumstances it is appropriate to "redact" part of that figure, but that "certain circumstances" includes an explanation of why the data were redacted.

    I'd suggest you find the word "loaded" because it suggests something that you're not comfortable admitting to and nothing more. Criminy...Nick was objecting to the word "data" too. Is that too loaded for you also?

    I'd suggest unless you can produce a little less loaded response of your own, that we let this thread wind to zero. I've no energy to waste on people who aren't interested in being intellectually honest.

  27. Carrick,

    I'm having trouble reconciling this:

    > I'm not posting any burden on your shoulders because there's nothing wrong with the word redacted here.

    forgetting for the moment that it begs the question, and this:

    > If you don't like the word redacted (it's a perfectly good English word and exactly describes what happened with respect to that figure), maybe you could suggest an alternative choice of phrase, unless avoiding a discussion of the responsible conduct of research is your underlying goal.

    which again begged the question, a point that is not our concern for now.

    ***

    Again, the reader only needs to look at the first pages of the hits for "data redaction" to see that the usage you're promoting here is not the most natural one. In fact, one can easily see that the explanation that you're offering is metaphorical at best. We're far from an engineer-level derivation of the meaning you're promoting.

    These two points should suffice to show that you are in no position to handwave to intellectual honesty.

  28. Carrick provides the ultimate Mosher move. Unless you concede that the science is corrupt, we really can't have any further dialog.

    Climate Audit provides the standard for intellectual honesty.

    Nick provides a reasonable analysis of the paleoclimate literature that has been elevated to another moment of high dudgeon (a pimple on the ass ) by the never ending auditors. Carrick raises the discussion to "ethics".

    The science must be corrupt!

  29. Willard, saying that

    > I'm not posting any burden on your shoulders because there's nothing wrong with the word redacted here.

    is stating an opinion, not begging a question (I'm not basing an argument over my choice of the word redacted, or my belief that it is a reasonable choice of language in this context).

    You and Nick are employing legalistic stalling tactics to avoid actually discussing the issue at hand (whether what Briffa did, under whatever "non-loaded" description, was acceptable academic practice), and you question my "intellectual honesty"???

    I'd laugh but there's nothing funny about this, it's just pathetic.

  30. Paul, you've completely misrepresented my views and position on this (no surprise here, if somebody criticizes a climate scientist they are obviously the "enemy" and must be attacked and ridiculed rather than reasoned with).

    Here are my views:

    a) I don't agree that what Briffa did in this case was ethical in removing/deleting/redacting/editing/changing a parameter to not show a portion of his data.

    b) I don't think there is anything wrong with not showing a portion of the data, but one must discuss why the data weren't shown, and make an effective case for this (e.g., too few points).

    c) I don't personally find the proxy data very exciting, because they don't pertain to the question of what happens when you release an enormous amount of CO2 into the atmosphere (the PETM would be a better example of that).

    d) Ethics is always an issue in science. How to do a thing "right" is something that comes up endlessly.

    e) Whether Briffa acted inappropriately has nothing to do with the behavior of climate scientists in general. We can conclude one way or the other on this, on the merits of the argument, without trying to smear that into a generalized statement about the practice of climate science.

    What we really have here is a "circle the wagon" community that refuses to ever admit error, and in doing so, IMO, damages their credibility far more than just admitting that occasional lapses in judgement do occur.

    Trying to through people over board because they don't sign on to the class climate science circle the wagon maneuver, isn't a particularly brilliant strategy to gain converts either. Just saying....

  31. Make that "trying to throw".... Wife is calling. Bed. Night.

  32. The plethora of memes Carrick just bloated deserves due diligence. But since Carrick insists, we can talk a bit about shifting the burden of proof. This won't take long.

    The question I was raising was about the choice of the word "redacted". If Carrick assumes that he's right about the adequacy of this word, for instance by shifting the burden of proof on my shoulders, he's begging the question.

    Here is where Carrick used this trick:

    > If you don't like the word redacted (it's a perfectly good English word and exactly describes what happened with respect to that figure), maybe you could suggest an alternative choice of phrase, unless avoiding a discussion of the responsible conduct of research is your underlying goal.

    To see how this trick works, let's take a simple example. Suppose Carrick says that God exists. Suppose that I disagree. Now, suppose Carrick also believes that "God does exist" represents his opinion, an opinion he finds reasonable. Does that mean that I have to prove that God does not exist? Not if I'm an agnostic.

    Let the reader decide who's using the "legalistic stalling tactic", here and elsewhere. And let the reader decide how to interpret this refusal "to admit an error" by throwing all kinds of insults around.

    Science is corrupt.

    PS: I concede that the way I formulate my comments is loaded. I can own it, and I will. What I write under this name is my honor.

  33. Willard, once again it's my opinion the word "redacted" is appropriate here. It's my opinion, so it is what it is. If I believed in God, what you believe as an agnostic has no impact on that belief. What's left to argue with on that? (Other than you came up with a completely awful analogy to make whatever point you were flailing around for, and which really you were unable to actually make.)

    Once again, Mr. Moving Goal Posts, when I say "redacting of data", I am referring to the actions taken by Briffa et al: They hid part of the results of their analysis ("their data") from the reader, and intended for the reader to not be aware that this data had been hidden. That's an example of a redaction of data in my book. Look up the word, it has plain meaning, your efforts at obfuscation notwithstanding.

    The question I'm interested in, isn't over the word choice, but over the ethics of the action, and here, truthfully, I think what is really getting your knickers in a knot isn't word choice, but the description of actions they willfully performed.

    The idea that they like slipped on a banana peel and accidentally cut out part of their results---or any lame excuse like that---is of course silly on the face of it. Pathetic even, as are defenses of their action as if it were inconsequential. Figures like this take quite a bit of work to produce, obviously careful thought went into the decision to truncate the series in 1550. Hand waving doesn't make that fact go away.

    As to the point of the challenge for you to come up with an alternative expression (which you are still batting zero at): It is simple logic that if "redaction of data" is a less than ideal language choice to you, then you should be able to produce a better term to describe it, were that actually available. That's not begging the question or shifting responsibility, it's simply requiring that you be responsible for the implications of your own arguments.

    You obviously haven't found a better word choice, and so have spent an enormous amount of your cognitive resources avoiding doing that... the total verbiage you've generated in rhetoric is indeed quite telling: Arguing over phrasing rather than delving into the meat of the subject is certainly legalistic maneuvering at its core. I doubt anybody who is objective on this matter would disagree with that conclusion.

    Science isn't corrupt, IMO, but people are human and make mistakes. The process corrects their errors, but only if the process is open and transparent.

    Paleoclimatology went through a period where certain (what I believe to be) unethical behaviors became not only tolerated but accepted. It's safe to say we've moved beyond that, so beating the dead horse can stop on that at any point. People come clean and admit that this happened, but as they say "the choice is yours."

    Cheers. Done. Out-a-here. This is as much energy as I'm willing to expend for so little substance offered on your part.

  34. Hi to the Willard perched on top of the written words that are his honor.

  35. Sure, paucity of data is an issue. But the Science 1999 reconstruction was a first attempt at a "low frequency" reconstruction using a large tree-ring network. All subsequent efforts focused on attempting to combine the same data in a way that provided adequate verification skill for the pre-1550 portion of the reconstruction. For various reasons the simple averaging of ABD site chronologies first employed by Briffa could not do this (and, yes, spatial overrepresentation by the Eastern Siberian region was problematic). Subsequent efforts employed "prior averaging of the age-banded density series into the nine subregions".

    A little more commentary here:
    http://deepclimate.org/2011/04/06/open-thread-9/#comment-8539

  36. Deep,
    Yes, I can believe there are other factors. My point here is that even an optimal distribution of sites would experience problems.

    I hope you do post on the issue. I read the ABD nine-regions account, and I couldn't see why grouping could make a large difference. Especially when the number of sites becomes less than the number of regions.

  37. Nick, I've not seen commentary from Briffa for why he cuts off the pre-1550 data, but another perfectly reasonable explanation would be prior to circa 1550, the trees in question weren't located in tree lines, and that as the NH cooled, the tree line shifted "downward" to place the trees in a temperature-limited growth pattern. It could even be that Briffa, having access to such data, would have known this.

    Leaving aside ethics issues (hopefully for say at least 1 year), it would have been nice if, as an expert, he would have addressed the divergence in his own series prior to 1550, even if he had not shown them on his graph. Given the general flavor of his commentary, I still find that a bit surprising.
