Data and expectations

Sep 10 2012 · Published under Uncategorized

Today I’m going to start a series of experiments for a paper that is almost done. It’s a collaborative paper and there’s a whole bunch of data there already. I’m going to do these experiments hoping that they show something consistent with the rest of the data, because otherwise they won’t go into the paper, which has already been mostly written. To make it even worse: getting the desired results will make me shared first author. Getting different results will make me author somewhere in the middle.

This makes me totally understand why it is sometimes too tempting for someone to make up data.

This is not the first time I've had a final experiment that, in order to be included in a paper, should show certain results: at the end of my PhD we had a story that only 'needed' one extra experiment to show that two things were actually causally connected rather than merely correlated.

I also don’t think we’re the only lab that has this happen. Often we build stories about our findings and then the last experiment for a paper should nicely fit this story. Is it unethical not to include that experiment if it doesn’t fit the story? And other than doing our experiments blind, what can we do to prevent this bias?

Alright, I’m off to do experiments, fingers crossed please!

10 responses so far

  • Namnezia says:

    I think that sometimes the unexpected result is not necessarily because your original theory is wrong, per se, but rather that it is still incomplete. So saving the result for further fleshing out in another study is not necessarily unethical or concealing data. As long as you recognize open gaps and alternative explanations in your original paper, then it's fine. Not every paper has to have "the whole story".

  • Dave Bridges says:

    This is the essential tension we sometimes find ourselves in: overthinking the potential repercussions of not-yet-definitive results. I think all of us who practice science have a 'best outcome' in mind at least some of the time.

    The concern with a collaborative project like the one you describe is how to fit in your stuff. If you have a hypothesis-contradicting result, is it ever OK to leave it out? If it's not your decision, is it OK to still be an author when you know a key piece of information is missing? Can 'hoping' for a result affect the outcome in ways that are not quite fraudulent but not quite unbiased? We have become accustomed to thinking of fraud as image manipulation and making stuff up completely, but these sorts of issues can be just as damaging.

    I think in this case the buck has to stop with the data producer, not the PI. The scientist knows better than anyone whether there are limitations of the contrary result that justify ignoring it. The PI could think, well, I trust the original data more and I am skeptical of the new data (or the new scientist), and that might turn out to be correct. I think that's why journals ask that authors agree to all the data, not just the panels they produced. If someone does something wrong and you don't know, then it's not really your fault. If you know something is wrong, then it's totally on you. Well, not you Arlenna, I mean YOU (the reader).

    • babyattachmode says:

      In this case there are actually three possible scenarios: 1) the data fit the hypothesis and get included, 2) the data don't fit the hypothesis but do fit previously published data, 3) the data fit neither. In case #3 the data will definitely not be included and I have to figure out why I can't reproduce previous stuff from our lab. In case #2 we may include the data, but it will decrease the impact of our story.
      And then there's the possibility that I will do these experiments, but the data are so messy that we can't say anything about them. Then we will also not include them. I will then still be a co-author somewhere in the middle because of all the logistics and oversight stuff that I have done for this project.
      In case #3 I have to think about whether I think I should be on the paper or not.

  • NatC says:

    That is NERVEWRACKING! The only thing that is worse is when that one last experiment is suggested (appropriately) by a reviewer.
    Good luck - and if it does not work out the way you expect, I hope it opens a new and exciting possible interpretation of the rest of the story!

  • drugmonkey says:

    And other than doing our experiments blind, what can we do to prevent this bias?

    Publish in venues in which one discordant result does not prevent acceptance.

    But yes, this is a huge trap for all of us because we are at the mercy of confirmation bias. If the experiment "works" the first time....SCHWEET, submit that sucker! If it doesn't, we re-do, tweak and optimize the experiment to determine what "went wrong". Not good. But I don't have any ready solutions for you....

    • babyattachmode says:

      Yes I guess it's the same for clinical trials for example, where it's sometimes obvious that negative data are not published. I guess it's been discussed multiple times how to find a good solution for that...

  • miko says:

    I am a former master at rationalizing: 1) This result contradicts my model/expectation; 2) I trust the experiment; 3) My model is beautiful and correct; 4) Therefore, some crazy ass shit that makes 1-3 not contradictory.

    It takes concerted cognitive effort to get past this, but it has to become a habit to reject hypotheses, even if you are invested in them. The weirdest part is, when you talk to scientists they all agree it's messy, there are always results that are hard to explain, etc. But when those same people are reviewers, results that don't fit a neat and simplistic narrative can torpedo a whole paper.

    This gets much, much worse when it's your PI's model, which has a life of its own that you don't control in grant proposals and prior pubs, etc.

  • qaz says:

    Science is a process.

    I think it's fine to publish what you know. All papers are a snapshot in time. If your experiment were to disprove the other results (to perhaps show that they do not replicate), or to imply that your current hypothesis is wrong, then it definitely would be a problem not to change the paper. But if (what usually happens) your data suggest a new direction to go looking, then that's a good thing.

    Obviously, never make up data. You never want to write something that is not true. But that's not what you're worried about. This mantra does, however, imply things about how to write papers: this is why carefully reporting ALL methods is important, and why it is very important to distinguish between results and interpretation. A result should be a fact: we did X and Y happened. An interpretation is a best guess: we think that X leading to Y means that Z is how this works.

    But do not panic if your story is incomplete. Your paper is only one brick in the wall. Make sure all of your statements are true (We did X. We saw Y. We think Z.), but recognize that it's only one brick in that wall.

    It all comes back to the famous quote (from Asimov): "The most important words in science are not 'Eureka', but rather 'Hey, that's funny...'"

  • qaz says:

    By the way, there's a great discussion of this on a recent post at masimo's "Exponential Book" blog, where a major result was found eight years later to be based on the incorrect interpretation. (Note the difference between result - these results are replicable with this method - and interpretation - these results imply this phenomenon happens).