Why most published research findings are false

Seed magazine profiles the recent work of John Ioannidis, author of the groundbreaking article “Why most published research findings are false”.

I’ve written about him before in several contexts, and about the importance of understanding this research. The counter-intuitive thing is how much his research redeems science as an enterprise, while also highlighting how denialists can abuse our literature.

I recommend that scientists take the time to read some of his work, and ideally watch this video (it’s a lot more approachable), which I uploaded to Google a few months ago. It is a bit long – it’s the grand rounds he delivered at NIH a few months ago – but well worth the time if you’re working in, or interested in the results of, biological research.

It’s really fascinating stuff, and as someone who is always harping on error and statistics in my lab, I find it a welcome wake-up call to biologists to understand the meaning of statistical significance, and the importance of skepticism toward new results until they’re broadly verified.

Simply put, what Ioannidis does is take the most cited articles from a certain time period, then look 20 years down the line to see which of these highly-cited, groundbreaking articles have held up. A simple breakdown of the findings in biological fields is that if you start with 100 groundbreaking papers that promise immediate clinical translation, about 27 of them will have results that hold up well and lead to clinical trials, about 5 of them will result in actual licensed treatments for people (a sign of successful translational research), and about 1 of them will result in a technology that revolutionizes medical treatment.

He also shows a bias towards initial big effects. Frequently the first paper showing a new finding reports a really big, highly significant effect. However, as other researchers study the problem there is a rapid correction, sometimes completely canceling out the initial observation, but usually settling on a much more moderate effect.
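
To get a feel for why first reports overshoot, here is a minimal simulation of my own (an illustration of the so-called winner’s curse, not a reproduction of Ioannidis’s analysis): when the true effect is modest and the studies are small, only the studies that happen to overestimate the effect clear the significance threshold, so the first significant report is almost guaranteed to exaggerate.

```python
# Illustrative simulation (my own, not from Ioannidis's papers): why the
# first "significant" report of a modest effect tends to exaggerate it.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_effect = 0.2        # modest true effect, in standard-deviation units
n_per_group = 25         # small samples -> low statistical power
n_studies = 10_000

significant_estimates = []
for _ in range(n_studies):
    treated = rng.normal(true_effect, 1.0, n_per_group)
    control = rng.normal(0.0, 1.0, n_per_group)
    t, p = stats.ttest_ind(treated, control)
    if p < 0.05 and t > 0:  # only positive, "significant" results get reported
        significant_estimates.append(treated.mean() - control.mean())

print(f"true effect: {true_effect}")
print(f"average estimate among significant studies: "
      f"{np.mean(significant_estimates):.2f}")  # roughly 3x the true effect
```

Larger follow-up studies don’t need a lucky overestimate to reach significance, which is why their estimates settle back toward the true, more moderate value.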

Ultimately science is redeemed, since the data showing that many initial findings are exaggerated or overblown comes from subsequent attempts at replication by other scientists. The system works. But it emphasizes many things that are important to us here at denialism blog. It shows the importance of knowing the difference between the statistical significance and the statistical power of an experiment. It’s also important not to immediately accept everything that comes out of even prominent journals, as one of the critical elements of the scientific enterprise is replication, replication, replication. Skepticism of even high-profile research is critical until results hold up under replication. Ideally, the response to research like this would be to put more weight on trial design in judging whether a result is worth publishing – the big journals are instead always biased towards big splashy results – and to show more willingness to publish negative data from well-designed trials.
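
The significance-versus-power point is, in fact, the core of the original PLoS Medicine essay: the probability that a “significant” finding is actually true – its positive predictive value – depends not just on the p-value threshold, but on the study’s power and on the pre-study odds that the hypothesis was right in the first place. Here is a quick sketch of that calculation (the formula follows the essay; the example numbers are my own):

```python
# Positive predictive value of a statistically significant finding,
# following the framework of Ioannidis's 2005 PLoS Medicine essay:
# PPV = (1 - beta) * R / ((1 - beta) * R + alpha), with R the pre-study odds.
def ppv(prior_odds, power, alpha=0.05):
    """Probability that a significant result reflects a true relationship."""
    return (power * prior_odds) / (power * prior_odds + alpha)

# A well-powered trial of a plausible hypothesis:
print(ppv(prior_odds=1.0, power=0.8))   # ~0.94 -- most such findings are true

# An underpowered, exploratory study where only 1 in 10 hypotheses is true:
print(ppv(prior_odds=0.1, power=0.2))   # ~0.29 -- most such findings are false
```

Factor in bias and multiple teams racing on the same hot question – the essay’s other corollaries – and the predictive value falls further, which is where the provocative title comes from.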

Finally, and most importantly, it shows that the literature is full of missteps, and those who would misuse science can always find a study of weak statistical power showing the effect they’re interested in promoting. It is important in science never to cherry-pick the study you want to be true, but to consider the totality of the scientific literature before drawing conclusions.
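
As a concrete illustration of how easy that cherry-picking is (again, a toy example of my own rather than anything from the paper): even when an effect is entirely absent, about one small study in twenty will come out “significant” at the usual threshold, so an advocate who ignores the rest of the literature can nearly always find a supportive citation.

```python
# Illustrative simulation (my own): with no real effect at all, roughly
# alpha * 100% of small studies still come out "statistically significant".
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_studies, n_per_group, alpha = 200, 20, 0.05

false_positives = 0
for _ in range(n_studies):
    group_a = rng.normal(0.0, 1.0, n_per_group)  # both groups drawn from the
    group_b = rng.normal(0.0, 1.0, n_per_group)  # same distribution: no effect
    _, p = stats.ttest_ind(group_a, group_b)
    false_positives += p < alpha

print(f"{false_positives} of {n_studies} null studies came out 'significant'")  # expect ~10
```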

9 thoughts on “Why most published research findings are false”

  1. I have a real problem with this claim. What he means is that most published research findings in one area of medicine are false. I seriously doubt the applicability of his findings to most fields of science. I’ve found as many published errors in my field as anyone, and I can count the real errors I’ve identified on the fingers of one hand.

    So if Ioannidis’s claim is itself false, it seems we have a modern variant of the ‘All Cretans are Liars’ paradox.

    (Hey, Ioannidis is a Greek name. It isn’t Cretan, is it?)

  2. Yes, I guess it’s not clear from that title, which was clearly crafted for maximum impact, that he’s speaking about biomed. There are probably fields that are worse, and many that are better, but unlike the physical sciences we sadly lack perfect mathematical underpinnings for our systems.

  3. Choosing highly cited articles biases his sample towards mistakes. Many of the articles attracted extra attention because they were surprising, and they were surprising because they were wrong. So we’d expect the error rate among all articles to be lower.

    The real message here seems to be that one should never assume a paper must be right just because a couple of reviewers didn’t see anything obviously wrong with it. In experimental areas we can only be really confident in hindsight.

  4. Unfortunately, many investigators DO assume that a paper must be right because it was published in a high-profile journal. Today, the problem is compounded because splashy articles are immediately featured as one of the “ten most downloaded” (or whatever) on journal websites, leading to a positive feedback loop of even more buzz and exposure. The perceived sexiness of the results then leads some investigators to propose experiments, sometimes in multiple grants, along the same line of inquiry. When a grad student or tech in one of these labs fails to get “good” results–i.e. ones consistent with those in the sexy paper–it’s the tech’s fault since it’s inconceivable that anything from the lab of Mr. “I publish in Cell five times a year” could possibly be wrong. Then after a few years it’s finally decided that actually REPLICATING the results in the original paper might have been a good idea. Lo and behold, they don’t replicate well in anyone’s hands. And at conferences, there are whispers about multiple labs having had this problem already. Of course, no high-profile articles come out in contradiction of the original article. The original paper is just ignored and replaced by the newest startling claims from the same authors.
    This (surely hypothetical) scenario illustrates that low standards in science publishing can lead to a tremendous waste of resources. Sure, scientists will eventually do the experiments to prove the original claims false or exaggerated, but it would have been better to do good science to start with. Yet, how to impose consequences for poor-quality (if high-profile) papers without stifling creativity and progress?

  6. I’d love to see how the error rates break down among journals. As the paper says, the hottest fields are the ones most likely to get wrong results published. Of course, the hottest fields are also the ones which publish the most in the high-impact journals, so does that suggest that a Cell or Nature paper may be more likely to be wrong than one in JBC or PLoS?

  7. Mr. Gunn,
    Yes, the higher the impact, the higher the retraction rate.

    see writedit for one cite. My fav is: Nath et al. Med J Aust. 2006 Aug 7;185(3):152-4.

    the flacks associated with higher impact pubs always seem to argue that the higher level of scrutiny is at fault (hyp: a higher detection rate for the same fake/error rate).

    my hyp: The incredibly high payoff for Science/Nature/Cell publication has a greater tendency to compromise ethics.

  8. I actually commented on Ioannidis’ work nearly two years ago. I’m not sure much has changed since then.
