A second crank finds Ioannidis

This time it’s Steve McIntyre representing for the anti-global warming cranks, following the HIV/AIDS denialist lead and using John Ioannidis’ study to suggest science is bunk. Never mind that this research is primarily focused on medical studies. Never mind that the study wouldn’t even exist if replication in science didn’t identify false positives in the first place. Cranks like to latch onto anything they think is embarrassing to science, out of the mistaken belief that it makes their nonsense more believable.

It’s funny, I was sure they would have picked up on this stuff years ago, but the critical event was clearly the publication of his findings in the Wall Street Journal (apparently the go-to paper for cranks). However, despite not appearing on the editorial page, it’s still a pretty poor analysis, I’m sorry to say. The point of this research isn’t to say that medical research is bunk or sloppy; the point is to understand that an over-reliance on statistical significance and an emphasis on positive results will inevitably let false positives enter the literature, through no fault of the authors or the editors.
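To see why this happens even with honest authors, here’s a minimal simulation sketch. The specific numbers (a 5% significance threshold, 80% power, and an assumed 10% of tested hypotheses being true) are my own illustrative assumptions, not figures from Ioannidis’ paper:

```python
import random

random.seed(42)

ALPHA = 0.05        # significance threshold (p < 0.05)
POWER = 0.80        # chance a real effect reaches significance
PRIOR_TRUE = 0.10   # assumed share of tested hypotheses that are true

true_pos = false_pos = 0
for _ in range(100_000):
    hypothesis_is_true = random.random() < PRIOR_TRUE
    if hypothesis_is_true:
        if random.random() < POWER:   # real effect detected
            true_pos += 1
    else:
        if random.random() < ALPHA:   # pure noise crosses the threshold
            false_pos += 1

ppv = true_pos / (true_pos + false_pos)
print(f"Share of 'significant' results that are real: {ppv:.2f}")
```

With these assumptions, only about two-thirds of the “significant” findings reflect a real effect; the rest are noise that cleared the bar, with nobody doing anything wrong.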

For an excellent analysis that is easy to understand, check out Alex Tabarrok’s discussion of the research. His analysis also benefits from actually discussing what Ioannidis’ research means for medical science, and how it can reform to diminish these effects.

What can be done about these problems? (Some cribbed straight from Ioannidis and some my own suggestions.)

1) In evaluating any study, try to take into account the amount of background noise. That is, remember that the more hypotheses that are tested, and the less selection that goes into choosing hypotheses, the more likely it is that you are looking at noise.

2) Bigger samples are better. (But note that even big samples won’t solve the problems of observational studies, which are a whole other issue.)

3) Small effects are to be distrusted.

4) Multiple sources and types of evidence are desirable.

5) Evaluate literatures not individual papers.

6) Trust empirical papers which test other people’s theories more than empirical papers which test the author’s theory.

7) As an editor or referee, don’t reject papers that fail to reject the null.

Steven Novella also addresses this research and Tabarrok’s analysis and emphasizes the importance of prior probability in determining whether a study is reliable (Tabarrok’s #1). Simply put, hypotheses shouldn’t be tested merely because they can be thought up at random. There should first be some biological plausibility for the effect. This becomes hysterical when Novella expands his analysis to address what the research says about complementary and alternative medicine studies.

But there are other factors at work as well. Tabarrok points out that the more we can rule out false hypotheses by considering prior probability, the more we can limit false-positive studies. In medicine, this is difficult. The human machine is complex, and it is very difficult to determine on theoretical grounds alone what the net clinical effect of any intervention is likely to be. This leads to the need to test a very high percentage of false hypotheses.

What struck me about Tabarrok’s analysis (which he did not point out directly himself) is that removing the consideration of prior probability will make the problem of false-positive studies much worse. This is exactly what so-called complementary and alternative medicine (CAM) tries to do. Often the prior probability of CAM modalities – like homeopathy or therapeutic touch – is essentially zero.
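The effect of prior probability can be made concrete with the positive predictive value (PPV) of a claimed finding – the fraction of “significant” results that reflect a real effect. This is a rough sketch using my own arithmetic; the standard values of α = 0.05 and power = 0.80 are assumptions for illustration:

```python
def ppv(prior, alpha=0.05, power=0.80):
    """Positive predictive value: fraction of 'significant'
    results that are real, given the prior probability that
    the hypothesis being tested is true."""
    return (power * prior) / (power * prior + alpha * (1 - prior))

# A plausible mainstream hypothesis vs. CAM-style priors
print(f"prior 50%:  PPV = {ppv(0.50):.2f}")    # most positives are real
print(f"prior 1%:   PPV = {ppv(0.01):.2f}")    # most positives are false
print(f"prior 0.1%: PPV = {ppv(0.001):.3f}")   # almost all positives are false
```

As the prior probability approaches zero, the PPV does too: at a 0.1% prior, fewer than 2% of “positive” studies reflect anything real, which is the situation CAM research puts itself in.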

If we extend Tabarrok’s analysis to CAM, it becomes obvious that he is describing exactly what we see in the CAM literature – namely a lot of noise with many false-positive results.

Tabarrok also pointed out that the more different researchers there are studying a particular question, the more likely it is that someone will find positive results – which can then be cherry-picked by supporters. This too is an excellent description of the CAM world.
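The arithmetic behind this is simple. Assuming independent studies of a truly null effect, each with a 5% chance of a spurious “significant” result, the chance that at least one lab gets a cherry-pickable positive grows quickly with the number of labs:

```python
# Chance that at least one of k independent labs testing a
# truly null effect gets a "significant" (p < 0.05) result.
ALPHA = 0.05
for k in (1, 5, 20):
    p_any = 1 - (1 - ALPHA) ** k
    print(f"{k:2d} labs: {p_any:.0%}")
```

With twenty labs chasing the same null effect, a “positive” study is more likely than not to exist somewhere, waiting to be cited.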

The implication of Ioannidis’ research, therefore, is not to undermine or abandon scientific medicine, but rather to demonstrate the importance of re-introducing prior probability into our evaluation of the medical literature and into deciding what to research. As much as I am in favor of the Evidence-Based Medicine (EBM) movement, it does not consider prior probability. I have said before that this is a grave mistake, and the work of Ioannidis provides statistical support for this. One of the best ways to minimize false positives is to carefully consider the plausibility of the intervention being studied. CAM proponents are deathly afraid of such consideration, for they live in the world of infinitesimal probability.

Considering scientific plausibility would also kill, in a single stroke, the National Center for Complementary and Alternative Medicine (NCCAM) – which is ostensibly dedicated to researching medical treatments that have little or no scientific plausibility.

The irony is that cranks have started to cite this article in some snide attempt to disparage science as a whole, but in reality this research is mostly about why you shouldn’t trust crank methods. Instead of cherry-picking results, the emphasis should be on whole literatures. Instead of testing modalities with no biological plausibility and publishing the inevitable 5% of studies showing a false-positive effect, one should have good theory going in for why an effect should occur. Finally, Ioannidis’ study would not exist if it weren’t for the fact that the literature is ultimately self-correcting, and the false-positive effects that inevitably make it in are ultimately identified.