I must say I’ve loved much of the writing at the new blog Science-Based Medicine. These guys are fighting the good fight and presenting very sophisticated aspects of evaluating the medical literature in a very accessible way. In particular I’d like to point out David Gorski’s critique of NCCAM and the directly relevant articles from Kimball Atwood on the importance of prior probability in evaluating medical research. I mention these as a pair because lately I’ve become highly attuned to this issue thanks to the research of John Ioannidis, which is critical for understanding which evidence in the literature is high-quality and likely to be true. Atwood rightly points out that pre-study odds, or prior probability, are critical for understanding how the literature gets contaminated with nonsense. Stated simply, the emphasis on statistical significance in evidence-based medicine is unfortunate because statistical significance alone is an inadequate measure of the likelihood that a result is true.
The scenario goes like this. You have a test, let’s say, of the efficacy of magnets in increasing circulation in rats. Because magnets are believed to have some health benefit according to some snake oil salesmen, you and 99 other researchers decide to put this to the test in your rat-based assay. With the usual significance threshold of p < 0.05, about 5 of the 100 of you can be expected to get a statistically significant result by chance alone, even if magnets do nothing. The other 95 or so will say, “oh well, nuts to this” and shove the data in the file drawer to be forgotten. The lucky 5 may then say, “wow, look at that” and go ahead and try to publish their results. This is what is known as the file-drawer effect. Positive results get published and negative results do not, so false positive results, especially ones with big effects, will often sneak into the literature. Luckily science has a self-correcting mechanism that requires replication, but since we don’t delete the initial studies, they will always be there for the cranks to access and wave about.
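For the curious, the file-drawer scenario is easy to simulate. What follows is only a rough sketch under assumptions of my own: each lab compares 20 treated rats to 20 controls, the measurements are normally distributed with known unit variance, the magnets truly do nothing, and only results with p < 0.05 get written up. The function names and group sizes are made up for illustration.

```python
import random
from statistics import NormalDist, mean


def run_null_study(rng, n_per_group=20):
    """Simulate one study of a treatment with no real effect.

    Returns the two-sided p-value of a z-test on the difference in means.
    Assumes (for illustration) normal data with known unit variance.
    """
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    treated = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    # Standard error of the difference between two means, each with variance 1/n.
    standard_error = (2.0 / n_per_group) ** 0.5
    z = (mean(treated) - mean(control)) / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def file_drawer(n_labs=100, alpha=0.05, seed=0):
    """Return (published, filed_away) counts when only significant results are published."""
    rng = random.Random(seed)
    p_values = [run_null_study(rng) for _ in range(n_labs)]
    published = sum(1 for p in p_values if p < alpha)
    return published, n_labs - published


if __name__ == "__main__":
    published, filed_away = file_drawer()
    print(f"published (all false positives): {published}")
    print(f"left in the file drawer: {filed_away}")
```

Every "published" study here is a false positive by construction, since the simulated effect is exactly zero. Run it with a larger number of labs and the published fraction settles near 5%.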
This makes two things very important. One is replication and the evaluation of the totality of the literature rather than a single report. Two is the critical importance of pre-study odds. You don’t even need to be an expert in Bayesian statistics to estimate them; it’s mostly common sense. Before an experiment is performed one should ask: is there a good rationale for the experiment? Is there a reasonable physiological or physical basis for your hypothesis, or are you just setting yourself up to report a false positive? These are questions good researchers ask all the time because they protect you from getting fooled by randomness.
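To put rough numbers on why pre-study odds matter, here is a small sketch of the standard calculation for the chance that a statistically significant finding is actually true (its positive predictive value). It uses only the prior probability, the study's power, and the significance threshold, and it ignores bias and multiple competing teams, which would only make things worse. The priors below (50% for a well-grounded hypothesis, 1% for an implausible one) and the 80% power figure are illustrative assumptions of mine, not numbers from Atwood or Ioannidis.

```python
def positive_predictive_value(prior, power=0.8, alpha=0.05):
    """Probability that a statistically significant result reflects a real effect.

    prior: pre-study probability that the hypothesis is true (0 to 1)
    power: probability of detecting a real effect (1 - beta)
    alpha: significance threshold, i.e. the false positive rate under the null
    """
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Hypothetical priors chosen purely for illustration.
    for label, prior in [("plausible hypothesis", 0.5), ("implausible hypothesis", 0.01)]:
        ppv = positive_predictive_value(prior)
        print(f"{label}: prior={prior:.2f}, chance a positive result is true={ppv:.2f}")
```

Under these assumptions a significant result for the plausible hypothesis is very probably real, while the same p-value for the implausible one is far more likely to be a false positive than a true finding.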
This feeds into why I liked Dr. Gorski’s piece on NCCAM so much and why it changed my mind on the National Center for Complementary and Alternative Medicine. I used to think that money spent on NCCAM, while not ideal, was at least subjecting CAM claims to scientific inquiry, and that it benefited from being run by a legitimate set of scientists, notably the late Stephen Straus. At worst it was just a boondoggle and we might get some interesting results out of it.
Now I’m convinced, in no small part by Ioannidis, Atwood, and Gorski, that NCCAM cannot help but be a fundamentally flawed endeavor that will ultimately contaminate the literature with nonsense and noise. The pre-study odds of CAM modalities are exceedingly poor. However, if you study snake oil long enough, even if it’s doing nothing, eventually you’ll end up with some positive results that you can publish. Since negative studies tend not to get published, what you see is a literature contaminated with false positives. NCCAM isn’t just a benign waste of money; it has the potential to contaminate the literature with nonsense that will never go away.
Uggh. It gives me shivers.
Finally, a negative comment about Science-Based Medicine: a pair of articles from Wallace Sampson, who usually has his head screwed on right, disappointed me. They are “The Iraqi Civilian War Dead Scandal” and its follow-up, which I believe fail to meet the standards of a blog that wishes to represent science and evidence-based writing. First of all, the word “scandal” is way over the top in evaluating the Lancet articles that used sampling to estimate the increase in the death rate in Iraq after the US invasion. There is no scandal. There may be controversy, but not scandal. Second, Sampson attacks the papers with conspiracy theories, some outrageous allegations of falsification, and one of my favorite crank attacks: armchair math. Always beware when someone takes on some complicated scientific theory or result with armchair math; it’s usually a sign you’re reading Uncommon Descent. Tim Lambert and others in the comments (including Sampson’s co-bloggers) show Sampson to be way out of line and making truly incorrect claims about these studies, probably because he relied on poor information sources (read: liars) writing about them. I’m not calling this crankery yet, but I would like to have seen his follow-up be a little contrite about the excesses of his first article. I’ll just say for now that these articles by Sampson fall below my standards for scientific writing or for a critique of the scientific literature. They can do better and I think Sampson can do better.