If you’ve spent any time at all reading science and medicine blogs, you know that many of us are quite critical of the way the traditional media covers science. The economics of the business allows for fewer and fewer dedicated science and medical journalists. In the blogosphere, writers have a certain freedom: the freedom not to be paid, which means that the financial fortunes of our medium (the web) are not directly tied to how many readers I bring in with a headline. But all this is just a lot of words introducing my critique of a recent New York Times article.
The article is titled “Using Science to Sort Claims of Alternative Medicine”. It’s well-written and interesting, but suffers from a fatal flaw (or perhaps just recapitulates it)—like most of us, it fails to take into account how likely (or unlikely) a bizarre medical claim is when evaluating evidence for it.
The author doesn’t realize it, but he points out the fatal flaw in the modus operandi of the National Center for Complementary and Alternative Medicine (NCCAM). Lately, the alternative medicine community has seen some of its bigger trials fall apart.
The alternative medicine community has a few different sects. The largest is the group of various snake-oil salesmen out to make a buck on others’ suffering. Then there is the “supplement industry”. Finally there is the saddest sect—that of real scientists trying to use evidence-based medicine to evaluate improbable claims. These folks mean well, but they’ve picked the wrong tool for the job.
Those who try to design bigger and better studies to test improbable medical claims forget to ask one simple question. Observe:
Now the federal government is working hard to raise the standards of evidence, seeking to distinguish between what is effective, useless and harmful or even dangerous.
“The research has been making steady progress,” said Dr. Josephine P. Briggs, director of the National Center for Complementary and Alternative Medicine, a division of the National Institutes of Health. “It’s reasonably new that rigorous methods are being used to study these health practices.”
That kind of fog [poor study design, false positives, false negatives] is what Dr. Briggs and the National Center for Complementary and Alternative Medicine, with a budget of $122 million this year, are trying to eliminate. Their trials tend to be longer and larger. And if a treatment shows promise, the center extends the trials to many centers, further lowering the odds of false positives and investigator bias.
They are forgetting one critical question: is the treatment being studied even plausible? This question is crucial, as without it, the data are meaningless. Bayes’ Theorem is a statistical way to include plausibility when looking at a clinical trial (and I’ll refer you to the linked source for a more complete explanation). Basically, the lower the prior probability of a result, the more likely a positive result is to be false.
I can’t resist dragging you through a medical example of the effect of prior probability (and without all that pesky math!) so stick with me. One of the most difficult and important diagnostic problems in medicine is pulmonary embolism (PE), in which a blood clot forms in the legs, breaks off, and lodges in the lungs. It affects around 600,000 Americans every year, killing a hefty percentage of them. There have been a number of recent advances in diagnosing PE, but fifteen years ago, before the widespread use of advanced CT technology, it wasn’t at all clear how to diagnose this condition in a relatively non-invasive way.
The most accurate method (the “gold standard”) for diagnosing PE is pulmonary angiography, where a catheter is inserted into the pulmonary arteries and dye is introduced while taking X-rays. This is invasive, costly, and time-consuming. It would be great if we could use a combination of patient history, physical exam, and laboratory results to make the diagnosis, but studies have shown that this doesn’t work very well. And in comes a middle ground, the ventilation-perfusion (V/Q) scan. This easy-to-administer nuclear study gives a picture of blood supply and gas exchange in the lungs. A group of investigators designed a large study (PIOPED I) to see how to best use this technology. What they found was that when we combine the results of a V/Q scan with our initial suspicion (prior probability) of PE, we can estimate the likelihood that the patient has a PE.
Let’s take an example (remembering that this methodology is historical). Let’s say two patients come to the emergency room. I decide, based on clinical criteria, that the first has a high likelihood of having a PE. I get a V/Q scan which is read as “high probability”. The data show that a patient with a high pre-test probability and a high probability scan has about a 96% chance of really having a PE. Another patient comes in, and using the clinical criteria, he has a low pre-test probability of having a PE. His scan comes back as high probability, the same as the first patient. According to the data, the likelihood of Patient 2 having a PE is 56%.
Both patients had a test that showed a high probability of PE, but simply changing how likely we think the patient is to have a PE changes what that high-prob test actually means. That’s a big deal. If I were to rely on only the test results, or only my clinical impression, I would not have any real idea of the likelihood of the patient having a PE.
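For readers who do want a little of the pesky math, here is a minimal Python sketch of the same idea using the odds form of Bayes’ Theorem. The pre-test probabilities (70% and 10%) and the likelihood ratio of 10 for a “high probability” scan are assumptions I picked for illustration; they are not figures from PIOPED, and the function name is my own.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Update a pre-test probability with a test result via the odds form of Bayes' Theorem."""
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# ASSUMPTION: illustrative likelihood ratio for a "high probability" scan,
# not a value taken from the PIOPED data.
ASSUMED_LR_HIGH_PROB_SCAN = 10.0

# ASSUMPTION: illustrative pre-test probabilities for the two patients.
ASSUMED_PRE_TEST = {"high suspicion": 0.70, "low suspicion": 0.10}

# Same scan result for both patients; only the starting suspicion differs.
posteriors = {
    label: post_test_probability(p, ASSUMED_LR_HIGH_PROB_SCAN)
    for label, p in ASSUMED_PRE_TEST.items()
}

for label, p in posteriors.items():
    print(f"{label}: {p:.0%} chance of PE after a high-probability scan")
```

With these made-up inputs, the identical scan result leaves the high-suspicion patient at roughly 96% and the low-suspicion patient at roughly 53%. The exact numbers depend entirely on the assumed inputs; the point is only that the same test means different things for different priors.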
Now let’s apply this to a CAM study mentioned in the Times article. I’ve reviewed the study before, but let’s add in a dose of Bayes.
Another large study enrolled 570 participants to see if acupuncture provided pain relief and improved function for people with osteoarthritis of the knee. In 2004, it reported positive results. Dr. Brian M. Berman, the study’s director and a professor of medicine at the University of Maryland, said the inquiry “establishes that acupuncture is an effective complement to conventional arthritis treatment.”
Well, in fact, this acupuncture trial does no such thing. It is technically well-designed, in that it is a randomized controlled trial and proper statistical techniques were used. But this is where evidence-based medicine can be wielded improperly. If you think simply crunching numbers is all there is to proving a clinical point, UR DOING IT RONG! (I don’t even need to go into the other real problem with the study, which compared “real” with “sham” acupuncture, but did not include a group that got standard therapy.)
What Bayes’ Theorem teaches us is that the more implausible the hypothesis, the less likely that any numerical data can confirm it. In other words, if you’re testing an idea with little scientific merit to start, it doesn’t really matter how well you design a clinical study, how many patients you enroll, how well you blind it—any positive results are more likely to be due to chance or confounding variables than to any real effect of the treatment. This is the root problem with NCCAM and with the Times article—no matter how many times you check, pigs can’t fly under their own power. Spending money to try to refute this will only create a hole in your wallet (and a lot of dead pigs).
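To put rough numbers on that, here is a small Python sketch under one common simplification: treat a trial like a diagnostic test whose sensitivity is its statistical power and whose false-positive rate is the significance threshold (0.05). The priors and power values below are hypothetical, chosen only to show the shape of the effect, and the model ignores bias and confounding entirely (which would only make things worse).

```python
def chance_positive_is_real(prior: float, power: float, alpha: float = 0.05) -> float:
    """Probability that a 'positive' trial reflects a real effect.

    prior: assumed probability the treatment works before the trial
    power: probability the trial detects a real effect
    alpha: probability of a positive result when there is no effect
    """
    true_positives = prior * power
    false_positives = (1.0 - prior) * alpha
    return true_positives / (true_positives + false_positives)


# ASSUMPTION: hypothetical priors, from a plausible drug to a wildly implausible claim.
ASSUMED_PRIORS = [0.5, 0.1, 0.01]
# ASSUMPTION: a typical trial (80% power) versus a very large one (99% power).
ASSUMED_POWERS = [0.80, 0.99]

results = {
    (prior, power): chance_positive_is_real(prior, power)
    for prior in ASSUMED_PRIORS
    for power in ASSUMED_POWERS
}

for (prior, power), value in results.items():
    print(f"prior={prior:.2f} power={power:.2f} -> positive result is real: {value:.0%}")
```

In this toy model, a positive trial of a coin-flip hypothesis is real about 94% of the time, but for a 1-in-100 hypothesis that drops to about 14%. Pushing power from 80% to 99% (a much bigger trial) only raises it to about 17%, because a larger trial does nothing to the false-positive rate set by the significance threshold.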