Monday, 22 July 2013

Curing cancer in everything but humans

The next time you read an article claiming that a cure for cancer is just around the corner, you will be forgiven if you don't rush out to tell your friends and family the good news. After all, such announcements seem to be made fairly regularly; meanwhile, oncology wards aren't exactly closing down for lack of patients.

You could repeat this example indefinitely: we still don't have cures for Alzheimer's or Parkinson's; we don't know what causes autism (or even whether there is a single cause); we still can't halt aging. All this despite the regular headlines telling us that such things are just around the corner. What gives?

A big part of the problem here comes from the fact that, when it comes to doing medicine in humans, there are two types of studies: the type we would like to perform, and the type we actually get to do. The type we would like to do goes something like this: take two groups of people with a disease. Treat one group with the drug you want to test, and don't treat the other group. Keep everything else the same, to the point of giving the control group fake treatments so that their experience isn't different. At the end, tally up how everyone did, and see if the drug worked. Or, in a slightly different formulation: take two groups of people. Expose one group to the agent that you think causes a certain disease. Don't expose the other group, but keep everything else the same, to the point of exposing them to a fake agent. At the end, tally up how everyone did, and see if the agent caused the disease.

The problem is, of course, that it is usually completely unethical to do this type of study with actual people. Obviously you can't go around deliberately exposing people to things that you think might cause terrible diseases in the name of science, and you also can't deny people older, at least partially effective treatments just because you need a control group to test your newer, better treatment. So scientists fall back on two ways of getting around this. One, you could do the test on animals. Two, you could look back at records of who had radium watches, or smoked cigarettes, or consumed excess vitamin C, and then correlate that with the rates of leukaemia, or lung cancer, or long life.

There are, of course, practical downsides to each. Animal testing, which can in principle be done with our rigorous ideal study design, has the problem that you don't know whether humans will react the same way as the animals, and the only way to find out is to do another study, which brings us back to the original problem. Correlating patient histories, which involves actual humans, has the disadvantage that you don't have controls; you don't know, for example, that the people who consumed excess vitamin C weren't simply the type of people who followed all kinds of health fads, in which case they may also have been, compared to the general public, less likely to live near power lines, more likely to rub their skin with olive oil, more likely to eat organic foods, and so on.
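To make the confounding worry concrete, here's a minimal simulation with made-up numbers: a hidden "health-conscious" trait drives both vitamin C intake and longevity, while vitamin C itself does nothing. A naive look-back comparison still shows vitamin C takers living longer:

```python
import random

random.seed(0)

def simulate(n=100_000):
    """Tally longevity rates for vitamin C takers vs non-takers."""
    vitc = [0, 0]      # [count, number long-lived] among vitamin C takers
    no_vitc = [0, 0]   # same, among non-takers
    for _ in range(n):
        health_fad = random.random() < 0.3                     # hidden confounder
        takes_vitc = random.random() < (0.8 if health_fad else 0.1)
        # Longevity depends ONLY on the hidden trait, not on vitamin C:
        long_lived = random.random() < (0.6 if health_fad else 0.4)
        bucket = vitc if takes_vitc else no_vitc
        bucket[0] += 1
        bucket[1] += long_lived
    return vitc[1] / vitc[0], no_vitc[1] / no_vitc[0]

rate_vitc, rate_none = simulate()
print(f"long-lived | vitamin C:    {rate_vitc:.2f}")
print(f"long-lived | no vitamin C: {rate_none:.2f}")
```

The apparent benefit is entirely an artifact of the hidden trait; a randomized trial, which breaks the link between the trait and who gets the treatment, would show no effect.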

John Ioannidis has looked at the statistical issues surrounding both animal studies and "look-back" studies. His research is, to say the least, a little disturbing, considering how often these types of studies are reported in the media. In a now-classic paper, "Why Most Published Research Findings Are False," Ioannidis points out that most look-back studies ignore the roads not taken in their analysis. That is, if you do a study on, say, the connection between aspartame and Alzheimer's (which made headlines in the '90s), you need to account for all of the other things you didn't test that were, in principle, just as likely to be connected. This matters because study conclusions are typically reported with what's known as the significance: the probability that the observed effect could have arisen by chance. Significance is reported basically because it's what we can calculate easily. But if there were, say, 50 different things as likely as aspartame to be connected to Alzheimer's, then a significance of 0.05 (a typical threshold) becomes inconclusive: each test has a 1 in 20 chance of showing an effect by luck alone, so across 50 possible relationships you'd expect 2 or 3 positive results from pure randomness. Unfortunately, it's usually quite difficult to estimate how many different things are equally likely to be connected with Alzheimer's (this relates to the prior, or prior probability, and it's an endemic problem across science), so most studies don't report it. Which means they may be drastically overestimating the strength of their conclusions.
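The arithmetic behind that "2 or 3 by chance" claim is easy to check. A short sketch, assuming 50 independent tests of relationships that are all actually null, each evaluated at the usual 0.05 threshold:

```python
import random

random.seed(1)

ALPHA = 0.05
N_HYPOTHESES = 50   # 50 candidate risk factors, none actually related

# Expected number of false positives when every null hypothesis is true:
expected_false_positives = N_HYPOTHESES * ALPHA            # 50 * 0.05 = 2.5

# Chance of at least one "significant" finding by luck alone:
p_at_least_one = 1 - (1 - ALPHA) ** N_HYPOTHESES           # about 0.92

# Monte Carlo check: each test is "significant" with probability ALPHA
trials = 20_000
hits = sum(
    any(random.random() < ALPHA for _ in range(N_HYPOTHESES))
    for _ in range(trials)
)
print(f"expected false positives per {N_HYPOTHESES} tests: {expected_false_positives}")
print(f"P(at least one false positive): {p_at_least_one:.3f}")
print(f"simulated: {hits / trials:.3f}")
```

So with 50 untested-but-plausible alternatives, a single result at p = 0.05 tells you very little: you were almost guaranteed to find *something* at that level.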

Ioannidis has also looked at animal studies. In a paper published last week in PLoS Biology, he asks the following question: if you perform some large number of studies, how many should you expect to come back with a positive result (i.e., the medicine worked)? He works out a statistical argument for this expected number, then compares it to available databases in which people have reported both positive and negative results from animal tests. What he finds is that the observed number of positive results is far higher than expected. Researchers are, for whatever reason, more likely to report positive results than negative ones. This is a problem for the reason given above: to estimate the prior for a given relationship, we need to know how many similar studies turned up negative results. If the negative results go unreported, studies again end up drastically overestimating the strength of their conclusions.
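To see how such an expected number might be constructed, here is a back-of-the-envelope sketch. All the numbers (the fraction of treatments that truly work, the studies' power, the observed positive count) are made up for illustration; they are not Ioannidis's figures:

```python
# Hypothetical inputs: suppose 10% of tested treatments truly work,
# studies detect a real effect 80% of the time, and alpha = 0.05.
PRIOR = 0.10   # fraction of hypotheses that are actually true
POWER = 0.80   # chance a real effect comes up significant
ALPHA = 0.05   # chance a null effect comes up significant anyway

n_studies = 1000
true_positives = PRIOR * POWER * n_studies             # 80 real hits
false_positives = (1 - PRIOR) * ALPHA * n_studies      # 45 lucky hits
expected_positive = true_positives + false_positives   # 125 expected

print(f"expected positive results out of {n_studies}: {expected_positive:.0f}")

# If a database instead reports, say, 400 positives out of 1000, the gap
# between observed and expected is evidence of selective reporting.
observed_positive = 400   # made-up illustrative figure
print(f"excess over expectation: {observed_positive - expected_positive:.0f}")

# Of the positives we *expect*, what fraction reflect real effects?
ppv = true_positives / expected_positive
print(f"fraction of expected positives that are true effects: {ppv:.2f}")
```

Even in this generous scenario, barely two-thirds of the expected positives are real; once unreported negatives inflate the positive rate further, the published record becomes much less trustworthy than it looks.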

This leaves us in rather a bad situation. Not only do look-back and animal studies have built-in limitations, which tend to get glossed over in media reports looking to make an impact, but the studies themselves overestimate how conclusive their data are. The end result is the first paragraph of this post: a relentless stream of articles promising potential breakthroughs that never quite pan out.

What's the solution? There may not be a simple one. A better appreciation of study design and its limitations, by both researchers and science communicators, would help. Most importantly, we need to design more integrity into biomedical studies. There are registries where research teams can register studies before starting them, removing an important source of reporting bias. Such efforts are voluntary at the moment; national governments and healthcare bodies could make them compulsory. And scicomm blogs could work to make sure that each time someone reads a headline that says "Cancer cured in ...", they skeptically ask: but what type of study was it?
