Algorithm Finds Thousands of Unknown Drug Interaction Side Effects
ananyo writes "An algorithm designed by U.S. scientists to trawl through a plethora of drug interactions has yielded thousands of previously unknown side effects caused by taking drugs in combination (abstract). The work provides a way to sort through the hundreds of thousands of 'adverse events' reported to the U.S. Food and Drug Administration each year. The researchers developed an algorithm that would match data from each drug-exposed patient to a nonexposed control patient with the same condition. The approach automatically corrected for several known sources of bias, including those linked to gender, age and disease. The team then used this method to compile a database of 1,332 drugs and possible side effects that were not listed on the labels for those drugs. The algorithm came up with an average of 329 previously unknown adverse events for each drug — far surpassing the average of 69 side effects listed on most drug labels."
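The matched-control idea described in the summary can be sketched in a few lines. This is a minimal illustration, not the authors' actual algorithm: the field names (`gender`, `age_band`, `disease`) and the exact-match strategy are assumptions for the sake of the example; the published method is more sophisticated.

```python
from collections import defaultdict

def match_controls(exposed, unexposed, keys=("gender", "age_band", "disease")):
    """Pair each drug-exposed patient with one unexposed patient sharing
    all matching keys, so comparisons are corrected for those covariates."""
    # Bucket the unexposed pool by covariate profile for O(1) lookup.
    pool = defaultdict(list)
    for patient in unexposed:
        pool[tuple(patient[k] for k in keys)].append(patient)

    pairs = []
    for patient in exposed:
        bucket = pool[tuple(patient[k] for k in keys)]
        if bucket:  # exposed patients with no matching control are skipped
            pairs.append((patient, bucket.pop()))
    return pairs

exposed = [{"id": 1, "gender": "F", "age_band": "60-69", "disease": "T2D"}]
unexposed = [{"id": 9, "gender": "F", "age_band": "60-69", "disease": "T2D"},
             {"id": 7, "gender": "M", "age_band": "30-39", "disease": "T2D"}]
print(match_controls(exposed, unexposed))  # patient 1 pairs with control 9
```

Once exposed and control groups are balanced on these covariates, differences in reported adverse-event rates are less likely to be artifacts of gender, age, or disease.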
Re:I was wondered about something (Score:5, Informative)
As usual, Science and Nature only provide high-level info, so you'll have to dig deeper than the article ( http://stm.sciencemag.org/content/4/125/125ra31.full [sciencemag.org] ).
On the author's website, http://www.tatonetti.com/cv.html [tatonetti.com], there is a paper that describes the machine-learning algorithms used:
Tatonetti, N.P., Fernald, G.H. & Altman, R.B. A novel signal detection algorithm for identifying hidden drug-drug interactions in adverse event reports. J Am Med Inform Assoc (2011) DOI:10.1136/amiajnl-2011-000214 [tatonetti.com]
Re:Multiple testing problem? (Score:5, Informative)
"Why not just report likelihoods instead and let the reader multiply it with any prior they want? In many cases, the prior won't make much of a difference anyway, I suppose."
Reporting likelihoods (or rather summary statistics of likelihoods, because likelihoods are generally functions, not single point values) is precisely what frequentist statistics is. Significance tests, p-values and so on can simply be considered a standardised representation of the likelihood.
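To make the "likelihood is a function, not a number" point concrete, here's a small sketch (my own toy example, not from the paper): the binomial likelihood for 7 successes in 20 trials, evaluated on a grid. A reader can multiply this reported likelihood by any prior they like; with a flat prior, the posterior is just the normalised likelihood.

```python
import math

def binom_likelihood(theta, k=7, n=20):
    """Likelihood of rate theta given k successes in n trials."""
    return math.comb(n, k) * theta**k * (1 - theta)**(n - k)

grid = [i / 100 for i in range(1, 100)]   # theta = 0.01 .. 0.99
lik = [binom_likelihood(t) for t in grid]

# Flat-prior posterior: normalise the likelihood over the grid.
post = [l / sum(lik) for l in lik]

mle = grid[lik.index(max(lik))]
print(f"MLE of theta: {mle:.2f}")  # the likelihood peaks at k/n = 0.35
```

A p-value, a maximum-likelihood estimate, or a confidence interval are all summaries of this same curve; none of them requires committing to a prior.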
As for multiplying by any prior they want: in many (even most) cases, the reader simply does not know what their prior is. And often they do not even know, without a lot of work and some experience in statistical theory, what a 'sensible prior' is. For example, setting a flat prior on x (all values equally likely a priori) can actually be *very* informative if later calculations make use of 1/x.
In most interesting cases, I'm afraid that the choice of prior *does* make a great difference. This is even more complicated if the prior is in the form of hyperparameters, in which case your prior choice can have a dramatic and non-linear effect on your result.
"Sure, do away with empirical Bayes. Anyway, I don't think "When using Bayesian methods, you run the risk of using non-Bayesian methods." is an argument for not using Bayesian methods."
But empirical Bayes represents a huge chunk of how 'Bayesian stats' is done in practice. The point here is that its advocates don't clearly define, or understand, what is or is not Bayesian in the sense being proposed.
"Regardless of what practical obstacles there might be for using Bayesian inference, using something else would be wrong, leading to results that make you take the wrong actions!"
It depends on how the results are represented. Frequentist results are not 'wrong'. They are real quantities computed from the data; they may simply be irrelevant. And on the flip side, merely switching to Bayesian inference *does not prevent you from taking wrong actions*. What Bayesian inference accomplishes is to shift the mistake you are making and make it explicit.
For example, the false positive results I described can be *exactly duplicated* by using a prior that is excessively permissive of finding positives, and that prior can look extremely reasonable. The real solution to these problems is to *understand what you are doing*, and that can be done in both a Bayesian and a frequentist way.
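The kind of false positives at issue in the multiple-testing thread above can be simulated directly. This toy example (my own, not from the paper) runs 1,000 comparisons where the null is true in every one; a 0.05 threshold still flags roughly 5% of them, and an over-permissive prior would reproduce the same list of spurious "findings":

```python
import random

random.seed(1)

n_tests, n_per_group = 1000, 50
false_positives = 0
for _ in range(n_tests):
    # Both groups drawn from the same distribution: no real effect exists.
    a = [random.gauss(0, 1) for _ in range(n_per_group)]
    b = [random.gauss(0, 1) for _ in range(n_per_group)]
    # Crude z-test on the difference in means (known unit variance).
    diff = sum(a) / n_per_group - sum(b) / n_per_group
    se = (2 / n_per_group) ** 0.5
    if abs(diff / se) > 1.96:
        false_positives += 1

print(false_positives)  # expect roughly 5% of 1000, i.e. around 50
```

With hundreds of drugs each screened against hundreds of candidate side effects, the number of comparisons is enormous, which is exactly why corrections for multiple testing (or honestly chosen priors) matter.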