## Science and the Shortcomings of Statistics

Kilrah_il writes

*"The linked article provides a short summary of the problems scientists have with statistics. As an intern, I see it many times: Doctors do lots of research but don't have a clue when it comes to statistics — and in the social science area, it's even worse. From the article: 'Even when performed correctly, statistical tests are widely misunderstood and frequently misinterpreted. As a result, countless conclusions in the scientific literature are erroneous, and tests of medical dangers or treatments are often contradictory and confusing.'"*
## Example: Standard Deviation (Score:5, Interesting)

## Personal experience (Score:5, Interesting)

As a doctor myself, I feel I should add my $0.02...

Throughout med school we had the odd scattered lecture on statistics, and later when reading papers I used to skim over most of the maths just to look for the P value at the end (one representation of how statistically significant a result is).

However, I then took a formal stats course and was amazed at how little I understood - Monte Carlo techniques, Markov models, and even something as trivial yet important as the difference between a parametric versus a non-parametric test.

And then it struck me - most of the research I had read had applied parametric statistical tests to their data - that is, the researchers made an assumption that the underlying distribution of results would fall on a normal curve. Yet this simple assumption may be all it takes to skew the conclusions when they should have chosen a non-parametric test instead.
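To make the parametric vs non-parametric point concrete, here is a minimal sketch of one non-parametric approach, an exact permutation test on the difference in means. It is only an illustration: the function name `permutation_p_value` and the measurements are made up for this example and do not come from any paper discussed here.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference in means.

    No normality assumption is made: the test only asks how often a
    relabelling of the pooled values produces a difference at least as
    large as the one actually observed.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))

    extreme = 0
    count = 0
    # Enumerate every way of choosing which pooled values form group A.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        count += 1
    return extreme / count


# Hypothetical, made-up measurements with one large outlier per group,
# i.e. data that clearly do not sit on a normal curve.
control = [1.1, 1.3, 0.9, 1.2, 9.5]
treated = [1.0, 1.4, 1.1, 1.2, 8.9]
p = permutation_p_value(control, treated)
```

Full enumeration is only practical for small samples; for larger ones people usually sample random relabellings instead.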

So yes, stats are vitally important, badly taught, and focus too much on the maths rather than the concepts. Remember that we're doctors, not mathematicians - the last set of sums I did were in high school. If I need to analyse data, I'll probably plug it into SPSS - although now with my eyes open.

-Nano.

## Rats and Stats (Score:1, Interesting)

When I did my BA in psychology, statistics was the core of the degree. It was the one subject that you could not escape and had to take for the full year, every year of the degree. I heard later that the Psychology department at that Uni was sometimes disparagingly described as teaching Rats and Stats psychology.

## bad title (Score:5, Interesting)

It is not a shortcoming of the Copenhagen interpretation of quantum mechanics or the Chicago school of economics if I don't understand or know how to correctly interpret their results. It is my shortcoming and fault for not knowing enough to connect the dots.

I do statistical research, and some of that is through interacting with researchers in the biosciences. When I go to talk to a researcher and ask whether they could use some statistical, mathematical or computational assistance with their research, it has almost always been a fruitful starting point for long conversations and for getting into the research. Sometimes it was simply a matter of looking at their F-test results or ANOVA scores and telling them what they meant (like with a regression model relating proportions of certain characteristics between taxa). More useful interactions for me often mean working on new algorithms or estimators, or fitting a model from their empirical data because there isn't a reliable standard model to work from (like intergenic distance between genes in an operon). That kind of challenge makes the less engaging work worth the hassle. Maybe I'm odd because I've worked hard to have a good background in both statistics and biology, but I shouldn't be.

Here is an observation from my own experience that perhaps supports some of the intent of the article. I was speaking with a biology graduate student and it came up that they had a biostatistics course in the department. Of course, as a statistician my mind goes towards survival functions, failure rates, life tables, censored data, bioassay, epidemiology, microarrays, clinical trials, topics along those lines. It turned out their course focused on z tests, t tests, F tests, confidence intervals, point predictions, least squares regression, multiple regression, ANOVA, and things along these lines, just with simulated problems in a lab setting. That is not necessarily a bad thing, but much of the core math was underplayed or missing, like model assumptions, alternate formulations, or things like dummy variables. The worst part was that even though they were doing well in the class, they had no confidence in actually using the statistics and didn't understand how to interpret something like a confidence interval. They knew how to calculate one, but it wasn't clear to them what it actually meant.
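On what a confidence interval actually means: one way to see it is to simulate the whole procedure many times and count how often the interval captures the true value. The sketch below is an illustration only. It assumes normally distributed data with a known standard deviation, and the function name `ci_coverage` and all parameter values are made up for the example.

```python
import random


def ci_coverage(true_mean=10.0, sigma=2.0, n=25, trials=2000, seed=1):
    """Fraction of simulated 95% intervals that contain the true mean.

    Assumes normal data with known sigma, so the interval is
    sample mean +/- 1.96 * sigma / sqrt(n).
    """
    rng = random.Random(seed)
    half_width = 1.96 * sigma / n ** 0.5
    covered = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        # Does this particular interval contain the (normally unknown) truth?
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            covered += 1
    return covered / trials


coverage = ci_coverage()
```

The "95%" describes the long-run success rate of the procedure over repeated experiments. It is not a probability statement about any single computed interval, which either contains the true mean or does not.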

The corollary to the notion in the summary, which I'd rant and claim, is that scientists overall have weaker skills in mathematics, statistics, and computation than those who studied those disciplines principally, and that's hurting science. However, many in those three disciplines really know little beyond basic results in any of the sciences, which hurts the applicability of these mathematical fields to the sciences and likely hurts our ability to develop certain types of discipline-specific results that can be generalized from work on applied problems.

In either case, whether you're a typical scientist or a typical math/stat/comp person, becoming proficient enough in the other areas requires going an awfully long way out of your way compared to any counterpart who simply does not care and goes straight through as many have before. In some areas of research on either side it is no problem to do as has been done and not pursue knowledge in those other areas. Increasingly, though, the results with the highest impact are coming from truly interdisciplinary research. To encourage that for those who are interested in such fields (aside from making clearer which areas in each field border on such interdisciplinary work), we need more incentive to study more than one field and/or better ways of enabling fruitful cooperation between the camps.

## Re:Long winded troll (Score:1, Interesting)

Evidence. Big difference.

## Re:Example: Standard Deviation (Score:2, Interesting)

I certainly don't remember how to do all those statistics calculations by hand, but I use SAS and Excel almost every day and they don't seem to have forgotten... Give me a few more years and I might be at the point where I wouldn't be confident trying to explain what a standard deviation actually "is".

## only in medicine (Score:5, Interesting)

## Re:The problem is statisticians (Score:5, Interesting)

Actually, one of the most dangerous uses of statistics is making predictions with them inappropriately. Curve fitting is especially prone to this error: attempting to make any predictions outside of the central mass of the points used to *produce* the curve is completely bogus, and yet people do it all the time.
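A small sketch of the extrapolation problem, under made-up assumptions: the "real" relationship is taken to be a square root (a hypothetical choice for illustration), a straight line is fitted by ordinary least squares to points with x between 1 and 10, and the fit is then asked about x = 100.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def true_curve(x):
    # Hypothetical underlying relationship, chosen only for illustration.
    return x ** 0.5


# Fit using only points with x in [1, 10].
xs = [float(x) for x in range(1, 11)]
ys = [true_curve(x) for x in xs]
slope, intercept = fit_line(xs, ys)


def predict(x):
    return slope * x + intercept


# Error in the middle of the fitted data vs far outside it.
error_inside = abs(predict(5.5) - true_curve(5.5))
error_outside = abs(predict(100.0) - true_curve(100.0))
```

Inside the fitted range the line is a decent stand-in for the curve, so `error_inside` stays small. At x = 100 nothing in the data constrains the line, and `error_outside` ends up larger than the true value itself.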

## Re:Example: Standard Deviation (Score:5, Interesting)

Doctors are notoriously bad with statistics. But the real kings of bad statistics are psychiatrists.

Notice how a LOT of studies in psychiatry are essentially statistics, statistics and a bit of statistics? It might be the reason why a lot of the courses you have to pass to become a shrink also consist of a lot of statistics, statistics... you get the idea.

NOBODY who decides that his course of studies would be psychiatry decided that because he enjoys statistics that much, though. Actually, most psych students struggle badly with statistics. Psychiatry is one of the fields where the label doesn't match the contents. It looks like you're going to do a lot of messing with people's minds (aka "solving their psychology problems") but actually, judging from the courses, you become a refined statistician who had a bit of counseling tutoring on the side.

That's not what people become shrinks for, though. They want to sit in their office, put people on their couch (or, more modern, in a comfy chair) and get 100 bucks an hour for listening to some idiot whine. And most do just that and will do fine.

It gets bizarre when they somehow end up in a spot where they have to rely on their statistics. Hey, you got a masters in that, and that entails a buttload of statistics, so you can do it... Nobody really cares that 9 out of 10 somehow managed to get their diploma either by learning only what they absolutely needed (and forgetting it right after the test, certain that they'd never need it again, because ... ya know, listening to idiots and stuff, not sitting there plotting standard deviations...) or by cribbing altogether.

And then you get studies of the usefulness of psychotropic drugs and wonder whose black hole they pulled that out of...

## Re:Example: Standard Deviation (Score:4, Interesting)

Back when I was in graduate school, my colleagues and I in graduate science taught pre-med chemistry and physics, which was a really watered down version of the chemistry and physics that was taught to engineers and science majors. To be honest I thought it was kind of scary. All these years I was taught that medical students were supposed to be the best and the brightest, but we spoon fed them "baby chemistry" and "baby physics".

Since that time I have had many discussions with professors about this and they and I have come to the same conclusion, "the best and the brightest do not go into medical school". Thirty or forty years ago this may have been true, but economics has taken a turn and it just isn't the case anymore.

And why would they? They can make more money on Wall Street, they don't have to hassle with the bureaucracy of health insurance, they don't have to hassle with lawyers, so why would the best and brightest go into medicine?

And you want to know what kind of income a hot little girl with a business degree can get? Pharmaceutical sales can pay 6 figures for one good figure. So the next time you see that good looking girl pulling that bag through your doctor's office, realize she is probably making a lot of money. More money than the average general practitioner.

## Re:Personal experience (Score:4, Interesting)

Even ten to fifteen years ago, students in Statistics courses had very little computer exposure, and that of course means any practical analyses would imply the use of approximations - hence the widespread use of chi squared tests and normal distributions for everything, whether appropriate or not.

If the statistician -> textbook -> student/scientist -> textbook -> scientist process is factored in, I have no doubt that it will take another generation or two before the old style of statistics is replaced sufficiently widely to be only a memory.

## Re:Example: Standard Deviation (Score:3, Interesting)

*"And then you get studies of the usefulness of psychotropic drugs and wonder whose black hole they pulled that out of..."*

Indeed. Normally I would never cite an article in a McNews magazine like Time or Newsweek, but I found this explanation of the state of antidepressant drug efficacy to be one of the best I've run across so far - hundreds of billions of dollars all depending on some really, really bad math. It's like the collateralized debt securities of the drug & psychiatric industries:

http://www.newsweek.com/id/232781 [newsweek.com]

## Re:Personal experience (Score:3, Interesting)

I think the term "Statistics" has become so general that people don't understand how complicated it can be. People think of Statistics as Bob saying to Alice - "Get me the stats on this week's sales." Alice just goes digging around and gets Bob the total sales in $, # of units sold .... etc. People don't understand or know of the concepts that are involved in polling, they just think they called 2,000 random people and that's it. That's statistics to the public and many college graduates.

I had to take a few stats courses for my BA. Learning stats is humbling - I know I really know nothing about it now after taking a few courses. The classes I took assumed you had no inkling of the basics of calculus or algebra (much past grade 9 level). I didn't know any calculus - I took some 1000 level Algebra but, after graduating a few years ago, I'm teaching myself Calc now and I'm realizing how little I really understood about Stats at the time.

When people really don't understand the underlying mathematical principles they shouldn't use SPSS or Excel. Heck, if you ask people what 2+2 is, they know the answer. But you tell them to apply X or Y to such and such data set with SPSS, they probably won't investigate the results. Print it out with the report. Done! If you use a Stats program you should understand what your data means, what is happening to your data, what it means when X is applied to your data and what the end result means. I don't think a lot of people are humble enough to say they don't really understand. Little white lies!

## Looking for a good book on statistics (Score:4, Interesting)

I'm interested in learning the essentials of statistics. What would be a good book to start me out?

I got The Manga Guide to Statistics [nostarch.com] and it did introduce me to the very basics. However, there are many places where it just gives you an equation, without deriving it or even explaining it. After reading this book, I now know how to calculate standard deviation, but I'm still a bit vague on how people actually use it. I would like to see some examples of how people use statistics in (for example) science experiments.

My ideal book would explain the basics, with examples, and show how the math works. Ideally it wouldn't be a thousand pages long, either, but that's a secondary consideration.

Recommendations, please?

P.S. Those of you who know about statistics: how good are the Wikipedia pages on statistics?

steveha

## Re:What it actually said (Score:3, Interesting)

*"Contrary to the parent poster's claim, the article does not focus on correlation vs causation. It focuses on people getting the correlation wrong in the first place."*

Fair point, I only skimmed TFA, but I still stand by my assertion that it's a troll of the "scientists don't understand statistics" genre; it even starts by claiming statistics is a "mutant form of math". Had they omitted that drivel and not referenced discredited papers, then maybe I would have read the whole thing.

## Re:Example: Standard Deviation (Score:5, Interesting)

*"There are some things you should never be able to forget.... Do people forget basic definitions so easily?"*

Given a couple of years with little contact with people who speak your native language, you'll actually begin to forget that very language you have lived speaking all your life. So it doesn't surprise me at all that people would forget basic definitions if they don't actually think about those definitions very often.

I figure if you can forget your native language then pretty much all bets are off for the stuff you've known for a lot less time and used a much smaller percentage of your thinking life.

## The use and abuse of statistics. (Score:4, Interesting)

I'm actually at a scientific meeting and saw 7 presentations in which they "double dipped" on their statistics before we broke for lunch.

Double-dipping is bad enough, but the medical field is rife with multiple-dipping. Each dataset is plumbed to test dozens of hypotheses, without appropriately adjusting the acceptance criteria. Even with separate datasets, if you test 20 hypotheses and discover that each one is just valid at the 95% confidence level, then there is a very good chance that there are some false positives. In the medical alleged-sciences, however, all 20 would be blindly proclaimed as truth.
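The arithmetic behind that is short. As a sketch, assuming the 20 tests are independent and every null hypothesis is actually true (the function names below are made up for this example), the chance of at least one false positive is:

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive among n independent tests,
    assuming every null hypothesis is in fact true."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(n_tests, alpha=0.05):
    """Per-test threshold under the Bonferroni correction, one common
    (and conservative) way of adjusting the acceptance criteria."""
    return alpha / n_tests


fwer = familywise_error_rate(20)         # about 0.64
per_test = bonferroni_alpha(20)          # 0.0025
fwer_corrected = familywise_error_rate(20, per_test)  # stays below 0.05
```

So with 20 unadjusted tests at the 95% level, seeing at least one spurious "significant" result is more likely than not.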

And then there are the social nonsenses^W sciences... If practitioners of some discipline do not understand how to use quantitative methods, they should limit themselves to qualitative argument only. Unfortunately, in statistics as in other fields, those who are ignorant or incompetent are generally unaware of the extent of their ignorance and incompetence.

## Re:No surprise here (Score:3, Interesting)

## Re:Lies, Damned Lies, and Statistics. (Score:3, Interesting)

I saw a fascinating presentation by an eminent professor of physics on what Meadow did wrong. It boiled down to misapplying Bayes' theorem. Meadow had got an extremely high probability of the accused being guilty out of it. What the professor did was point out that (a) the probability put in for the chance of two babies dying couldn't be obtained by simply multiplying the chance of one dying by itself, as the events may not be independent, and (b) that would be rather moot anyway, because the accused's chance of having 2 dead babies was 1.

Putting the correct numbers in and turning the handle produces a chance of guilt of about 1%.
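For anyone who wants to see the shape of that calculation, here is a rough sketch. The rates are hypothetical placeholders chosen purely for illustration. They are NOT the figures from the trial or from the presentation, so the outputs say nothing about the real case. The sketch also assumes that two natural deaths and a double murder are the only two explanations on the table.

```python
def posterior_guilt(p_two_natural_deaths, p_double_murder):
    """P(double murder | two infant deaths in the family).

    The two deaths have already happened, so the evidence has
    probability 1 under either explanation and only the relative prior
    rates of the two explanations matter. Assumes these are the only
    two possible explanations.
    """
    return p_double_murder / (p_double_murder + p_two_natural_deaths)


# Hypothetical placeholder rates, for illustration only.
p_one_natural = 1e-4         # chance of one natural infant death
p_second_given_first = 1e-2  # chance of a second, given a first (not independent)
p_double_murder = 1e-7       # prior rate of a double infant murder

# Error (a): squaring the single-death rate as if deaths were independent.
naive = posterior_guilt(p_one_natural ** 2, p_double_murder)

# Allowing for dependence between deaths in the same family.
dependent = posterior_guilt(p_one_natural * p_second_given_first, p_double_murder)
```

With these made-up inputs the naive version makes guilt look very likely while the dependent version makes it look unlikely, which is the direction of the effect described above.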

What was most shocking was that, given the elementary error in the application of statistics, no-one called him on it in court.

## Re:only in medicine (Score:3, Interesting)

I've had my name included on several 'hard science' papers that had horrible statistical assumptions. I fought, and lost, because my professor had a big grant to maintain, and nobody else understood the underlying assumptions (we used an absolute scaling function, guaranteeing that our distribution was not normal, then tried to assume that it was normal). The second half of my thesis refutes the math in the last three papers I was on. Not one single person who read it understood it, which is sad because it wasn't actually all that impressive.

The only reason I'm not completely ashamed to admit that is that the bad stats don't actually change the conclusions in this case. They do invalidate the confidence intervals, though...

The training in stats required for 'hard science' is essentially nil. Most of the hard science folks I know who are not into high-end mathematical modeling just assume a normal distribution for their data, do a bit of analysis, and publish. I was in an analytical chemistry lab, where that sort of thing normally works, and to a very high precision. However, we were working with sloppy biological assays, where being within a factor of two is a miracle. Under those conditions, you need to know a lot more statistics.

Basically, the people who know enough math are working on well defined systems and theories, and the medical and biological communities don't know much math at all, but are working on very sloppy systems that need a lot of math to analyze correctly. It is therefore easier to spot the mistakes in those communities, but don't assume they aren't there in the 'hard science' papers.