## Science and the Shortcomings of Statistics

Posted by samzenpus

from the 14%-of-people-know-that-statistics-can-prove-anything dept.

Kilrah_il writes

*"The linked article provides a short summary of the problems scientists have with statistics. As an intern, I see it many times: Doctors do lots of research but don't have a clue when it comes to statistics — and in the social science area, it's even worse. From the article: 'Even when performed correctly, statistical tests are widely misunderstood and frequently misinterpreted. As a result, countless conclusions in the scientific literature are erroneous, and tests of medical dangers or treatments are often contradictory and confusing.'"*
## Re:Long winded troll (Score:1, Insightful)

Statistics is terrible for proving things, but rather good at disproving them.

## Statistical assumptions are often ignored (Score:1, Insightful)

Statistical methods are typically developed for fairly specific mathematical models. A practitioner may err greatly by using a statistical method outside of its intended purview. For example, many statistical tests assume that different groups of observations are independent or correlated in a specific way. If this isn't true then the resulting inferences can be very inaccurate.

Unfortunately the spread of "easy to use" statistical software is making this problem worse. Many scientists just enter their data and select an analysis from a drop-down menu - thinking that just because their data is in the right format, the results will be accurate. It would be better if people had to think about what analysis to choose rather than just treating the choice of a test like the choice of a visual effect in Photoshop.

IAAS (statistician), for what it's worth...
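
To make the independence point concrete, here is a minimal simulation sketch (not from the comment above). It assumes a made-up design of 10 clusters with 10 observations each, a shared random effect per cluster, and a naive one-sample test with a normal cutoff of 1.96. The function name and every parameter are hypothetical.

```python
import random
import statistics


def naive_false_positive_rate(clustered, trials=2000, clusters=10, per_cluster=10, seed=0):
    """Fraction of trials in which a naive test rejects a true null (mean = 0)."""
    rng = random.Random(seed)
    n = clusters * per_cluster
    rejections = 0
    for _trial in range(trials):
        data = []
        for _cluster in range(clusters):
            # A shared per-cluster effect makes observations in a cluster correlated.
            shared = rng.gauss(0, 1) if clustered else 0.0
            data.extend(shared + rng.gauss(0, 1) for _obs in range(per_cluster))
        mean = statistics.fmean(data)
        # This standard error formula is only valid for independent observations.
        standard_error = statistics.stdev(data) / n ** 0.5
        # Normal approximation to the two-sided 5% cutoff.
        if abs(mean / standard_error) > 1.96:
            rejections += 1
    return rejections / trials


independent_rate = naive_false_positive_rate(clustered=False)
clustered_rate = naive_false_positive_rate(clustered=True)
print(f"independent data: {independent_rate:.3f}")
print(f"clustered data:   {clustered_rate:.3f}")
```

On independent data the naive test should reject a true null about 5% of the time. With the shared cluster effect it rejects far more often, even though the data are in exactly the same "format".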

## The problem is statisticians (Score:5, Insightful)

In other news math may not lie but people still can...

Usually (in science at least) it's not even a matter of lying. Part of the problem is that the multi-headed monster that statistics has become has a tendency to lead people to over-use numerical "answers" vomited up by stats packages, without really understanding what they are for, or how to interpret them.

Statistics are very useful for predicting certain things, but all too often they are submitted as "proof" of a given condition, which is dangerous. Sometimes we need to throw away statistics and start applying common sense.

## Re:Lies, Damned Lies, and Statistics. (Score:2, Insightful)

in the end, it's only a problem if the person listening is an idiot...

## Re:Long winded troll (Score:0, Insightful)

No it can't. The article does a fairly good job at summarizing the systematic conceptual mistake of misinterpreting a p-value as representing a probability that the hypothesis is not true, among other things, and a somewhat less good job at introducing Bayesian statistics. These are subtler issues than the true-but-trivial—and tiresome—cliché you refer to.

## Re:No surprise here (Score:5, Insightful)

You are a jerk.

You are insulting your sister because she is bad at mental math? It is a skill; one not required for extensive knowledge of the social sciences. Additionally, maybe sales tax is simple in your state, like 10%, but where I live it is 4.5%, which is not always easy to get exactly right in your head.

I had a roommate who was brilliant, funny, a singer and an artist, and yet he couldn't calculate a tip to save his life, but I certainly don't hold that against him.

## Excellent (Score:2, Insightful)

## Re:Example: Standard Deviation (Score:5, Insightful)

## Re:Personal experience (Score:5, Insightful)

...And then it struck me - most of the research I had read had applied parametric statistical tests to their data - that is, the researchers made an assumption that the underlying distribution of results would fall on a normal curve. Yet this simple assumption may be all it takes to skew the data when they should have chosen a non-parametric test instead.

So yes, stats are vitally important, badly taught, and focus too much on the maths rather than the concepts. Remember that we're doctors, not mathematicians - the last set of sums I did were in high school. If I need to analyse data, I'll probably plug it into SPSS - although now with my eyes open.

That's a good insight. I'm a statistics professor, and some of the problems I see are a) people generally get exposed to a single course in statistics; b) they're usually mathematically unprepared for it; c) so much gets squeezed into that one opportunity that heads are exploding; d) because of (a) - (c), everybody wants you to "just give 'em the formula"; e) since statistics is so widely used, there's a plethora of courses that are being taught by people who themselves are victims/products of (a) - (d), and are very happy to "just give 'em the formula"; and so f) most people plug and chug data through a stats package with no idea of the applicability, limitations, and interpretation of the results. The sheer volume of bad analyses is enough to make you weep, and contributes to the widely held perception about "lies, damned lies, and statistics". And that completely ignores the intentional falsehoods propagated by people who are trying to support various advocacy viewpoints, and will happily mislead the public with biased samples, Simpson's paradox [wikipedia.org], invalid assumptions, etc.
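
For anyone who hasn't met Simpson's paradox, here is a small sketch of it. The counts are the ones usually quoted for the kidney-stone treatment illustration of the paradox. They are not from the comment above and are used purely as an example, and the helper names are made up.

```python
# (successes, patients) for each treatment, split by stone size.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def success_rate(successes, patients):
    return successes / patients


def pooled_rate(by_size):
    """Success rate after lumping both stone sizes together."""
    successes = sum(s for s, _ in by_size.values())
    patients = sum(p for _, p in by_size.values())
    return successes / patients


for size in ("small", "large"):
    rate_a = success_rate(*counts["A"][size])
    rate_b = success_rate(*counts["B"][size])
    print(f"{size} stones: A = {rate_a:.3f}, B = {rate_b:.3f}")

pooled_a = pooled_rate(counts["A"])
pooled_b = pooled_rate(counts["B"])
print(f"pooled:       A = {pooled_a:.3f}, B = {pooled_b:.3f}")
```

Treatment A has the higher success rate within each stone size, yet B looks better once the groups are pooled, because A was mostly given to the harder (large-stone) cases. Which comparison is "right" depends on the causal story, not on the arithmetic.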

## Re:Its common knowledge (Score:4, Insightful)

And 77.335% of all statistics claim more accuracy than their expected deviation warrants.

## Re:Example: Standard Deviation (Score:2, Insightful)

As a statistics teacher (HS / Tech school level), this doesn't surprise me in the least. Statistics and statistics education have become a giant game of "plug the numbers in and damn the understanding". When a student has never calculated a standard deviation by hand, how can they be expected to know what the heck a root mean square deviation from the sample mean really is?

Going further, I would say that statistics is a tool for answering questions. Like any other tool, it works well for some jobs and not for others. So far, no problem. But the problem comes from students that are just not willing to understand the questions that statistics can answer. Case in point -- a p value of 0.05 does _not_ mean that the null hypothesis has a 95% chance of being wrong. That's what stats students want it to mean, because they are not willing to ask the questions that stats can answer.

Until students are willing to actually do the work, for the sake of actually learning, I don't see any hope.
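
The p-value point can be sketched with a quick simulation. Everything here is an assumption chosen for illustration: 10% of tested hypotheses have a real effect, the effect is 0.5 standard deviations, each study has n = 25, and the test is a two-sided z-test at the 0.05 level. The function name is hypothetical.

```python
import random


def share_of_significant_results_with_true_null(
    prior_real=0.10, effect=0.5, n=25, experiments=100_000, seed=1
):
    """Among results with p < 0.05, the fraction where the null was actually true."""
    rng = random.Random(seed)
    significant = 0
    significant_but_null = 0
    for _ in range(experiments):
        effect_is_real = rng.random() < prior_real
        # z statistic for a sample mean with known unit variance:
        # distributed N(effect * sqrt(n), 1), or N(0, 1) under the null.
        shift = effect * n ** 0.5 if effect_is_real else 0.0
        z = rng.gauss(shift, 1)
        if abs(z) > 1.96:  # two-sided p < 0.05
            significant += 1
            if not effect_is_real:
                significant_but_null += 1
    return significant_but_null / significant


fraction = share_of_significant_results_with_true_null()
print(f"significant results where the null was true: {fraction:.2f}")
```

Under these assumptions a large share of the "significant" results come from true nulls, nowhere near 5%. Change `prior_real` or `effect` and the share moves, which is the point: the p-value alone doesn't tell you the probability that the null is wrong.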

## Re:Example: Standard Deviation (Score:5, Insightful)

## Re:Example: Standard Deviation (Score:4, Insightful)

Standard deviation is what you learn very early in school. So early, in fact, that you forget the details by the time you have some serious study under your belt. Do you have any idea of the stuff you have to keep in your head to be an endocrinologist? So long as he remembers that it's a measure of variance (which he obviously does), it hardly matters whether he can explain to a mathematician how to derive it. And if OP gets off on tripping up specialists with such minutiae, it ain't the specialist who has issues.

And you are telling me that it's not his "job" to know?

YMMV, but I would prefer to visit an endocrinologist who was an expert on the subject of hormones etc. rather than stats.

## Re:Long winded troll (Score:4, Insightful)

Actually the subtler issue here has nothing to do with statistics, they are implying peer-review does not work.

"Peer review" is another of the things that has been over-sold to the public. A science research group spends six months and a hundred thousand dollars conducting a research study using highly specialised equipment. They submit a paper to an academic conference or a small journal. It gets put out to review by three people who each spend about four hours reading it and reviewing it, and who usually do not have access to the equipment or the original data that was used in the study. Do you really think we're likely to catch every mistake at review? We certainly can't check the stats (except for the most egregious errors) because we don't have the full data tables they analyzed.

Scientists actually accept that inevitably some incorrect results will be published. More often in the smaller conferences than in the most prestigious journals, but even the journals have to publish a retraction every now and then. We also accept that most studies are never repeated, and so the "objective repeatable experiment" is rarely really tested for being either objective or repeatable. However, science has long had the "many eyes" effect at work. There are hundreds of thousands of scientists reading papers and using them in our own experiments. If some theorised effect out there is wrong, usually we'll find out eventually.

## Re:What it actually said (Score:5, Insightful)

The result, repeatedly proven mathematically and by experience, is that the magic number is always Signal-to-Noise-Ratio. You can't get good information from crappy, scant, data.

Humanities and social-"science" types, and unfortunately the med school set, are by and large composed of people with varying degrees of pathological fear of mathematics, computation, and computer programming. I'd be willing to bet that a largish portion of even the post-PhD scientists who 'know' how to make a proper calculation for a statistical test don't really understand the physical meaning of the numbers they're copying and pasting in and out of excel.

When your attention and skill set are focused on looking through a microscope, or cutting up lab rats, or synthesizing chemicals, you probably never have the experience of being up to your eyeballs in noise estimates and P_FA's that bludgeon in the fact that your data really sucks because it's too noisy, and never need to answer fundamental questions like 'what's the probability that the ruskies will fire off a missile and this radar won't see it'/[insert biologically relevant example here], which *requires* learning the right way to do statistics.

## Re:Example: Standard Deviation (Score:4, Insightful)

If MDs are reading medical journals and interpreting their results, which they all are expected to do (especially those with a Board Certified Specialty like Endocrinology), then there is no excuse for them to have forgotten what the standard deviation is a measure of. They should be using the variance estimates provided in a data table to interpret the results it contains every time they read an article. If not, then they aren't worth the exorbitant fees they are charging, because critical thinking is part of a physician's job description, and accepting whatever gets published in the New England Journal of Medicine at face value is not.

I can accept forgetting the equation, but there is NO EXCUSE for forgetting that SD is a measure of variation (along with SEM, SED, and CV) as opposed to a measure of central tendency (mean, mode, median). That is something they teach you in the first week of a statistics course, and is used every subsequent class because it is so fundamental to the interpretation of statistics. If I were cytoman, I'd be looking for a new Endocrinologist.
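
For reference, a by-hand sketch of the quantities named above, on a made-up set of five readings (the data and variable names are hypothetical). It uses the sample standard deviation with the usual n - 1 divisor and checks it against Python's `statistics.stdev`.

```python
import math
import statistics

readings = [4.1, 4.8, 5.0, 5.2, 5.9]  # hypothetical measurements

n = len(readings)

# Central tendency: where the data sit.
mean = sum(readings) / n
median = statistics.median(readings)

# Variation: how spread out the data are around the mean.
squared_deviations = [(x - mean) ** 2 for x in readings]
sample_variance = sum(squared_deviations) / (n - 1)  # n - 1: Bessel's correction
sd = math.sqrt(sample_variance)   # standard deviation
sem = sd / math.sqrt(n)           # standard error of the mean
cv = sd / mean                    # coefficient of variation (unitless)

print(f"mean = {mean:.3f}, median = {median:.3f}")
print(f"SD = {sd:.3f}, SEM = {sem:.3f}, CV = {cv:.3f}")
print(f"library SD = {statistics.stdev(readings):.3f}")
```

The SD describes the spread of individual readings, while the SEM describes how uncertain the estimated mean is. The SEM shrinks as n grows; the SD does not.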

## Re:What it actually said (Score:3, Insightful)

Good summary, but I call bullshit on the article. Most of the problems you mention and the others in the article are common popular misinterpretations of statistical results, but that doesn't mean they're common mistakes made by researchers in the studies themselves. Any rookie peer-reviewer would spot them immediately if they ever make it into a manuscript.

This doesn't mean that there aren't a lot of bad statistics-based studies out there, especially in medicine. But the problems are usually much more subtle than the article implies. Standard statistical methods require many regularity and sampling assumptions to be valid, and a lot of times researchers take these assumptions for granted when even a little probing would show that they're violated. A lot of advances in recent econometrics have been in the development of robust methods (valid when standard assumptions are violated), and those advances unfortunately take a long time to filter down to the 'applied researcher' level. If you're an applied researcher, it's generally unlikely you'll use statistical advances you didn't learn as a grad student.

And frankly, I have no idea what the Frequentist/Bayesian debate has to do with any of this. To suggest that using Bayesian methods is some sort of solution for the problems listed in the article is ridiculous.

## Re:The problem is statisticians (Score:4, Insightful)

Many times the answer that "just can't be right" is; the problem comes when we "throw away the statistics" instead of figuring out why and how it gave the answer it did.

I've adopted in my life a truism I learned from my flight training: deal with things as they are, not how we would wish them to be.

In my work in network security, I often come across some oddities, which I present to management. They can present some uncomfortable episodes, and management sometimes wishes to just sweep them under the rug instead of addressing the problems. Now that we have a newly-upgraded IDS, we're seeing things that we never noticed before, and I suspect that we're going to be getting new guidelines on what is important.

I hope that's just cynicism leaking through the rum, but I've been there long enough to think it might be reality instead.

## Re:Lies, Damned Lies, and Statistics. (Score:2, Insightful)

there are statistically two popes per square kilometer in the vatican.

One does not do statistics on single data points. Statistics are for estimating results in large numbers of cases using smaller numbers of cases. When there is only one case, we use an entirely different means of reporting data. It is called a measurement. There is one vatican. Results would be reported as X per vatican. By direct measure, the answer is one pope per vatican. Your own result reflects your incorrect thinking. There is no square kilometer in one vatican. One does not use a unit of measure of area for a target smaller than that unit.

## Re:Lies, Damned Lies, and Statistics. (Score:4, Insightful)

The problem is that a lot of people believe statistics produced by an expert such as a doctor. Sir Roy Meadow [meactionuk.org.uk] had people sent to prison, and lots of children taken away from their parents, by misinterpreting statistics.

## Significance is NOT probability (Score:2, Insightful)

.. or at least not the probability of the hypothesis. This is one of the errors that people make. Having 0.95 significance does NOT imply having a 95% chance of the hypothesis being true! The significance is the probability of the test outcome assuming the hypothesis is true (in other words it is a likelihood value). You have to multiply it by a prior to obtain real probabilities.

Significance values will not even add up to 1 over the two hypotheses!

The root of the problem is that frequentists cannot use probabilities for statements -- only for events. In frequentist terms you have to have a sigma algebra over some Omega state space which is measurable. Bayesians on the other hand can talk about the probabilities of any statements using probability theory as an extension of formal logic. I really recommend reading the books of E. T. Jaynes and David MacKay.

Other false assumptions people make with statistics:

- Everything is normally distributed

- Everything has a variance

- Everything has an expected value

- Hypothesis testing is without bias (in fact it is equivalent to giving 50% prior probability to both hypotheses)

- Variance means average distance from mean

- Empirical variance does not have a variance
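
A few of the items in the list above can be poked at numerically. This is only a sketch under stated assumptions: the eight-value data set is a toy example, the Cauchy distribution stands in for "something without an expected value or variance", and the sample sizes and seeds are arbitrary.

```python
import math
import random
import statistics

# 1) "Variance means average distance from mean": the standard deviation
#    is not the mean absolute deviation, even on tidy data.
data = [2, 4, 4, 4, 5, 5, 7, 9]
mean = statistics.fmean(data)
population_sd = statistics.pstdev(data)
mean_abs_dev = statistics.fmean([abs(x - mean) for x in data])
print(f"SD = {population_sd:.2f}, mean absolute deviation = {mean_abs_dev:.2f}")


# 2) "Everything has an expected value": the standard Cauchy distribution
#    has none, so its sample mean need not settle down as n grows.
def cauchy_sample(rng):
    """Draw from a standard Cauchy via the inverse-CDF method."""
    return math.tan(math.pi * (rng.random() - 0.5))


rng = random.Random(0)
for n in (100, 10_000, 100_000):
    sample_mean = statistics.fmean([cauchy_sample(rng) for _ in range(n)])
    print(f"Cauchy sample mean, n = {n}: {sample_mean:.3f}")

# 3) "Empirical variance does not have a variance": it is itself a random
#    quantity. Here: 1000 samples of size 10 from a standard normal.
rng = random.Random(2)
sample_variances = [
    statistics.variance([rng.gauss(0, 1) for _ in range(10)])
    for _ in range(1000)
]
spread = statistics.stdev(sample_variances)
print(f"SD of the sample variance across repeats: {spread:.2f}")
```

With only 10 observations per sample, the estimated variance wobbles a lot around the true value of 1, which is worth remembering whenever a single variance estimate from a small study is taken at face value.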

## Re:Example: Standard Deviation (Score:3, Insightful)

Except, if you had read this story, you would have found the antidepressant = placebo story to be incorrect due to poor statistical reasoning:

"Another concern is the common strategy of combining results from many trials into a single “meta-analysis,” a study of studies. In a single trial with relatively few participants, statistical tests may not detect small but real and possibly important effects. In principle, combining smaller studies to create a larger sample would allow the tests to detect such small effects. But statistical techniques for doing so are valid only if certain criteria are met. For one thing, all the studies conducted on the drug must be included — published and unpublished. And all the studies should have been performed in a similar way, using the same protocols, definitions, types of patients and doses. When combining studies with differences, it is necessary first to show that those differences would not affect the analysis, Goodman notes, but that seldom happens. “That’s not a formal part of most meta-analyses,” he says.

Meta-analyses have produced many controversial conclusions. Common claims that antidepressants work no better than placebos, for example, are based on meta-analyses that do not conform to the criteria that would confer validity."

## Re:Example: Standard Deviation (Score:2, Insightful)

Have you ever looked at the sheer amount of knowledge that doctors have to know (and actually do know)? Yes they are learning baby physics and baby chemistry. We have physicists and chemists to do the non-baby physics and chemistry. You could also say the same thing for other sciences. I know friends who've taught physics to chemistry students and that was baby-physics which the chemists struggled to understand. Similarly I've had to learn chemistry for my physics degree and have forgotten almost everything about it; that didn't prevent me from getting a PhD in physics.

I wonder how much of the physics or chemistry out of your field of expertise you still remember.

Back to the topic of doctors, a lot of the stuff that doctors do is purely knowing things, but they need to do a lot of it. They don't necessarily know exactly how a drug works, they just know when to give that drug. So a lot of their work could be done with a very big flowchart, except for the fact that quite a lot is actually observation not just what you tell them.

## Re:Example: Standard Deviation (Score:4, Insightful)

There's a reason why you keep getting modded up and those disagreeing with you keep getting modded down.

You're exactly right. Modern diagnostic medicine is predicated on interpreting statistical studies to make diagnoses. It is practically incompetence for a practicing medical doctor to not know what standard deviation means.

## Re:Lies, Damned Lies, and Statistics. (Score:1, Insightful)

But a real correlation (i.e. not a fluke) does imply causation, it just doesn't tell you where the causation is coming from.

## Re:The use and abuse of statistics. (Score:5, Insightful)

Has it ever been demonstrated that social scientists have a worse understanding of statistics than physical scientists? I ask because my observations are the opposite. The physical scientists run a t-test and declare the matter resolved (significant or not-significant). Given the complexities of social sciences, these scientists check the assumptions required to use a test (e.g., normality) and have a good understanding of the statistics involved. (The obligatory exception is statistical genetics: physical science with a solid statistical basis.)

## Re:There are lies, damn lies... (Score:3, Insightful)

## Re:only in medicine (Score:2, Insightful)

Give some examples. I mean, real, specific examples of mathematical practices or mathematical theories that are invalid and why they are such. Based on what you said, my suspicion is you are basing your claim on a smattering of slashdot comments and no understanding of any of the physics you are referring to. Several points give you away:

1) You speak of physics but your two vague examples are (I'm guessing because your description is almost unrecognizable) renormalization theory, and string theory. You, and many others besides, forget that many of physics' sub-disciplines are not directly concerned with the former, and almost no one outside of high energy physics is involved in the latter. In other words, your examples leave out the bulk of physics being done.

2) Renormalization theory involves demonstrating that apparent divergences will exactly cancel. You do not just discard them. There was a saying that was popular in the 50's when people were developing the mathematical foundations for it: "Just because it is infinite, does not mean it is zero!". It was an extremely important milestone when Freeman Dyson showed in the early 50's that all such divergences - obeying certain, explicit criteria - occurring in quantum electrodynamics were renormalizable. In case you weren't paying attention, Dyson was a mathematician. In the following decades a lot of work was done to explore the mathematical properties of renormalizable theories, contrary to your assertion.

Now many theories are not - in the strict mathematical sense - renormalizable. In these cases, cutting off divergences is physically meaningful (condensed matter physics, where matter is discrete at small length scales), or physicists actively and openly discuss and search for ways to formulate theories that possess no divergences or are strictly renormalizable. One may also ask, what if the correct theory is *not* renormalizable? In other words, what if our theory, while mathematically sound, is physically inaccurate (which is the opposite of the bizarre paradigm you suggest)? This is something actively discussed (and even widely assumed) in the search for new physics, but if true, the effects are too small to be currently detectable. In other words, we are back to discarding things because they are small, which is standard practice.

3) String theory - which again, is actually a very small part of physics - is actually almost entirely mathematical, which you concede. The mathematics is fine; the question is what, if anything, does it actually mean? Your criticism makes no sense here - are you suggesting by having math taking over the physics, the math becomes bad?

4) You put accurate in quotes, as if to suggest it was a dubious claim. This is disingenuous - in fields where a physicist is liable to claim this, it is demonstrably true; theories are able to predict many constants (such as the magnetic moment of the electron) to experimental precision. Many general, quantitative phenomena that are predicted as a result of the mathematics have been experimentally verified. (BCS superconductivity, Bose-Einstein condensates, Aharonov-Bohm effect, quantum Hall effect, etc).

5) More generally physics has often been less than mathematically rigorous as new theories are developed and refined. Calculus - the basis for Newtonian physics - was not put on firm mathematical footing until the 19th century. And even then the intuitive form of calculus that Newton and Leibniz were thinking of was not formally developed until the 1960's (nonstandard analysis). Part of the maturation of physical theories is the introduction of mathematically rigorous foundations.

Seriously, make some specific claims rather than casting blanket aspersions. What physical theories today lack rigorous mathematical underpinning that physicists ignore?

## Re:What it actually said (Score:1, Insightful)

I'm not a PhD and what you said is blatantly obvious to me from my 2 years in biology. That said, I've seen PhD's with 30 years experience engaged in magical thinking when it comes to numbers. It really does take an engineer to fully appreciate "garbage in, garbage out."

To give you insight into some of the things one does in biology, take a machine that you have only a very very basic understanding of. (Let's say, you only know it spits out a piece of paper that says "0.452" when you put something red in it). Now create an application and make it do something useful without killing anybody. The end result is, you'll get a thing which you still know almost nothing about, but can be statistically shown to be safe and give you the desired result under stringent conditions. That is until it doesn't for reasons which you may never completely understand. A radar is easy by comparison, because it isn't a black box. You know all the bits, you know how the bits are connected, and you should be able to replicate conditions easily. I can't exactly open up a cell and see everything that is going on.

## Re:Summery? (Score:2, Insightful)

No idea, but there must be a law about people assuming that editing Slashdot signature doesn't affect posts made previous to the edit.

## Re:Example: Standard Deviation (Score:3, Insightful)

I believe that forgetting something usually means you never really understood it. I don't think that if you really understand something, you'll ever forget it. There are things I do rather rarely yet I don't forget them because I understand them and could re-derive them from first principles.

Usually if I forget something, it means I never quite understood it in the first place.

I think that real understanding implies almost indefinite retention, and lack of retention can usually be explained by lack of understanding.

It's very easy to forget something if all you know about it is a memorized definition and an equation or two.

If you use statistical terms and concepts daily, you should be able to explain them after being woken up in the middle of your sleep, after several hours of partying with lots of booze. Anything less probably means you're acting things out rather than understanding them.

Feynman often talked about the issue of real understanding. One could summarize his view thusly: if you cannot explain it to a non-specialist of decent intelligence, you probably don't understand it.

## Re:What are you measuring? (Score:3, Insightful)