
## Science and the Shortcomings of Statistics

Kilrah_il writes "The linked article provides a short summary of the problems scientists have with statistics. As an intern, I see it many times: Doctors do lots of research but don't have a clue when it comes to statistics — and in the social science area, it's even worse. From the article: 'Even when performed correctly, statistical tests are widely misunderstood and frequently misinterpreted. As a result, countless conclusions in the scientific literature are erroneous, and tests of medical dangers or treatments are often contradictory and confusing.'"


• #### Lies, Damned Lies, and Statistics. (Score:5, Informative)

on Wednesday March 17, 2010 @09:31PM (#31518344)

In other news, math may not lie but people still can, all the honesty and good statistics in the world don't help end-user stupidity, and there are statistically two popes per square kilometer in the Vatican.

• #### Re: (Score:3, Informative)

Also, statistics are often manipulated to suggest correlations where there are none.
• #### Re: (Score:2, Insightful)

valid correlations are often manipulated to suggest causation where there is none.

in the end, it's only a problem if the person listening is an idiot...

• #### Re:Lies, Damned Lies, and Statistics. (Score:5, Funny)

on Wednesday March 17, 2010 @10:08PM (#31518596)
Exactly. I would never believe a statistic that I did not make up myself!
• #### Re:Lies, Damned Lies, and Statistics. (Score:4, Insightful)

on Thursday March 18, 2010 @01:35AM (#31519612) Homepage Journal

The problem is that a lot of people believe statistics produced by an expert such as a doctor. Sir Roy Meadow [meactionuk.org.uk] had people sent to prison, and lots of children taken away from their parents, by misinterpreting statistics.

• #### Re: (Score:3, Interesting)

I saw a fascinating presentation by an eminent professor of physics on what Meadow did wrong. It boiled down to misapplying Bayes' theorem. Meadow had got an extremely high probability of the accused being guilty out of it. What the professor did was point out that (a) the probability put in for the chance of two babies dying couldn't be obtained by simply multiplying the chance of one dying by itself, as the events may not be independent, and (b) that would be rather moot because the accused's chance of having 2 dead ba
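Point (a) above, that squaring a single-event probability silently assumes independence, can be sketched in a few lines. This is only an illustration of the arithmetic: both probabilities below are hypothetical and are not the figures used in the Meadow cases.

```python
# Hypothetical probabilities, for illustration only.
p_first = 1 / 8000              # assumed chance of one unexplained infant death
p_second_given_first = 1 / 100  # assumed chance of a second death GIVEN a first
                                # (shared genetics and environment raise it)

# Treating the two deaths as independent: multiply the chance by itself.
p_both_if_independent = p_first * p_first

# Allowing for dependence: P(A and B) = P(A) * P(B | A).
p_both_if_dependent = p_first * p_second_given_first

ratio = p_both_if_dependent / p_both_if_independent

print(f"assuming independence: 1 in {1 / p_both_if_independent:,.0f}")
print(f"allowing dependence:   1 in {1 / p_both_if_dependent:,.0f}")
print(f"the independence assumption understates the chance {ratio:,.0f}-fold")
```

How large the gap is depends entirely on the assumed conditional probability, which is exactly the number the naive multiplication never asks about.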

• #### Re:Lies, Damned Lies, and Statistics. (Score:4, Funny)

on Wednesday March 17, 2010 @10:03PM (#31518566) Homepage

As with everything, xkcd delivers [xkcd.com]. My personal favorite :)

People often get caught assuming that Correlation == Causation.

• #### Re:Lies, Damned Lies, and Statistics. (Score:5, Funny)

on Thursday March 18, 2010 @05:55AM (#31520688)

Indeed. For example: 6 out of 7 dwarves aren't Happy.

• #### The problem is statisticians (Score:5, Insightful)

on Wednesday March 17, 2010 @10:01PM (#31518544)
In other news math may not lie but people still can...

Usually (in science at least) it's not even a matter of lying. Part of the problem is that the multi-headed monster that statistics has become has a tendency to lead people to over-use numerical "answers" vomited up by stats packages, without really understanding what they are for, or how to interpret them.

Statistics are very useful for predicting certain things, but all too often they are submitted as "proof" of a given condition, which is dangerous. Sometimes we need to throw away statistics and start applying common sense.
• #### Re:The problem is statisticians (Score:5, Interesting)

on Wednesday March 17, 2010 @10:35PM (#31518792)

Actually, one of the most dangerous uses of statistics is exactly predicting with them inappropriately. Curve fitting is especially prone to this error- attempting to make any predictions outside of the central mass of the points used to *produce* the curve is completely bogus, and yet people do it all the time.
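A minimal sketch of the extrapolation problem described above, assuming a made-up process that levels off and a straight line fitted only to its early, rising part. The function `true_process` and all its constants are hypothetical.

```python
import math

def true_process(x):
    # Hypothetical saturating process: rises quickly, then levels off near 10.
    return 10 * (1 - math.exp(-x / 5))

def fit_line(xs, ys):
    # Ordinary least squares for y = slope * x + intercept.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Fit only on observations with x between 0 and 10.
xs = list(range(0, 11))
ys = [true_process(x) for x in xs]
slope, intercept = fit_line(xs, ys)

def predict(x):
    return slope * x + intercept

# x=5 sits in the middle of the fitted data; x=50 is far outside it.
for x in (5, 50):
    print(f"x={x}: fitted line says {predict(x):.1f}, true value is {true_process(x):.1f}")
```

Inside the fitted range the line is a tolerable approximation; far outside it the line keeps climbing while the underlying process has long since flattened out.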

• #### Re: (Score:3, Funny)

Here's a good example (credits to Nassim Taleb and his "The Black Swan" book) on the risks of extrapolation (of which curve fitting is one method):
- Based on previous experience, a turkey will confidently predict that he will wake up every morning, be fed during the day, and go to sleep in the evening. He can easily extrapolate this from the fact that it has happened every day of his life. At some point before Christmas this turkey is going to have a big surprise ...

• #### Re:The problem is statisticians (Score:4, Funny)

on Thursday March 18, 2010 @01:38AM (#31519624) Homepage Journal

I feel somewhat vindicated for being no good at econometrics when I see where the people who were good at it have landed us.....

• #### Re: (Score:2)

While we're at it, stay away from hospitals! Most people in civilised countries die there rather than anywhere else!

• #### Re: (Score:3, Funny)

In other news, researchers proved that water causes cancer. 100% of the cancer patients that died in 2009 drank water regularly.

• #### Re: (Score:3, Funny)

Not only that, but it is also the key ingredient in most of today's problems. It's the core element of acid rain, it's a main ingredient in beer and many other alcoholic beverages that cause families to break apart, you find it in fattening food and it is the main ingredient in all high carbon soft drinks.

Consuming that stuff might also lead to antisocial behaviour, as it has been confirmed that all murderers, gunmen and even terrorists have consumed it pretty much all their life. When are we going to ban t

• #### Summery? (Score:5, Funny)

on Wednesday March 17, 2010 @09:40PM (#31518394)

It's not just statistics that people have a problem with...

• #### Re: (Score:2)

I think it's meant to give you a nice, warm feeling just like on those hot summery days.
• #### Re:Summery? (Score:4, Funny)

on Wednesday March 17, 2010 @10:18PM (#31518672)

What's that law about spelling/grammar corrections inevitably having spelling or grammar mistakes in them?

• #### Re: (Score:3, Informative)

That would be Muphry's law [wikipedia.org].

For details on Muphry's law, click on the above hyperlink. For more fun laws, click on the below hyperlink.

More fun here. [wikipedia.org]

• #### Re:Summery? (Score:5, Informative)

on Thursday March 18, 2010 @02:37AM (#31519830)
And what's the law about spelling/grammar corrections that incorrectly correct the supposed spelling error? (Redundancy is purposefully deliberate.) "Its" is possessive. "It's" is a contraction of "it" and "is". -- This has been a message from your friendly neighborhood Spelling Nazi.
• #### Example: Standard Deviation (Score:5, Interesting)

on Wednesday March 17, 2010 @09:42PM (#31518402)
My doctor was explaining to me that my blood sugar readings should not have a standard deviation of more than 1/3rd of the average blood sugar reading. Just to test if he knew what it meant, I asked him what a standard deviation was. Oh the fun when he tried to bullshit his way out of that one! He eventually told me that when I plot my data in Excel I can ask it to give me statistics on the column and it would mention what the standard deviation value was. But when I pressed on and asked him what a standard deviation is, he shooed me off and told me to go look it up. Never did he confess that he had no clue.
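For anyone who got the same non-answer: a minimal sketch of what the standard deviation is, computed by hand and checked against the doctor's "no more than 1/3 of the average" rule of thumb. The readings are hypothetical numbers made up for illustration.

```python
import math
import statistics

# Hypothetical blood sugar readings in mg/dL, made up for illustration.
readings = [95, 110, 102, 130, 88, 121, 99, 105]

mean = sum(readings) / len(readings)

# Sample standard deviation: take each reading's deviation from the mean,
# square it, sum the squares, divide by n - 1, and take the square root.
squared_deviations = [(r - mean) ** 2 for r in readings]
sd = math.sqrt(sum(squared_deviations) / (len(readings) - 1))

# SD relative to the mean (the coefficient of variation).
ratio = sd / mean

print(f"mean = {mean:.1f}, sd = {sd:.1f}, sd/mean = {ratio:.2f}")
print("within the 1/3 rule" if ratio <= 1 / 3 else "exceeds the 1/3 rule")

# The hand calculation agrees with the library's sample standard deviation.
assert math.isclose(sd, statistics.stdev(readings))
```

Roughly speaking, the standard deviation is the typical distance of a reading from the average, so the rule of thumb says the readings should not swing by more than about a third of their own mean.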
• #### Re: (Score:2, Insightful)

As a statistics teacher (HS / Tech school level), this doesn't surprise me in the least. Statistics and statistics education have become a giant game of "plug the numbers in and damn the understanding". When a student has never calculated a standard deviation by hand, how can they be expected to know what the heck a root mean square deviation from the sample mean really is?

Going further, I would say that statistics is a tool for answering questions. Like any other tool, it works well for some jobs and not

• #### Re:Example: Standard Deviation (Score:5, Interesting)

on Wednesday March 17, 2010 @10:43PM (#31518848)

Doctors are notoriously bad with statistics. But the real kings of bad statistics are psychiatrists.

Notice how a LOT of studies in psychiatry are essentially statistics, statistics and a bit of statistics? It might be the reason why a lot of the courses you have to pass to become a shrink also consist of a lot of statistics, statistics... you get the idea.

NOBODY who decides that his course of studies would be psychiatry decided on that because he enjoys statistics that much, though. Actually, most psych students struggle badly with statistics. Psychiatry is one of the fields where the label doesn't match the contents. It looks like you're going to do a lot of messing with people's minds (aka "solving their psychology problems") but actually, judging from the courses, you become a refined statistician who had a bit of counseling tutoring on the side.

That's not what people become shrinks for, though. They want to sit in their office, put people on their couch (or, more modern, in a comfy chair) and get 100 bucks an hour for listening to some idiot whine. And most do just that and will do fine.

It gets bizarre when they somehow end up in a spot where they have to rely on their statistics. Hey, you got a masters in that, and that entails a buttload of statistics, so you can do it... Nobody really cares that 9 out of 10 somehow managed to get their diploma by either learning what they absolutely needed (and forgetting it right after the test, certain that they'd never need it again, because ... ya know, listening to idiots and stuff, not sitting there plotting standard deviations...) or by cribbing altogether.

And then you get studies of the usefulness of psychotropic drugs and wonder whose black hole they pulled that out of...

• #### Re: (Score:3, Interesting)

And then you get studies of the usefulness of psychotropic drugs and wonder whose black hole they pulled that out of...

Indeed. Normally I would never cite an article in a McNews magazine like Time or Newsweek, but I found this explanation of the state of antidepressant drug efficacy to be one of the best I've run across so far - hundreds of billions of dollars all depending on some really, really bad math. It's like the collateralized debt securities of the drug & psychiatric industries:

http://www.newsweek.com/id/232781 [newsweek.com]

• #### Re: (Score:3, Insightful)

Except, if you had read this story, you would have found the antidepressant = placebo story to be incorrect due to poor statistical reasoning:
"Another concern is the common strategy of combining results from many trials into a single “meta-analysis,” a study of studies. In a single trial with relatively few participants, statistical tests may not detect small but real and possibly important effects. In principle, combining smaller studies to create a larger sample would allow the tests to d

• #### Re:Example: Standard Deviation (Score:4, Informative)

on Thursday March 18, 2010 @12:05AM (#31519280)

You're mixing up psychiatrists, psychologists and psychotherapists.
A psychiatrist went to med school, got a doctor's degree and specialized in problems with the brain. A psychologist went to university to study the behavior of people. This involves a lot of statistics and many of them probably do consider it something they didn't go to college for, but it's a field that is supposed to follow the scientific method and prepare students for doing research, not therapy.

A psychotherapist is anyone who feels like calling themselves that. As a preparation they may have studied psychology at university, or they may have spent 20 years meditating in the Himalayas, or followed a short course at a religious group such as an institute of multiple personality disorder therapists or scientology.

• #### Re:Example: Standard Deviation (Score:4, Interesting)

on Wednesday March 17, 2010 @10:44PM (#31518852)
I agree with your concerns. Being a chemical engineer and a physical scientist, I have often found medical doctors' understanding of chemistry and other sciences lacking. I once had an argument with a doctor about the chemical kinetics involved in a prescription drug I was taking; he basically told me I didn't know what I was talking about and blew me off. After another run-in with him over another issue I fired him. But that's just one of my personal issues with a doctor.

Back when I was in graduate school, my colleagues in graduate science and I taught pre-med chemistry and physics, which was a really watered-down version of the chemistry and physics taught to engineers and science majors. To be honest I thought it was kind of scary. All these years I was taught that medical students were supposed to be the best and the brightest, but we spoon-fed them "baby chemistry" and "baby physics".

Since that time I have had many discussions with professors about this and they and I have come to the same conclusion, "the best and the brightest do not go into medical school". Thirty or forty years ago this may have been true, but economics has taken a turn and it just isn't the case anymore.

And why would they? They can make more money on Wall Street, they don't have to hassle with the bureaucracy of health insurance, they don't have to hassle with lawyers, so why would the best and brightest go into medicine?

And you want to know what kind of income a hot little girl with a business degree can get? Pharmaceutical sales can pay 6 figures for one good figure. So the next time you see that good looking girl pulling that bag through your doctor's office, realize she is probably making a lot of money. More money than the average general practitioner.
• #### It's a tough situation (Score:2)

Actually, it's a tough situation. No real-life experimental data can 100% fit the assumptions of commonly used statistical models. Real-life data is messy, so there is always some degree of simplification. In addition, the results of whiz-bang fancy methods that "fit" the real data may not be easily interpretable. Ease of interpretation is what medical scientists want. There are other issues as well, such as computing time, whether the equations can be derived, etc.

In addition, many many medical scientists use stat

• #### Two weeks of six sigma classes... (Score:3, Funny)

on Wednesday March 17, 2010 @09:52PM (#31518476) Journal
Our company six sigma training included two weeks of collecting and analyzing data with a stats package. I got enough experience to even train me how to use the program. I can still do a few things that come up regularly. Probably the best thing to come out of six sigma (for me at least).
• #### Re: (Score:3, Funny)

So, you got twelve sigma-weeks of statistical training?
• #### Personal experience (Score:5, Interesting)

on Wednesday March 17, 2010 @09:53PM (#31518484)

As a doctor myself, I feel I should add my \$0.02...

Throughout med school we had the odd scattered lecture on statistics, and later when reading papers I used to skim over most of the maths just to look for the P value at the end (one representation of how statistically significant a result is).

However, I then took a formal stats course and was amazed at how little I understood - Monte Carlo techniques, Markov models, and even something as trivial yet important as the difference between a parametric versus a non-parametric test.

And then it struck me - most of the research I had read had applied parametric statistical tests to their data - that is, the researchers made an assumption that the underlying distribution of results would fall on a normal curve. Yet this simple assumption may be all it takes to skew the data when they should have chosen a non-parametric test instead.

So yes, stats are vitally important, badly taught, and focus too much on the maths rather than the concepts. Remember that we're doctors, not mathematicians - the last set of sums I did were in high school. If I need to analyse data, I'll probably plug it into SPSS - although now with my eyes open.

-Nano.

• #### Re: (Score:2)

And that, my friend, is why the NIH's constant push to produce more 'physician-researchers' continues to drive me nuts. Because they rarely insist K awards and other early-career training mechanisms require physicians intending to do research in areas where stats are important actually get any stats training..

• #### Re:Personal experience (Score:5, Insightful)

on Wednesday March 17, 2010 @10:24PM (#31518718)

...And then it struck me - most of the research I had read had applied parametric statistical tests to their data - that is, the researchers made an assumption that the underlying distribution of results would fall on a normal curve. Yet this simple assumption may be all it takes to skew the data when they should have chosen a non-parametric test instead.

So yes, stats are vitally important, badly taught, and focus too much on the maths rather than the concepts. Remember that we're doctors, not mathematicians - the last set of sums I did were in high school. If I need to analyse data, I'll probably plug it into SPSS - although now with my eyes open.

That's a good insight. I'm a statistics professor, and some of the problems I see are a) people generally get exposed to a single course in statistics; b) they're usually mathematically unprepared for it; c) so much gets squeezed into that one opportunity that heads are exploding; d) because of (a) - (c), everybody wants you to "just give 'em the formula"; e) since statistics is so widely used, there's a plethora of courses that are being taught by people who themselves are victims/products of (a) - (d), and are very happy to "just give 'em the formula"; and so f) most people plug and chug data through a stats package with no idea of the applicability, limitations, and interpretation of the results. The sheer volume of bad analyses is enough to make you weep, and contributes to the widely held perception about "lies, damned lies, and statistics". And that completely ignores the intentional falsehoods propagated by people who are trying to support various advocacy viewpoints, and will happily mislead the public with biased samples, Simpson's paradox [wikipedia.org], invalid assumptions, etc.
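For readers who haven't met Simpson's paradox, here is a minimal sketch using the commonly cited kidney stone treatment figures. The numbers are not from this thread, and the labels "A" and "B" are just placeholders for the two treatments.

```python
# (successes, patients) per treatment and stone size, as commonly cited
# for the kidney stone example; these figures are not from this thread.
outcomes = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(successes, patients):
    return successes / patients

def overall(treatment):
    # Pool both stone sizes together for one treatment.
    groups = outcomes[treatment].values()
    return rate(sum(s for s, _ in groups), sum(n for _, n in groups))

# Within each stone size, A has the higher success rate...
for size in ("small", "large"):
    a = rate(*outcomes["A"][size])
    b = rate(*outcomes["B"][size])
    print(f"{size} stones: A {a:.0%} vs B {b:.0%}")

# ...yet pooled over both sizes, B comes out ahead.
print(f"overall: A {overall('A'):.0%} vs B {overall('B'):.0%}")
```

The reversal comes from the uneven group sizes: A was mostly used on the harder (large stone) cases, so pooling the groups mixes the treatment effect with case difficulty.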

• #### Re:Personal experience (Score:4, Interesting)

on Wednesday March 17, 2010 @11:00PM (#31518960)
There's a certain level of historical baggage as well.

Even ten to fifteen years ago, students in Statistics courses had very little computer exposure, and that of course means any practical analyses would imply the use of approximations - hence the widespread use of chi squared tests and normal distributions for everything, whether appropriate or not.

If the statistician -> textbook -> student/scientist -> textbook -> scientist process is factored in, I have no doubt that it will take another generation or two before the old style of statistics is replaced sufficiently widely to be only a memory.

• #### Re: (Score:3, Informative)

It's the approach that you can just pump the numbers into SPSS or Statistica, and then call on a battery of tests until you get a "significant" result that results in the kind of errors the article (and a disturbing number of /. readers) fall into.
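The "battery of tests until something is significant" problem can be put in numbers with a small sketch. It assumes the tests are independent and that every null hypothesis is actually true; the function name is made up for illustration.

```python
def chance_of_false_positive(num_tests, alpha=0.05):
    # Probability that at least one of num_tests independent tests
    # comes out "significant" when every null hypothesis is true.
    return 1 - (1 - alpha) ** num_tests

for k in (1, 5, 10, 20):
    print(f"{k:2d} tests: {chance_of_false_positive(k):.0%} chance of a spurious hit")
```

Real test batteries run on the same data are rarely independent, so the exact figure will differ, but the direction is the same: the more tests you try, the less a single "significant" result means.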

Unless you're dealing with large samples, all z and t tests assume normality in the population, with insignificant skew or kurtosis. Yet by definition, if we have enough data to be sure we have a normal population, we have enough data that the central limit theor

• #### Re: (Score:3, Interesting)

I think the term "Statistics" has become so general that people don't understand how complicated it can be. People think of Statistics as Bob saying to Alice - "Get me the stats on this week's sales." Alice just goes digging around and gets Bob the total sales in \$, # of units sold .... etc. People don't understand or know of the concepts that are involved in polling; they just think the pollsters called 2,000 random people and that's it. That's statistics to the public and many college graduates.

I had to take a fe

• #### Pirates cause cool weather (Score:2)

Funny Stat correlation: http://www.seanbonner.com/blog/archives/piratesarecool.jpg [seanbonner.com]
• #### Re: (Score:2)

Careful, sir or madam, with that graph you are treading dangerously close to a theological argument here. Global Warming is the Flying Spaghetti Monster's way of telling us we need more pirates. If you want to know exactly how pirates and global warming correlate, please send money, and we will lease you an AVOM (Awesome Volt-Ohm Meter) with a blank face with which you can scare yourself until your midichlorians take over your reflexes.

"Luke Skywalker's a Jedi of course;

And he's prone to have much interco

• #### Fair and Balanced: Fox quotes the Bible as saying (Score:3, Funny)

on Wednesday March 17, 2010 @09:57PM (#31518524)

that there are only 3 kinds of scientists: those that are good at math and those that aren't.

• #### Excellent (Score:2, Insightful)

One of the best articles I've seen on stats (and their misuse). I'm taking a data analysis course at the moment and I've spent at least a dozen hours simply computing confidence intervals, testing the null hypothesis, and determining significance. It really has changed how I view statistics because it keeps pounding in these very key but oft-ignored principles.
• #### bad title (Score:5, Interesting)

on Wednesday March 17, 2010 @10:10PM (#31518608) Homepage Journal
It is not a shortcoming of statistics that other people, like various scientists who aren't statisticians, don't know how to use or properly interpret statistics. It is a shortcoming of their knowledge.

It is not a shortcoming of the Copenhagen interpretation of quantum mechanics or the Chicago school of economics if I don't understand or know how to correctly interpret their results. It is my shortcoming and fault for not knowing enough to connect the dots.

I do statistical research, some of it through interacting with researchers in the biosciences. When I go to talk to a researcher and ask them if they could use some statistical, mathematical, or computational assistance with their research, it has almost always been a fruitful starting point for long conversations and getting into the research. Sometimes it was simply a matter of looking at their F-test results or ANOVA scores and telling them what they meant (like with a regression model relating proportions of certain characteristics between taxa). More useful interactions for me often mean working on new algorithms or estimators, or fitting a model from their empirical data because there isn't a reliable standard model to work from (like intergenic distance between genes in an operon); that kind of challenge makes less engaging work worth the hassle. Maybe I'm odd because I've worked hard to have a good background in both statistics and biology, but I shouldn't be.

Here is an observation from my own experience, though, that perhaps supports some of the intent of the article. I was speaking with a biology graduate student and it came up that they had a biostatistics course in the department. Of course, as a statistician, my mind goes towards survival functions, failure rates, life tables, censored data, bioassay, epidemiology, microarrays, clinical trials, topics along those lines. It turned out their course focused on z tests, t tests, F tests, confidence intervals, point predictions, least squares regression, multiple regression, ANOVA, and things along these lines, just with simulated problems in a lab setting. That is not necessarily a bad thing, but much of the core math was underplayed or missing, like model assumptions and alternate formulations, or things like dummy variables. The worst part was that even though they were doing well in the class, they had no confidence in actually using the statistics and didn't understand how to interpret the meaning of something like a confidence interval. They knew how to calculate one, but it wasn't clear to them what it actually meant.

The corollary to the notion in the summary, I'd rant and claim, is that scientists overall have weaker skills in mathematics, statistics, and computation than those who studied those disciplines principally, and that's hurting science. However, many in those three disciplines really know little beyond basic results in any of the sciences, which hurts the applicability of these mathematical fields to the sciences and likely hurts our ability to develop certain types of discipline-specific results that can be generalized from work on applied problems.

In either case, whether you're a typical scientist or a typical math/stat/comp person, becoming proficient enough in the other areas requires going an awfully long way out of the way compared to any counterpart who simply does not care and goes straight through as many before have. In some areas of research on either side it is no problem to do as has been done and not pursue further knowledge in those other areas, but increasingly the results with the highest impact are coming from truly interdisciplinary research. To further encourage that for those who are interested in such fields (aside from making clearer which areas of each field border on such interdisciplinary work), we need more incentive to study more than one field and/or better ways of enabling fruitful cooperation between the camps.
• #### only in medicine (Score:5, Interesting)

on Wednesday March 17, 2010 @10:20PM (#31518692)
In reading a couple of these types of articles recently I've noticed that the articles always talk about this being a problem across all journals, but only seem to mention a couple of different disciplines - medicine usually chief among them. Has anyone heard/read anything naming a hard science (e.g. chemistry or physics) as full of bad stats? My hunch is that this happens most often in medicine because you have the combination of controlling for a lot of variables as well as inadequate mathematics training.
• #### Re:only in medicine (Score:4, Funny)

on Thursday March 18, 2010 @02:31AM (#31519820)

Physics (yes, Physics, THE hardest of hard sciences) is full of terrible mathematics, absolutely terrible, shockingly bad stuff. The good ones know it; some will say it doesn't matter because their butchery comes up with "accurate" results. If they can't even get their analysis right, what can we expect of the softer sciences? That said, physics is not so much concerned with statistics as it is with probability. Nonetheless, they have some serious problems; for example, they often simply decide highly non-convergent things should converge because the experiment says they should...

The greatest tragedy in modern science (in my eyes) is the loss of physics as a hard science, currently these guys are way off with the fairies and producing nothing of worth, string theorists are the worst. We'll see what the CERN guys manage to come up with, but right now the mathematicians have taken the ball and run with it. It has been said that physics has become too hard for the Physicists...

I am not trolling, I am quite serious about Physicists playing dodgy games with mathematics.

• #### Re: (Score:3, Interesting)

I've had my name included on several 'hard science' papers that had horrible statistical assumptions. I fought, and lost, because my professor had a big grant to maintain, and nobody else understood the underlying assumptions (we used an absolute scaling function, guaranteeing that our distribution was not normal, then tried to assume that it was normal). The second half of my thesis refutes the math in the last three papers I was on. Not one single person who read it understood it, which is sad because

• #### Bad outcomes due to statistics? (Score:2)

From TFA:

“There is increasing concern,” declared epidemiologist John Ioannidis in a highly cited 2005 paper in PLoS Medicine, “that in modern research, false findings may be the majority or even the vast majority of published research claims.”

One has to wonder, though: how much of that is due to misuse of statistics and how much is because it's paid research expected to get certain results in favour of those paying for the research?

• #### The problem is with statistics itself (Score:3, Informative)

on Wednesday March 17, 2010 @10:57PM (#31518946)
I see a lot of posts bashing people for being idiots, and I'm sure that's often the case, but IMHO there are some big problems with statistics itself.
• The most common school is the "classical" school, which is extremely counterintuitive. For instance, most people think that if a 95% confidence interval is 5 to 10, then the parameter has a 95% chance of being between 5 and 10. This would be true with Bayesian statistics, but exactly backwards for classical statistics. For classical statistics, it's that the procedure that produced your 5 to 10 interval yields intervals that land around the parameter 95% of the time! This is a subtle difference that most statisticians don't even understand, and it screws up almost everyone. Furthermore, the classical statement is much less useful than the intuitive statement that people think it is.
• Relatedly, other schools which make more sense, such as Bayesianism and likelihoodism, aren't taught. Furthermore, nonparametric statistics are usually not taught to undergrads (unless they are statistics majors, probably). In the real world, non-parametric statistics are often more useful because no parametric model is actually true (for instance, basic regression assumes that the Truth is in your model, and it almost never is).
• Finally, a lot of statistics as it is normally taught depends on the central limit theorem. Any result that depends on the central limit theorem (or the law of large numbers) is often useless in real applications due to data poverty. The basic reason is that the average of i.i.d. random variables only converges to a normal distribution as 1/sqrt(n). Everyone knows this, and it's obvious that something that converges as 1/sqrt(n) is much, much slower than the typical 1/n convergence, but people still rely on the central limit theorem.

Statistics is changing slowly (mostly because computers and R make non-classical statistics more practical) but the way it's taught still leads to problems.
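The classical reading of a 95% confidence interval from the first bullet can be checked by simulation. This is a rough sketch under simplifying assumptions: normally distributed data, a known standard deviation, and hypothetical values for the true mean and sample size.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical setup: the "true" parameter is fixed; only the samples vary.
TRUE_MEAN, SIGMA, N, TRIALS = 7.5, 2.0, 25, 2000
Z_95 = 1.96  # normal critical value; sigma is treated as known here

def interval_covers_truth():
    # Draw one sample, build one 95% interval, and check whether it
    # happens to contain the fixed true mean.
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    center = statistics.fmean(sample)
    half_width = Z_95 * SIGMA / N ** 0.5
    return center - half_width <= TRUE_MEAN <= center + half_width

coverage = sum(interval_covers_truth() for _ in range(TRIALS)) / TRIALS
print(f"{coverage:.1%} of the intervals contained the true mean")
```

The 95% is a property of the procedure over many repeated samples: the parameter never moves, and any single interval either contains it or does not.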

• #### Looking for a good book on statistics (Score:4, Interesting)

on Thursday March 18, 2010 @01:37AM (#31519618) Homepage

I'm interested in learning the essentials of statistics. What would be a good book to start me out?

I got The Manga Guide to Statistics [nostarch.com] and it did introduce me to the very basics. However, there are many places where it just gives you an equation, without deriving it or even explaining it. After reading this book, I now know how to calculate standard deviation, but I'm still a bit vague on how people actually use it. I would like to see some examples of how people use statistics in (for example) science experiments.

My ideal book would explain the basics, with examples, and show how the math works. Ideally it wouldn't be a thousand pages long, either, but that's a secondary consideration.

steveha

• #### Re: (Score:3, Informative)

Devore's Probability and Statistics for Engineering and the Sciences is probably the best one-volume, undergrad-level intro to statistics out there. Get a copy (I think it's on the sixth or seventh edition now; you can pick up a fifth edition for cheap) and work your way through that, and you'll have a pretty good idea of where all those formulae come from and how they're used. Get a copy of R [r-project.org] and check out the "Devore*" packages in the package list [r-project.org] too. If you want to learn more after that, I recommend

• #### MY common conversation (Score:5, Informative)

on Thursday March 18, 2010 @09:29AM (#31522600) Homepage Journal

The largest demographic in American prisons is black Americans. Real statistic, but is it true?

Given a particular sample that indicates blacks are 60% of the prison population this would appear to be true.

But what if I said: "The largest demographic in prison is minority, non-whites." Suddenly the % jumps from 60% (black) to 80% (minority). Which is more right? This is the problem with statistics. Context.

Now I can say readily that the largest demographic in prison is actually right-handed people. The % now jumps to 90%.

But wait! There is more! The largest demographic in prison is actually people who, prior to arrest, were below the poverty line, which jumps to 99% of the population. Again, all of the above are accurate based on a sample, but which is MORE correct? Linear Algebra is coming into play here quickly....

When that kind of issue comes into play, it is the classic "Correlation != Causation" confusion. Are the majority of people in prison there because of being black? Being a minority? Being right-handed? Or being poor? None of the above. The majority of them are in there because they were convicted of a crime and sentenced. That is the causation of their imprisonment; the rest is correlation, which may have a direct causal effect on the conviction or sentencing, but no direct causal effect on being in prison. (e.g. You cannot be thrown into prison for being poor, black, a minority, or right-handed.)

Same with medical research, politics, economics, etc. Take the price of oil rising 10% and a subsequent 5% drop in shipping orders. Measuring the significance of regressors is important, but oddly it is almost never reported. Many factors get masked or shadowed by higher-level regressors (e.g. being a minority masks a variety of other social and economic factors. In addition, it can distort statistical work by being too broad. Asians face a different set of economic and social factors than North American blacks, who in turn differ even from African immigrants.)

Back to the original subject:

We can take 100 prisoners and 100 non-prisoners and figure out rather quickly if being black is statistically significant in the prison population. In the non-prison population, blacks would account for 25%-45% of the population (depending on location). We can see that 60% of prisoners are black. There is a 20+% deviation from the norm. We can test to see the significance of that. Same with minorities. We also quickly find that being right-handed is insignificant because it doesn't deviate from the norm. We can test left-handed and right-handed populations and rule out the handedness of a convict being significant.
We can find that economic status is considerably MORE significant than minority or black as a status. We can determine that the reason minorities or blacks are disproportionately prevalent in prison is that blacks and minorities have higher rates of poverty. We can extract and determine the statistical weight of POVERTY in regard to imprisonment (since we find a high % of whites in prison who are poor compared to the normal population). Once we figure that out, we can remove that and continue the investigation to figure out what weight minority and black have once we have removed POVERTY from the model (residual analysis).
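The "test to see the significance of that" step can be sketched with a two-proportion z-test. This is only one reasonable choice of test, not necessarily the one the poster had in mind. The function name `two_proportion_z_test` is made up for illustration, and the 35% baseline is an assumed midpoint of the 25%-45% range quoted above; the 60% and 90% figures are the illustrative ones from this comment, not real data.

```python
import math

def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value).

    Does not handle the degenerate case where the pooled proportion is 0 or 1.
    """
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Illustrative figures only: 60 of 100 prisoners vs an assumed 35 of 100 non-prisoners.
z_group, p_group = two_proportion_z_test(60, 100, 35, 100)
# Handedness: 90 of 100 in both samples, so there is no deviation from the norm.
z_hand, p_hand = two_proportion_z_test(90, 100, 90, 100)

print("group:      z = %.2f, p = %.4f" % (z_group, p_group))
print("handedness: z = %.2f, p = %.4f" % (z_hand, p_hand))
```

With samples of 100 each, a 25-point gap comes out clearly significant while the handedness comparison shows nothing, which is the contrast drawn above. A significant z says nothing about why the gap exists.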

The problem in reporting is that without providing the whole, comprehensive analysis you can miss important things. For instance, when trying to correct injustice in sentencing, without reporting the weight POVERTY has in contrast to BLACK or MINORITY you may lose sight that you may have better success addressing POVERTY to normalize sentencing rather than MINORITY or BLACK (or not).
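A toy sketch of why the full breakdown matters, using stratification as a stand-in for "removing POVERTY from the model". Every count below is invented for illustration, and the numbers are deliberately constructed so that imprisonment depends only on poverty; real data could just as well show a gap that survives the adjustment (the "or not" above). The group labels "A" and "B" and the function names are likewise hypothetical.

```python
import math

# Hypothetical counts, invented for illustration:
# (group, is_poor) -> (population, imprisoned)
counts = {
    ("A", True): (600, 60),
    ("A", False): (400, 4),
    ("B", True): (200, 20),
    ("B", False): (800, 8),
}

def crude_rate(group):
    """Imprisonment rate for a group, ignoring poverty."""
    population = imprisoned = 0
    for (g, _is_poor), (n, k) in counts.items():
        if g == group:
            population += n
            imprisoned += k
    return imprisoned / population

def stratum_rate(group, is_poor):
    """Imprisonment rate for a group within one poverty stratum."""
    n, k = counts[(group, is_poor)]
    return k / n

# Reported alone, the crude rates suggest a large group effect.
print("crude:", crude_rate("A"), crude_rate("B"))
# Within each poverty stratum, the two groups have the same rate.
for is_poor in (True, False):
    print("poor" if is_poor else "not poor",
          stratum_rate("A", is_poor), stratum_rate("B", is_poor))
```

In this constructed example the crude rates differ by more than a factor of two, yet the difference vanishes once poverty is held fixed. A report that publishes only the crude comparison would point at the wrong lever.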

The same happens in medical research. Given a cocktail of drugs, without having the whole analysis you may end up providing more of Medicine A versus B but lose sight that A & B are limited by the dosage of Medicine C.

Statistics are not bullshit, rather merely observations with no intrinsic agenda or even implication of truth. Purely amoral, like a handgun: useful to both the good and the evil.

Statistics don't lie, nor do they tell the truth. They simply show the relationship of the data as it stands. The Truth or Truthiness of it is subjective and vulnerable to context.

"Why should we subsidize intellectual curiosity?" -Ronald Reagan
