austinpoet writes in with a blog post debunking the theory we discussed a few days back that scientists' beer consumption is linearly correlated with the quality of their work. Chris Mack, Gentleman Scientist and beer drinker, has analyzed the paper and found it is severely flawed. From his analysis:

*"The discovered linear relationship between beer consumption and scientific output had a correlation coefficient (R-squared) of only about 0.5 — not very high by my standards, though I suspect many biologists would be happy to get one that high in their work... Thus, the entire study came down to only one conclusion: the five worst ornithologists in the Czech Republic drank a lot of beer."*

To find a relationship in which the independent variable (beer consumption) has an optimal value, you would have to run the regression on a transformation of that variable. Perhaps a quadratic term would suffice, or else abs(X - k) for some unknown value of k.
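
As a rough sketch of what that could look like (not the paper's method, and not Mack's), here is a comparison of a straight-line fit, a quadratic fit, and a fit on abs(X - k) with k chosen by grid search. The data are synthetic and invented purely for illustration; `true_k`, the noise level, the grid, and the helper names are all assumptions.

```python
import numpy as np

def r_squared(y, y_hat):
    """Fraction of the variance in y explained by the fitted values y_hat."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_poly_r2(x, y, degree):
    """Least-squares polynomial fit of the given degree; returns its R^2."""
    coeffs = np.polyfit(x, y, degree)
    return r_squared(y, np.polyval(coeffs, x))

def fit_abs_deviation(x, y, k_grid):
    """Regress y on abs(x - k) for each candidate k; keep the best k by R^2."""
    best_k, best_r2 = None, -np.inf
    for k in k_grid:
        r2 = fit_poly_r2(np.abs(x - k), y, 1)
        if r2 > best_r2:
            best_k, best_r2 = k, r2
    return best_k, best_r2

# Synthetic data (an assumption, NOT the study's data): "output" peaks at
# a hypothetical optimal consumption true_k and falls off on either side.
rng = np.random.default_rng(0)
true_k = 4.0
x = rng.uniform(0, 10, 200)
y = 10 - 1.5 * np.abs(x - true_k) + rng.normal(0, 0.5, 200)

linear_r2 = fit_poly_r2(x, y, 1)
quadratic_r2 = fit_poly_r2(x, y, 2)
k_hat, abs_r2 = fit_abs_deviation(x, y, np.linspace(0, 10, 101))

print(f"linear R^2:     {linear_r2:.3f}")
print(f"quadratic R^2:  {quadratic_r2:.3f}")
print(f"abs(X - k) R^2: {abs_r2:.3f} at k = {k_hat:.1f}")
```

On data like these, a straight line will miss most of the structure even though the relationship is strong, which is the reason to transform the variable first. Note that picking k by grid search uses up an extra degree of freedom, so the abs(X - k) fit's R^2 is somewhat optimistic.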

If, on the other hand, he means a correlation coefficient of r = .5, then R^2 = .25. Even so, a quarter of the variance in "work quality" would be explained by beer drinking, and that is still very high.
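
For anyone who wants to check the r versus R^2 arithmetic: in a simple one-predictor linear regression, R^2 is just the square of the Pearson correlation r. A minimal sketch on made-up data (the sample size, seed, and target correlation of about .5 are assumptions chosen only for illustration):

```python
import numpy as np

# Made-up data with a population correlation of 0.5 between x and y.
rng = np.random.default_rng(1)
x = rng.normal(size=500)
y = 0.5 * x + rng.normal(scale=np.sqrt(0.75), size=500)

# Pearson correlation coefficient r.
r = np.corrcoef(x, y)[0, 1]

# R^2 from an ordinary least-squares straight-line fit.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)
r2 = 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)

print(f"r = {r:.3f}, r squared = {r ** 2:.3f}, regression R^2 = {r2:.3f}")
```

The last two printed numbers agree up to floating-point error, so a reported "0.5" means very different things depending on whether it is r or R^2.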

His points about the outlying ornithologists and about the data points not being independent may still be valid; whether they are is an empirical matter. Do these outlying scientists, in fact, socialize together? What other sources of nonindependence might there be, and do they affect THIS data set? Also, should we really claim that 5 out of 34 (15% of the sample!) constitute OUTLIERS? Those aren't outliers, those are a subpopulation.

He didn't debunk the study; rather, he raised some interesting questions.

