Mozilla Plan Seeks To Debug Scientific Code 115
ananyo writes "An offshoot of Mozilla is aiming to discover whether a review process could improve the quality of researcher-built software that is used in myriad fields today, ranging from ecology and biology to social science. In an experiment being run by the Mozilla Science Lab, software engineers have reviewed selected pieces of code from published papers in computational biology. The reviewers looked at snippets of code up to 200 lines long that were included in the papers and written in widely used programming languages, such as R, Python and Perl. The Mozilla engineers have discussed their findings with the papers’ authors, who can now choose what, if anything, to do with the markups — including whether to permit disclosure of the results. But some researchers say that having software reviewers looking over their shoulder might backfire. 'One worry I have is that, with reviews like this, scientists will be even more discouraged from publishing their code,' says biostatistician Roger Peng at the Johns Hopkins Bloomberg School of Public Health in Baltimore, Maryland. 'We need to get more code out there, not improve how it looks.'"
Send In The Codes (Score:0, Interesting)
Isn't it rich?
Are we a pair?
Me here at last on the ground,
You in mid-air.
Send in the codes.
Isn't it bliss?
Don't you approve?
One who keeps tearing around,
One who can't move.
Where are the codes?
Send in the codes.
Just when I'd stopped
Opening doors,
Finally knowing
The one that I wanted was yours,
Making my entrance again
With my usual flair,
Sure of my lines,
No one is there.
Don't you love farce?
My fault, I fear.
I thought that you'd want what I want -
Sorry, my dear.
And where are the codes?
Quick, send in the codes.
Don't bother, they're here.
Isn't it rich?
Isn't it queer?
Losing my timing this late
In my career?
And where are the codes?
There ought to be codes.
Well, maybe next year . . .
Provide a tool then BUTT OUT (Score:0, Interesting)
Yes Mozilla. BUTT OUT!!! Your coders are not scientists. Provide a code review tool like Findbugs and perhaps offer to assist pre-publication, but don't start spreading your "way of doing things" which puts off your own users. Scientists have enough to deal with
The Horror (Score:3, Interesting)
Re:Wrong objective. (Score:4, Interesting)
As a PhD student I am actively encouraged to reproduce results, mostly this has been possible but I know of at least one paper which has been withdrawn because my supervisor queried their results after we failed to reproduce them (I'll be charitable and say it was an honest mistake on their part).
I guess whether you are encouraged to check others work depends on your university and subject, but in certain areas it Does happen.
Egoless programming (Score:2, Interesting)
Back in the late 70s middle ages of comp sci...
There was this thing called "egoless programming" being taught. The idea being that we have to inculcate in developers the idea that your code is not necessarily a reflection of your personal worth, and that it deserves to be poked at and prodded, and that you should not take personal offense by it.
Yeah, it's a child of the 60s kind of thing, but it does work.
This is a huge challenge in the biomedical research field, because to be successful, you need personality traits like a strong ego (yes, *I* am brilliant, and my idea is the best, and you should fund it, and not that other bozo).
Re: Wrong objective. (Score:5, Interesting)
The Other Edge of the Sword (Score:5, Interesting)
Roger Peng's comment shows a typical, superficial understanding of programming. Ironically, he would be the first to condemn a computer scientist/coder who ventured in to biostatistics with a superficial knowledge of biology. I believe he would feel that anyone can program, but not anyone can do biostatistics. And I deeply disagree. Tools have been provided so that _any_ scientist can code. That does not mean that they understand coding or computer science.
I have personally experienced that especially in the softer sciences like biology, economy, meteorology, etc., the scientists have absolutely no desire to learn any computer science: coding methodology, testing, complexity, algorithms, etc. The result is kludgy, inefficient code heavily dependent on pre-packaged modules, that produces results that are often a guess; the code produces results but with a lack of any understanding of what the various packaged routines are doing or whether they are appropriate for the task. For example, someone using default settings on a principal component analysis package not understanding that the package expects the user to have pre-processed the data; the output looks fine but it is wrong. It is the same as someone approaching engineering without some understanding of thermodynamics and as a result wasting their time trying to construct a perpetual motion machine.