
Mozilla Plan Seeks To Debug Scientific Code 115

Posted by Soulskill
from the unit-tests-are-for-undergrads dept.
ananyo writes "An offshoot of Mozilla is aiming to discover whether a review process could improve the quality of researcher-built software that is used in myriad fields today, ranging from ecology and biology to social science. In an experiment being run by the Mozilla Science Lab, software engineers have reviewed selected pieces of code from published papers in computational biology. The reviewers looked at snippets of code up to 200 lines long that were included in the papers and written in widely used programming languages, such as R, Python and Perl. The Mozilla engineers have discussed their findings with the papers’ authors, who can now choose what, if anything, to do with the markups — including whether to permit disclosure of the results. But some researchers say that having software reviewers looking over their shoulder might backfire. 'One worry I have is that, with reviews like this, scientists will be even more discouraged from publishing their code,' says biostatistician Roger Peng at the Johns Hopkins Bloomberg School of Public Health in Baltimore, Maryland. 'We need to get more code out there, not improve how it looks.'"

  • Send In The Codes (Score:0, Interesting)

    by Anonymous Coward on Wednesday September 25, 2013 @12:22AM (#44944645)

    Isn't it rich?
    Are we a pair?
    Me here at last on the ground,
    You in mid-air.
    Send in the codes.

    Isn't it bliss?
    Don't you approve?
    One who keeps tearing around,
    One who can't move.
    Where are the codes?
    Send in the codes.

    Just when I'd stopped
    Opening doors,
    Finally knowing
    The one that I wanted was yours,
    Making my entrance again
    With my usual flair,
    Sure of my lines,
    No one is there.

    Don't you love farce?
    My fault, I fear.
    I thought that you'd want what I want -
    Sorry, my dear.
    And where are the codes?
    Quick, send in the codes.
    Don't bother, they're here.

    Isn't it rich?
    Isn't it queer?
    Losing my timing this late
    In my career?
    And where are the codes?
    There ought to be codes.
    Well, maybe next year . . .

  • by Anonymous Coward on Wednesday September 25, 2013 @12:26AM (#44944663)

    Yes Mozilla. BUTT OUT!!! Your coders are not scientists. Provide a code review tool like Findbugs and perhaps offer to assist pre-publication, but don't start spreading your "way of doing things", which puts off your own users. Scientists have enough to deal with.

  • The Horror (Score:3, Interesting)

    by Vegemite (609048) on Wednesday September 25, 2013 @01:26AM (#44944911) Homepage
    You must be joking. Many scientific papers out there have results based on prototype or proof-of-concept software written by naive grad students for their advisors. These are largely uncommented hacks with few, if any, sanity checks. To sell these prototypes commercially, I have had to clean up after some of these grads. I take great sadistic pleasure in throwing out two years of effort and rewriting it all from scratch in a couple of weeks.
  • Re:Wrong objective. (Score:4, Interesting)

    by Anonymous Coward on Wednesday September 25, 2013 @04:42AM (#44945553)

    As a PhD student I am actively encouraged to reproduce results. Mostly this has been possible, but I know of at least one paper that was withdrawn after my supervisor queried its results when we failed to reproduce them (I'll be charitable and say it was an honest mistake on the authors' part).

    I guess whether you are encouraged to check others' work depends on your university and subject, but in certain areas it does happen.

  • Egoless programming (Score:2, Interesting)

    by Anonymous Coward on Wednesday September 25, 2013 @06:11AM (#44945947)

    Back in the late 70s, the middle ages of comp sci...
    There was this thing called "egoless programming" being taught. The idea was to inculcate in developers the notion that your code is not necessarily a reflection of your personal worth, that it deserves to be poked at and prodded, and that you should not take personal offense at it.

    Yeah, it's a child of the 60s kind of thing, but it does work.

    This is a huge challenge in the biomedical research field, because to be successful, you need personality traits like a strong ego (yes, *I* am brilliant, and my idea is the best, and you should fund it, and not that other bozo).

  • Re: Wrong objective. (Score:5, Interesting)

    by old man moss (863461) on Wednesday September 25, 2013 @06:30AM (#44946061) Homepage
    Yes, totally agree, as someone who has tried to reproduce other people's results (in the field of image processing) with mixed success. It can be incredibly time-consuming trying to compare techniques that appear to be described accurately in journals but omit "minor" details of implementation that actually turn out to be critical. I have also had results of my own which seemed odd and were ultimately due to coding errors which inadvertently improved the result. Given the opportunity, I would have published all my academic code.
  • by fygment (444210) on Wednesday September 25, 2013 @09:35AM (#44947333)

    Roger Peng's comment shows a typical, superficial understanding of programming. Ironically, he would be the first to condemn a computer scientist/coder who ventured into biostatistics with a superficial knowledge of biology. I believe he would feel that anyone can program, but not just anyone can do biostatistics. And I deeply disagree. Tools have been provided so that _any_ scientist can code. That does not mean that they understand coding or computer science.

    I have personally experienced that, especially in the softer sciences like biology, economics, meteorology, etc., the scientists have absolutely no desire to learn any computer science: coding methodology, testing, complexity, algorithms, etc. The result is kludgy, inefficient code, heavily dependent on pre-packaged modules, that produces results that are often a guess; the code produces output, but with no understanding of what the various packaged routines are doing or whether they are appropriate for the task. For example, someone runs a principal component analysis package on its default settings without understanding that the package expects the user to have pre-processed the data; the output looks fine but it is wrong. It is the same as someone approaching engineering without some understanding of thermodynamics and as a result wasting their time trying to construct a perpetual motion machine.
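
    To make the PCA point concrete, here is a minimal sketch in plain Python. It assumes that "pre-processed" means centering and scaling each column, which is only one common requirement, and the measurements and helper names (mass_g, length_m, first_component, standardize) are made up for illustration; they come from no particular package.

```python
import math
from statistics import mean, pstdev


def first_component(xs, ys):
    """Return the unit-length first principal axis of two variables.

    Uses the closed-form principal-axis angle of the 2x2 covariance matrix.
    """
    mx, my = mean(xs), mean(ys)
    n = len(xs)
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return math.cos(theta), math.sin(theta)


def standardize(values):
    """Center to zero mean and scale to unit standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


# Hypothetical measurements: two correlated traits in very different units.
mass_g = [1200.0, 1500.0, 1100.0, 1800.0, 1400.0, 1600.0]
length_m = [0.30, 0.36, 0.33, 0.41, 0.31, 0.40]

# Raw input: the column with the larger numeric variance dominates.
raw_pc1 = first_component(mass_g, length_m)

# Standardized input: both columns contribute on an equal footing.
scaled_pc1 = first_component(standardize(mass_g), standardize(length_m))

print("loadings on raw data:         ", raw_pc1)
print("loadings on standardized data:", scaled_pc1)
```

    On the raw numbers the first component loads almost entirely on the mass column, simply because grams produce bigger numbers than metres. After standardizing, the two loadings come out equal. Both runs finish without any warning, which is exactly the "looks fine but is wrong" trap.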
