Mozilla Plan Seeks To Debug Scientific Code
ananyo writes "An offshoot of Mozilla is aiming to discover whether a review process could improve the quality of researcher-built software that is used in myriad fields today, ranging from ecology and biology to social science. In an experiment being run by the Mozilla Science Lab, software engineers have reviewed selected pieces of code from published papers in computational biology. The reviewers looked at snippets of code up to 200 lines long that were included in the papers and written in widely used programming languages, such as R, Python and Perl. The Mozilla engineers have discussed their findings with the papers’ authors, who can now choose what, if anything, to do with the markups — including whether to permit disclosure of the results. But some researchers say that having software reviewers looking over their shoulder might backfire. 'One worry I have is that, with reviews like this, scientists will be even more discouraged from publishing their code,' says biostatistician Roger Peng at the Johns Hopkins Bloomberg School of Public Health in Baltimore, Maryland. 'We need to get more code out there, not improve how it looks.'"
Wrong objective. (Score:5, Insightful)
I don't know the actual objective ... but if the concern is "We need to get more code out there, not improve how it looks" ... the objective is bad.
Shouldn't this be about catching subtle logic / calculation flaws that lead to incorrect conclusions?
Agree ... if this is about indenting and which method of commenting ... then yeah ... bad idea.
But this has the possibility of being so much more. I would see it as free editing by qualified people. Seems like a deal.
Re: (Score:2)
Exactly. If the code they are writing looks like bad PHP from 10 years ago then it needs to be exposed.
What is needed is more *good quality* code being published.
Re:Wrong objective. (Score:5, Insightful)
I think that's exactly the opposite of the point the GP was trying to make.
If it looks like bad PHP from 10 years ago but contains no bugs, then that is completely okay.
If it looks like old COBOL strung together with GO TO's and it works, it's okay.
If it looks like perfect C++ code but contains bugs, the bugs need to be exposed, especially so if the research results are based on the output of the code.
Re: (Score:3, Informative)
Re: (Score:2)
All of which are great if code is to be maintained, which this type of code rarely is.
None of which affects whether the code actually works.
Re: (Score:1)
Re: (Score:3)
All of which are great if code is to be maintained, which this type of code rarely is.
Not always true, probably not by a long shot. I'm maintaining code written over a span of time beginning in the 1980's (not by me) and last updated yesterday (and again as soon as I'm done here...). Some written very well, some quite the opposite. Not often is scientific code used for just one project, if it's of any significant utility.
Re: (Score:3)
All of which are great if code is to be maintained, which this type of code rarely is.
Or if it is re-used, which is one of the potential benefits of publishing it alongside the paper.
Also, since the purpose of research papers is to transmit ideas, clear, readable code serves readers much better than functional but opaque code... and that assumes the code is actually functional. Ugly code tends to be buggier, precisely because it's harder to understand.
Re:Wrong objective. (Score:4, Insightful)
If it looks like bad PHP from 10 years ago but contains no bugs, then that is completely okay.
If it looks like old COBOL strung together with GO TO's and it works, it's okay.
If it looks like perfect C++ code but contains bugs, the bugs need to be exposed, especially so if the research results are based on the output of the code.
None of the above. It's scientific code. It looks like bad Fortran (or even worse, FORTRAN) from 20 years ago, which is ok, since Fortran 90 is fine for number crunching.
In all seriousness, my experience is that "Ph.D. types" (for want of a better term) write some of the most amateurish code I've ever seen. I've worked with people whose knowledge and ability I can only envy, and who are anything but ivory tower types, but who write code like it was BASIC from a kindergartener (OK, today's kindergarteners probably write better code than in my day). Silly things like magic numbers instead of properly defined constants (and used in multiple places, no less! See the sketch below), cut-and-paste instead of creating functions, hideous control structures for even simple things. Ironically, this is despite the fact that number-crunching code generally has a simple code structure and simple data structures. I think bad code is part of the culture or something. The downside is that it makes bugs more likely and the code very difficult to modify.
Realistically, this is because they're judged on their results and not their code. To many people here, the code is the end product, but to others it's a means to an end. Better scrutiny of it though would lead to more reliable results. It should be mandatory to release the entire program within, say, 1 year of publication. As for it being obfuscated, intentionally or otherwise, I don't think there's much you can do about that.
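For illustration, here's a minimal Python sketch of the magic-number problem; the constant and formulas are just an example, not from any real project:

    # Before: the same unexplained 0.0821 is pasted into several formulas.
    def pressure_bad(n, t, v):
        return n * 0.0821 * t / v

    def volume_bad(n, t, p):
        return n * 0.0821 * t / p

    # After: one named constant, defined and documented once.
    R_GAS = 0.0821  # ideal gas constant, in L*atm/(mol*K)

    def pressure(n_mol, temp_k, volume_l):
        return n_mol * R_GAS * temp_k / volume_l

    def volume(n_mol, temp_k, pressure_atm):
        return n_mol * R_GAS * temp_k / pressure_atm

If the constant ever needs more precision, the fix happens in one place instead of a hunt through every formula that pasted it.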
Re:Wrong objective. (Score:4, Informative)
The problem is most papers do not publish the code, only the results. This causes dozens of problems: if you want to run their code on a different instance, you can't; if you want to run it on different hardware, you can't; if you want to compare it with yours, you only sort of can, since you have to either reimplement their code or run yours in a different environment than theirs, which makes comparisons difficult. Oh, and it makes verifying the results even harder, but it isn't like many people try to verify anything.
On the one hand catching bugs can help find a conclusion was wrong sooner than it would happen otherwise. On the other hand it may make it less likely that authors will put their code out there. Anyhow, I think it's a good idea and worth a shot. Who knows, maybe it'll end up helping a lot.
Re: Wrong objective. (Score:4, Insightful)
Well, running the ORIGINAL author's code isn't that important.
What's important is the analysis that the code was supposed to do.
Describing that in mathematical terms and letting anyone trying to replicate the research write their own implementation is better than handing the original code forward. That's just passing another potential source of error along.
Most of the (few) research projects I've been called to help with coding on are strictly package runners. Only one had anything approaching custom software, and it was a mess.
Re: Wrong objective. (Score:5, Insightful)
I have to disagree. Before I go to a heap of effort reproducing your experiment, I want to check that the analysis you ran was the one you described in your paper. After I've convinced myself that you haven't made a mistake here, I may then go and try your experiment on new data, hopefully thereby confirming or invalidating your claims. Indeed, by giving me access to your code you can't then claim that I have misunderstood you if I do obtain an invalidating result.
Re: Wrong objective. (Score:5, Interesting)
Re: (Score:2)
You had the opportunity. You could have put your code, and notes on how to use it, in an appendix to your papers.
Re: (Score:1)
I agree. Code is math, and thus part of the experiment and analysis, not just an interpretation of it. "Duplicate it yourself" stands against the very idea of review and reproduction.
While there is tremendous utility in an independent reconstruction of an algorithm (I have numerous times built a separate chunk of code to calculate something in a completely different way, to test against the real algorithm/code; in practice they debug each other), the actual code needs to be there for review.
They may have a des
Re:Wrong objective. (Score:5, Insightful)
Yeah, it seems like the real objective should be to get more code read and verified as part of the scientific process. (Just "getting more code out there" and expecting it to go unread would be pretty empty.)
One problem is that the publish-or-perish process has gotten sufficiently corrupt that many results are irreproducible, PhD students are warned against trying to reproduce results, and everyone involved has lost the expectation that their work will be experimentally double-checked.
Re:Wrong objective. (Score:4, Interesting)
As a PhD student I am actively encouraged to reproduce results. Mostly this has been possible, but I know of at least one paper which was withdrawn because my supervisor queried its results after we failed to reproduce them (I'll be charitable and say it was an honest mistake on their part).
I guess whether you are encouraged to check others' work depends on your university and subject, but in certain areas it does happen.
Looking over the shoulder (Score:3)
Re: (Score:3)
Re: (Score:2)
In all fairness that's an easy mistake to make, because ^ means exponentiation in other languages. It's an historical stupidity, like the fact that log() is the natural log, not log10().
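A minimal Python illustration of both traps (the numbers are arbitrary):

    import math

    # In R, 10 ^ 2 is exponentiation; in Python and C, ^ is bitwise XOR.
    print(10 ^ 2)    # 8, the XOR of the bit patterns 1010 and 0010
    print(10 ** 2)   # 100, what the R expression 10 ^ 2 actually meant

    # Likewise, log() is the natural log; base 10 must be asked for.
    print(math.log(100))    # 4.605..., i.e. ln(100)
    print(math.log10(100))  # 2.0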
Re: (Score:2)
Frequently. It's not supposed to be their main area of expertise and they often learn just enough to solve their immediate problem. And why should they learn more? So occasionally they make blunders like that, but a professional computer programmer wouldn't know what problem to code or what analysis needs to be done in the first place. That's what the scientists are good at.
Re: (Score:3)
Scientists and researchers generally write lousy code. If you think TheDailyWTF is bad, you haven't seen researcher code.
Generally write-only, lots of copy-pasta going on, variables that *might* make sense (and probably declared globally) and if you're
Re: (Score:2)
Ph.D. dissertations require original research. However, assigned classwork for Doctor's and Master's students would be improved if it involved replication and re-analysis of recent research in the field, to study methods of data collection and analysis. This would make replication and reexamination of recent research a routine part of academia. The benefits for the students would be seeing how other researchers do their work, practice at methods of analysis, and occasionally the satisfaction of showing that a published result doesn't hold up.
Re: (Score:2)
That is such a great idea. Wish it would happen.
Re: (Score:2)
Re: (Score:2)
I don't know the actual objective ... but if the concern is "We need to get more code out there, not improve how it looks" ... the objective is bad.
Shouldn't this be about catching subtle logic / calculation flaws that lead to incorrect conclusions?
Agree ... if this is about indenting and which method of commenting ... then yeah ... bad idea.
But this has the possibility of being so much more. I would see it as free editing by qualified people. Seems like a deal.
That's one of two worthy objectives. The other is to make the code more suitable for use by other researchers.
Re: (Score:2)
Re: (Score:2)
Yes Mozilla. BUTT OUT!!! Your coders are not scientists. ... Scientists have enough to deal with
Scientists have enough to deal with ... like buggy code? RTFA. It causes real problems, and I have no use for the "we're specialists, you couldn't possibly help us" attitude (often it's espoused to hide problems).
Would you trust a chemist who didn't know the proper practices for working in a chem lab? If not, why should you trust someone doing computational chemistry problems who doesn't know how to code? It's too easy to fall for the "how hard could this be" syndrome. For example, the time Richard Feynman
Hell Yes! (Score:5, Insightful)
Re:Hell Yes! (Score:5, Insightful)
Problem is, at least in this trial they're reviewing already published code, when it's too late for the original writer to gain much benefit from the review. A research project is normally time-limited, after all; by the time the paper and data are public, the project is often done and people have moved on.
There's nobody with the time or inclination to, for instance, create and release a new improved version of the code at that point. And unless there are errors which lead to truly significant changes in the analysis, nobody would be willing to publish any kind of amended analysis either.
Re: (Score:1)
There is a reason that models have to be validated. If you choose validation cases well, a code that passes them will almost certainly be a good model. Beyond that, you do the best you really can, and that's that.
Otherwise, here, I've got 40k lines of code here, anyone want to check it over for me? This is free of charge, right?
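For what it's worth, a validation case can be tiny. Here's a minimal sketch in Python (the integrator, names, and tolerance are made up for the example): check a numerical routine against a problem with a known analytic answer before trusting it on real data.

    import math

    def trapezoid(f, a, b, n=10000):
        # Composite trapezoid rule on [a, b] with n panels.
        h = (b - a) / n
        interior = sum(f(a + i * h) for i in range(1, n))
        return h * (0.5 * (f(a) + f(b)) + interior)

    # Validation case: the integral of sin(x) over [0, pi] is exactly 2.
    assert abs(trapezoid(math.sin, 0.0, math.pi) - 2.0) < 1e-6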
Re: (Score:2)
Where do I sign up? If I could get a "code reviewed by third party" stamp on my papers, I'd feel a lot better about publishing the code and the results derived from it.
Believe it or not, some computer science programming language conferences are doing *just that*.
http://cs.brown.edu/~sk/Memos/Conference-Artifact-Evaluation/
http://ecoop13-aec.cs.brown.edu/
http://splashcon.org/2013/cfp/665
What is Mozilla? (Score:2)
Re: (Score:1)
A tiddlywinks ballroom, two vending machines and a build-a-squirrel online project. Apparently they have made some attempt at an internet browser too.
they do SeaMonkey, a better browser than Firefox (Score:2)
What else do they do, you ask? They support SeaMonkey, Firefox's older brother. Firefox began as a stripped-down, lightweight, minimalist version of SeaMonkey. Though Firefox is no longer lightweight, SeaMonkey is still more capable in some respects. The suite includes an email client and a WYSIWYG editor, but I just like the browser.
While Firefox is controlled by the Mozilla Foundation, SeaMonkey is community-driven now, with hosting and other support from the foundation.
Not technical tho (Score:2)
any review may find off-by-one, etc. (Score:3)
Having ANY second programmer look at the code may well find off-by-one or fence post errors and the like.
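For example, here's the classic fencepost mistake in Python (the data are made up), the kind that's nearly invisible to the person who wrote it but jumps out at a reviewer:

    samples = [3.0, 1.0, 4.0, 1.0, 5.0]
    n = len(samples)

    # Off by one: a 1-based habit skips the first sample entirely.
    wrong = sum(samples[1:n]) / (n - 1)   # 2.75

    # Correct: Python slices are 0-based, so take all n samples.
    right = sum(samples[0:n]) / n         # 2.8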
Re: (Score:2)
Mozilla is a bit like Apache: it's a broad tent of vaguely related projects, not just Firefox.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
they fracking suck at writing clean bug free code, and suck just as much at reviewing it
Then how come the browser I'm using right now works pretty well?
Software architecture (Score:1)
The overall structure of most of the code in HEP [1] is nasty. It's too late for the likes of ROOT [2]: input from software engineers at the early stages of code design could be very useful.
1. https://en.wikipedia.org/wiki/Particle_physics
2. https://en.wikipedia.org/wiki/Root.cern
Mozilla needs to improve their own code. (Score:1, Flamebait)
Mozilla barely has control of their own code base. The number of open bugs keeps increasing. Attempts to multi-thread the browser failed. The frantic release schedule results in things like the broken Firefox 23, where panels in add-ons just disappeared off screen. They have legacy code back to Netscape 1, and it's crushing them. Firefox market share is declining steadily. Not good.
Crappy tax dodgers review science papers now? (Score:1)
See subject line. I don't know what the hell qualifies Mozilla to review scientific code. For one thing, scientific code in academic papers is proof-of-concept - it's designed to show how to implement something according to the description in the paper, not engineered for general deployment.
The "blah blah, we need more people" counterargument is bollocks, however - there are enough people in computational biology doing utterly pointless things.
Perhaps Mozilla's looking for another way to justify its on-going tax avoidance.
Don't forget spreadsheets (Score:5, Informative)
As we've seen recently, bad decisions can be made from errors in spreadsheets. We need these published so they can be double-checked as well.
Re: (Score:1)
They should be publishing their code because the basic precept behind peer-reviewed publishing is that results can be reproduced. Most of the time they are not, but computational scientists need to be constantly reminded that they are performing experiments; not publishing the code is exactly the same as a synthetic chemist not including an experimental section (the procedure for the synthesis).
Re: (Score:2)
If I made some approximation or used an algorithm that may fall apart in some limits, that is worth mentioning.
Uh, huh. And what if you don't realize that your code has subtle failings that may have significantly altered your results? Anyone trying to reproduce your results but doing it right will fail, but be unable to explain why their results differed. Without your code peer review of your work is both harder and less valuable.
Unless deterring review is the researcher's intent, of course.
Re: (Score:2)
I can write that I convolved two functions, but you don't need to see the code that I used to do the convolution.
So you used a standard library for doing the convolution, cited that library correctly, and showed how you called the library? That would be very good academic programming and paper-writing too. Of course, the flip side also holds: if you don't show your methods properly, or don't cite others' work that you use or reference, you're a bad academic. If you do it all yourself when much of it isn't your research focus, you're just wasting your time (and encouraging others to ignore you).
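To sketch what that looks like in practice (numpy.convolve is a real routine; the signal here is made up), the methods section can then state exactly which routine and mode were used:

    import numpy as np

    signal = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0])
    kernel = np.ones(3) / 3.0  # 3-point moving average

    # Cite as, e.g.: "smoothed with numpy.convolve, mode='same', NumPy x.y".
    smoothed = np.convolve(signal, kernel, mode="same")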
Re: (Score:3)
Oh, If only you knew...
Well, I wouldn't go so far as publishing my findings, but now I always double-check spreadsheets when I'm not sure if it is or isn't a ladyboy.
Re: (Score:2)
As we've seen recently, bad decisions can be made from errors in spreadsheets.
For that problem, let's just get rid of spreadsheets (at least as they're implemented in most programs). Copy-and-paste is the standard way to do the same computation in several places. How much further could you get from good practice? Reviewing the "code" requires peering at every cell. Etc., etc., etc. Lastly, the people who use them are often idiots who have no idea what they're doing. At least if you made them use a programming language, they'd never get it to run. That way they couldn't pretend that they know what they're doing.
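To make the contrast concrete, a minimal Python sketch (the numbers are invented): in code, the set of values being averaged is explicit and reviewable, whereas a spreadsheet buries it in a cell formula like =AVERAGE(L30:L44) that can silently omit rows.

    growth_by_country = {"A": 2.1, "B": 1.7, "C": -0.3, "D": 3.2, "E": 0.8}

    # Every value in the dictionary is included; leaving one out
    # would show up in a diff, not hide inside a cell.
    mean_growth = sum(growth_by_country.values()) / len(growth_by_country)
    print(round(mean_growth, 2))  # 1.5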
Hypocritical (Score:1)
Mozilla better work on de-bloating its own code first.
Get used to it (Score:2)
Re: (Score:2)
Re: (Score:1)
Re: (Score:1)
Correction: a scientist that doesn't want to improve source quality isn't a scientist.
Some can argue that they don't have time or budget to do so, but flat-out not wanting to is a failure of the process itself. It's not someone you want to trust to make predictions from data.
Oh dear (Score:2)
Though back in the day I did make one guy's code a bit more user-friendly (his original comment was "I don't need any prompts to remind me what I need to type"), as we had scaled up to 1:1 models and a single run of the rig could cost £20k in materials.
The Horror (Score:3, Interesting)
Re: (Score:2)
I have had to cleanup after some of these grads. I take great sadistic pleasure in throwing out two years of effort and rewriting it all from scratch in a couple of weeks.
Of course it's a lot easier and quicker to re-write someone's code when you already know what you're aiming at.
If we knew what we were doing... (Score:3)
Too heavy Mozilla drives Mac users to Chrome (Score:1)
I agree, but FYI: (Score:2)
MAC (all-caps) - Media Access Control address, a hexadecimal identifier assigned to individual pieces of hardware on a network
Mac - marketing name for the longstanding "Macintosh" line of computers by Apple
I've used Firefox since it first came out, but it's so damned bloated with unneeded 'extras' that I only stick with it because it's the one browser that allows extensions like AdBlock Plus to block outgoing server requests, not just hide the results. I had defected over to Opera for several months, but when they d
Re: (Score:2)
FWIW Safari allows extensions to block the requests before they're made as well, although the exact mechanism may be different.
Re: (Score:2)
Absolutely necessary ... (Score:1)
Most of my colleagues at the university are terrible coders, and I am often not even sure how much I trust their results. Even if it does scare people, there has to be more awareness of code review in the scientific field than there is today.
Good intentions, bad implementation (Score:2)
My experience was a real eye-opener. Between the buffer overruns and logic holes, I am amazed the crap ran at all. The fact that it compiled was a bit of a mystery, until I realized that it was possible to ignore compile errors.
Re: (Score:2, Insightful)
This is a logical fallacy that many 'smart' people fall into: I am smart (in this case usually PhDs or people on their way to one), so this XYZ thing should be no sweat. They seem to forget that they spent 10-15 years becoming very good at whatever they do. Becoming a master of it. Yet somehow they also believe they can apply this mastery to other things. In some very narrow cases you can do this. But many times you cannot. Or, even worse, assuming no one else can understand what you are doing or they wi
Egoless programming (Score:2, Interesting)
Back in the late 70s middle ages of comp sci...
There was this thing called "egoless programming" being taught. The idea being that we have to inculcate in developers the idea that your code is not necessarily a reflection of your personal worth, and that it deserves to be poked at and prodded, and that you should not take personal offense at it.
Yeah, it's a child of the 60s kind of thing, but it does work.
This is a huge challenge in the biomedical research field, because to be successful, you need personal
Re: (Score:2)
Re: (Score:2)
The idea being that we have to inculcate in developers the idea that your code is not necessarily a reflection of your personal worth, and that it deserves to be poked at and prodded, and that you should not take personal offense at it.
Wusses and namby-pambies. I take the opposite approach. Three or more bugs found in your code results in summary execution, with your corpse hung from the flagpole as a reminder to others.
We need more code out there (Score:2)
Science is so passe! (Score:1)
Faith is where it's at! Looking at "science" journals is like looking at internet pron- it's a one way ticket to H-E-double hockeysticks! You need some proper churchin'!
researcher vs. software developer (Score:5, Informative)
People doing scientific research and software developers are really doing very different things when they write code. For software developers or software engineers, the code is the end goal. They are building a product that they are going to give to others. It should be intuitive to use, robust, produce clear error messages, and be free of bugs and crashes. The code is the product. For someone doing scientific or engineering research, the end goal is testing an idea or running an experiment. The code is a means to an end, not the end itself; it needs only to support the researcher, it only needs to run once, and it only needs to be bug-free in the cases that are being explored. The product is a graph or chart or sentence describing the results that is put into a paper that gets published; the code itself is just a tool.
When I got my Ph.D. in the 1990s, I didn't understand this, and it brought me a lot of grief when I went to a research lab and interacted with software developers and managers, who didn't understand this either. The grief comes about because of the different approaches used during the development of each type of code. Software developers describe their process variously as a waterfall model, an agile development model, etc. These processes describe a roadmap, with milestones, and a set of activities that envision the project at its end and lead towards robust software development. The process a researcher uses is related to the scientific method: based on the question, they formulate a hypothesis, create an experiment, test it, observe the results, and then ask more questions. They do not always know how things will turn out, and they build their path as they go along. Very often, the equivalent "roadmap" in a researcher's mind is incomplete and is developed during the process, because this is part of what is being explored.
In my organization, this creates tremendous conflict between software developers, who want a careful, process-driven model to produce robust code, and researchers, who are seeking to answer more basic questions and explore unknown territory in a way that carries a great deal of uncertainty and cannot always deliver the specific milestones and schedule clarity that are often desired.
It is worse when the research results in a useful algorithm; of course, the researcher often wants to make it available to the world so that others can use it. This is more of a grey area; if the researcher knows how to do software engineering, they may go through the process to create a more robust product, but this takes effort and time. The fact that Mozilla wants to help debug scientific code is a very good thing; it often needs more serious debugging and re-architecting than other software that is openly available.
I wish more people understood this difference.
The Other Edge of the Sword (Score:5, Interesting)
Roger Peng's comment shows a typical, superficial understanding of programming. Ironically, he would be the first to condemn a computer scientist/coder who ventured into biostatistics with a superficial knowledge of biology. I believe he would feel that anyone can program, but not anyone can do biostatistics. And I deeply disagree. Tools have been provided so that _any_ scientist can code. That does not mean that they understand coding or computer science.
I have personally experienced that, especially in the softer sciences like biology, economics, meteorology, etc., the scientists have absolutely no desire to learn any computer science: coding methodology, testing, complexity, algorithms, etc. The result is kludgy, inefficient code heavily dependent on pre-packaged modules that produces results that are often a guess; the code produces results, but without any understanding of what the various packaged routines are doing or whether they are appropriate for the task. For example, someone using default settings on a principal component analysis package, not understanding that the package expects the user to have pre-processed the data; the output looks fine but it is wrong. It is the same as someone approaching engineering without some understanding of thermodynamics and, as a result, wasting their time trying to construct a perpetual motion machine.
Re:The Other Edge of the Sword (Score:4, Informative)
For example, someone using default settings on a principal component analysis package not understanding that the package expects the user to have pre-processed the data; the output looks fine but it is wrong.
I'm a biologist who learned enough computational stats to get by, and I do see what you mean. Initially I did do stuff like that, but over time I put in the effort to learn what's going on, and now I hope I make these sorts of dumb mistakes a lot less often! However, this is not so much a coding problem as a stats problem. People in the "soft sciences" don't just have problems with more advanced stuff such as PCA, ICA, clustering, etc., but even with simple stats. For example, it's very common to see ANOVA performed on data that would be much better suited to regression analysis. The concept of fitting a line or curve and extracting meaning from the coefficients is rather foreign to a lot of biologists, who are more comfortable with a table full of p-values. Indeed, there is a general fixation on p-values, despite the fact that these are not well understood. There is a tendency to hide raw data (since biological data are often noisy). There is also a tendency to use analyses such as PCA or hierarchical clustering simply to produce fancy plots to blind reviewers; these plots often add no insight (or the insight they might add is not explored).
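As a concrete sketch of the PCA pitfall discussed above (scikit-learn's API is real; the data are simulated): scikit-learn's PCA centers the data but does not rescale it, so a variable measured in large units dominates the components unless you standardize first.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(0)
    X = np.column_stack([
        rng.normal(0, 1, 200),     # e.g. a ratio, on the order of 1
        rng.normal(0, 1000, 200),  # e.g. a count, on the order of 1000
    ])

    raw = PCA(n_components=2).fit(X)
    scaled = PCA(n_components=2).fit(StandardScaler().fit_transform(X))

    print(raw.explained_variance_ratio_)     # ~[1.0, 0.0]: big-unit column wins
    print(scaled.explained_variance_ratio_)  # ~[0.5, 0.5]: both variables count

In both runs the output "looks fine"; only the standardized one reflects the structure of the data rather than the choice of units.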
Re: (Score:1)
I've seen personally (and "Dilbert" would seem to confirm as universal) the generalized business belief that . . .
"programming is easy."
"quality is easy."
"expand-ability is easy."
"maintainability is easy."
"If I just had a Project Management tool to keep a death grip on delivery time . . . all those other "easy" things will just naturally fall into place."
I keep thinking the opposite . .
Quality
Been doing this for 15 years (Score:1)
For the brother-in-law, MD/PhD at local school - he sits on several review boards.
The biggie is not the code, but the data set. Like to design data sets to test code rather than do code reviews.
Have also done some code reviews when the b-in-law was not certain. And have found 'bogus' code twice.
Another (anecdotal) point - all problems found were with life science students. NONE/ZERO/NADA problems with code done by physical sciences or engineering people. Unless you want to count some of the most ugly Python code I've ever seen.
Babel (Score:2)
On a related note, the Babel project is getting pushed for Reproducible Research: http://orgmode.org/worg/org-contrib/babel/intro.html
It allows code to be embedded in other documents, eg. the LaTeX source of a paper, and executed during rendering.
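For example, a minimal org-mode snippet (the #+BEGIN_SRC block syntax is real org-babel; the embedded Python is made up). On export, Babel executes the block and splices its output into the rendered document:

    #+BEGIN_SRC python :results output :exports both
    means = [sum(col) / len(col) for col in ([1, 2, 3], [4, 5, 6])]
    print(means)
    #+END_SRC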
Also, the Recomputation project is trying to archive scientific code, complete with virtual machines set up to run it: http://www.recomputation.org/
Do research, don't write code (Score:2)
Researchers are good at researching. They can write some code though.
Programmers are good at programming. They know how to write good code that is easy to maintain and adapt.
If you're a researcher with some experience in writing code, you should ask yourself: "Should I spend that much time writing code, when a programmer does a better job in less time, with fewer bugs, code review, and unit tests?" Also, how much do you know about design patterns? Sure, your code works without them. Good luck with it. Also good luck with the headache in one year.
Re: (Score:2)
If you're a researcher with some experience in writing code, you should ask yourself: "Should I spend that much time writing code, when a programmer does a better job in less time, with fewer bugs, code review, and unit tests?" Also, how much do you know about design patterns? Sure, your code works without them. Good luck with it. Also good luck with the headache in one year.
It usually doesn't work like that. The researcher does the experiments, then analyses and interprets the data. If the latter process requires coding, then the researcher does the coding. If a researcher gives up the coding to a programmer (who may have a bad understanding of the science), then they have lost ownership of their data. Besides, there's usually no money to pay a programmer. The only situation where a programmer is called for is in a big lab which needs one or more significant software projects created.
It's a Damn Fine Idea (Score:2)