Open Source Experiment Management Software?
Alea asks: "I do a lot of empirical computer science, running new algorithms on hundreds of datasets, trying many combinations of parameters, and with several versions of many pieces of software. Keeping track of these experiments is turning into a nightmare and I spend an unreasonable amount of time writing code to smooth the way. Rather than investing this effort over and over again, I have been toying with writing a framework to manage everything, but don't want to reinvent the wheel. I can find commercial solutions (often specific to a particular domain) but does anyone know of an open source effort? Failing that, does anyone have any thoughts on such a beast?"
"The features I would want would be:
- management of all details of an experiment, including parameter sets, datasets, and the resulting data
- ability to "execute" experiments and report their status
- an API for obtaining parameter values and writing out results (available to multiple languages)
- additionally (alternately?) a standard format for transferring data (XDF might be good)
- ability to extract selected results from experimental data
- ability to add notes
- ability to differentiate versions of software
- automatically run experiments over several parameter values
- distribute jobs and data over a cluster
- output to various formats (spreadsheets, Matlab, LaTeX tables, etc.)
- provide a fancy front-end (that can be done separately - I'm thinking mainly in terms of libraries)
- visualize data
- statistical analysis (although some basic stats would be handy)"
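To make a few of the items on that list concrete (parameter sets, running over several parameter values, telling software versions apart, spreadsheet output), here is a minimal Python sketch. It is not an existing framework: the names expand_grid, run_sweep, to_csv and toy_experiment are all hypothetical, and the "experiment" is a toy stand-in for a real algorithm run.

```python
import csv
import io
import itertools


def expand_grid(grid):
    """Yield one dict per combination of parameter values."""
    names = sorted(grid)
    for values in itertools.product(*(grid[n] for n in names)):
        yield dict(zip(names, values))


def run_sweep(experiment, grid, datasets, version):
    """Run `experiment` on every dataset/parameter combination.

    Each row records the dataset, the software version label, the
    parameters, and either the result or the error that occurred.
    """
    rows = []
    for dataset in datasets:
        for params in expand_grid(grid):
            row = {"dataset": dataset, "version": version, **params}
            try:
                row["result"] = experiment(dataset, **params)
                row["status"] = "ok"
            except Exception as exc:  # record the failure and keep going
                row["result"] = ""
                row["status"] = f"error: {exc}"
            rows.append(row)
    return rows


def to_csv(rows):
    """Serialize rows as CSV text, which spreadsheets can open."""
    if not rows:
        return ""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


def toy_experiment(dataset, alpha, depth):
    # Hypothetical stand-in for a real algorithm run.
    return len(dataset) * alpha + depth


if __name__ == "__main__":
    grid = {"alpha": [0.1, 0.5], "depth": [1, 2, 3]}
    rows = run_sweep(toy_experiment, grid, ["iris", "mnist"], version="v1")
    print(to_csv(rows))
```

Storing one flat row per run keeps "extract selected results" down to an ordinary filter over the rows. Distributing jobs over a cluster and a multi-language API are well beyond this sketch.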
MAUS and GABE (Score:0, Insightful)
In soviet russia MtnDew Buys Gabe.
Experience (Score:5, Insightful)
I have also done a lot of empirical comp sci experiments. In my experience, the tools used for the experiments themselves are very ad hoc and not easily scriptable. Most of the time we have to tend the hour-long experiments, watch what happens in the output, and decide what to do next. And... the decision is often not clear-cut; some sort of heuristic is needed. Not to mention the frustration when errors occur (especially when the tool is buggy, which is very often the case in research settings). So, considering this, what I would do is write a script and run the experiments in phases: start it and look at the results several days later.
I have also noticed that one experiment is sometimes so radically different from the next that I doubt it would be easily manageable.
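For what it's worth, the "script in phases" idea could look roughly like the Python sketch below. It assumes each phase is a plain function and that a small JSON file is enough to remember which phases have finished; run_phases and the phase names are made up for illustration. A phase that crashes is logged and the run stops there, so a restart picks up at the first unfinished phase instead of redoing days of work.

```python
import json
import os
import tempfile
import traceback


def run_phases(phases, state_path):
    """Run (name, func) phases in order, skipping those already done.

    Finished phase names are saved to `state_path` after every phase,
    so a restarted run resumes at the first unfinished phase.
    """
    done = []
    if os.path.exists(state_path):
        with open(state_path) as fh:
            done = json.load(fh)

    for name, func in phases:
        if name in done:
            continue
        try:
            func()
        except Exception:
            # Log the failure and stop: later phases depend on this one.
            traceback.print_exc()
            return done
        done.append(name)
        with open(state_path, "w") as fh:
            json.dump(done, fh)
    return done


if __name__ == "__main__":
    # Hypothetical phases; real ones would call the research tools.
    phases = [
        ("prepare", lambda: print("preparing datasets")),
        ("train", lambda: print("running the long experiment")),
        ("report", lambda: print("collecting results")),
    ]
    with tempfile.TemporaryDirectory() as workdir:
        state = os.path.join(workdir, "phases.json")
        print(run_phases(phases, state))
```

It does nothing about the real problem of deciding what to do next, but at least a buggy tool only costs you the phase it broke.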
Perl (Score:1, Insightful)
I find it to be an excellent language for maintaining data.
Re:Piracy is Your Only Option (Score:3, Insightful)
4. Profit!!!!
Sorry, I've been reading slashdot too much and must append such an item to all lists I encounter. :P
And it's not stealing, it's copyright infringement. ;)
Seriously, though, I think using commercial software still won't cover all the bases. Alea said, "I can find commercial solutions (often specific to a particular domain)..." which I would assume means that there don't appear to be any general-purpose experiment packages.
As some others have already posted, 'experiments' can cover a wide range of things, and I can imagine that making a general-purpose experiment harness would be a tall order. Having such a thing would be useful for some of the work I do, but I have not had the time (and probably don't have the ability) to try to put together something that can help manage and automate experiments (or sensor data processing jobs, in my case). This is one of those problems which I 'feel' has to have a solution, but I know it's currently beyond my capability to figure out how it should work.