The DNA Data Deluge
the_newsbeagle writes "Fast, cheap genetic sequencing machines have the potential to revolutionize science and medicine--but only if geneticists can figure out how to deal with the floods of data their machines are producing. That's where computer scientists can save the day. In this article from IEEE Spectrum, two computational biologists explain how they're borrowing big data solutions from companies like Google and Amazon to meet the challenge. An explanation of the scope of the problem, from the article: 'The roughly 2000 sequencing instruments in labs and hospitals around the world can collectively generate about 15 petabytes of compressed genetic data each year. To put this into perspective, if you were to write this data onto standard DVDs, the resulting stack would be more than 2 miles tall. And with sequencing capacity increasing at a rate of around three- to fivefold per year, next year the stack would be around 6 to 10 miles tall. At this rate, within the next five years the stack of DVDs could reach higher than the orbit of the International Space Station.'"
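For anyone who wants to check the DVD arithmetic in the summary, here is a rough back-of-the-envelope sketch. The 15 petabytes and the three- to fivefold yearly growth come from the quoted article. Everything else is my own assumption and not from the article: single-layer 4.7 GB discs, 1.2 mm per disc with no cases, decimal petabytes, and an ISS altitude of roughly 250 miles. The function names are made up for illustration.

```python
# Back-of-the-envelope check of the DVD-stack figures quoted in the summary.
# Assumed constants (NOT from the article):
DVD_CAPACITY_BYTES = 4.7e9    # single-layer DVD, decimal gigabytes
DVD_THICKNESS_M = 1.2e-3      # bare disc, no case or sleeve
METERS_PER_MILE = 1609.344
ISS_ALTITUDE_MILES = 250.0    # approximate orbital altitude

# Figure from the article: about 15 PB of compressed data per year.
ANNUAL_DATA_BYTES = 15e15     # assuming 1 PB = 10**15 bytes


def stack_height_miles(data_bytes):
    """Height of a stack of DVDs holding data_bytes, in miles."""
    discs = data_bytes / DVD_CAPACITY_BYTES
    return discs * DVD_THICKNESS_M / METERS_PER_MILE


def years_until_taller_than(start_miles, growth_per_year, target_miles):
    """Count the years of compounding growth until the stack exceeds target_miles."""
    years = 0
    height = start_miles
    while height <= target_miles:
        height *= growth_per_year
        years += 1
    return years


base = stack_height_miles(ANNUAL_DATA_BYTES)
print(f"This year's stack: about {base:.1f} miles")
for growth in (3, 5):
    years = years_until_taller_than(base, growth, ISS_ALTITUDE_MILES)
    print(f"At {growth}x per year, the stack passes the ISS after {years} years")
```

Under these assumptions the starting stack comes out a bit over 2 miles, and either growth rate passes the assumed ISS altitude within five years, which is consistent with the article's claim. Use dual-layer discs or put them in jewel cases and the numbers shift accordingly.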
The problem will solve itself (Score:5, Funny)
To put this into perspective, if you were to write this data onto standard DVDs, the resulting stack would be more than 2 miles tall.
Once that happens, they'll be able to stop storing it on DVDs and move it into the cloud.
Simple. Get the NSA to do it. (Score:5, Funny)
Publish a scientific paper stating that potential terrorists or other subversives can be identified via DNA sequencing. The NSA will then covertly collect DNA samples from the entire population, and store everyone's genetic profiles in massive databases. Government will spend the trillions of dollars necessary without question. After all, if you are against it, you want another 9/11 to happen.
The answer is obvious! (Score:4, Funny)
They should use a NoSQL multi-shard vertically integrated stack with a RESTful Rails-driven in-memory virtual multi-parallel JPython-enabled solution.
Bingo!
Re:Database Replication (Score:4, Funny)
I propose we call this new data method Data Neutral Assembly.
Re:Who uses DVDs? (Score:5, Funny)
Yay, AdEnine & 1 click splicing (Score:4, Funny)