Grid Computes 420 Years Worth of Data in 4 Months
Da Massive writes with a ComputerWorld article about a grid computing approach to fighting malaria. By running the problem across 5,000 computers for a total of four months, the WISDOM project analyzed some 80,000 drug compounds every hour. The search for new drug compounds is normally a time-intensive process, but the grid approach did the work of 420 years of computation in just 16 weeks. Individuals in over 25 countries participated. "All computers ran open source grid software, gLite, which allowed them to access central grid storage elements which were installed on Linux machines located in several countries worldwide. Besides being collected and saved in storage elements, data was also analyzed separately with meaningful results stored in a relational database. The database was installed on a separate Linux machine, to allow scientists to more easily analyze and select useful compounds." Are there any other 'big picture' problems out there you think would benefit from the grid approach?
Re:Malaria? (Score:5, Informative)
http://archive.idrc.ca/books/reports/1996/01-07e.
Malaria kills quite a few people every year so I don't think it's a waste.
~S
Lots of things still out there (Score:5, Informative)
Re:Wow, 25% scalability! Amazing! (Score:5, Informative)
It's over 4 months, not a fraction of a second.
If I have a task that takes 100 seconds to run and I want it completed in under a second, scalability becomes a challenge... I have to figure out how to break it into at least 100 distinct parts and deal with all of the communication lags associated. To have any kind of fault tolerance, I probably want to break it into at least 1,000 tasks so that if one processor is running fast, it can get fed more and if one processor corrupts its process, I don't find out right at the end of the second, with no room to compensate, that I have to re-run that full second's worth of processing elsewhere to make up for it. That's where the challenge comes in.
If I have a task that takes 100 seconds to run and all I'm trying to do is run it a lot of times over a period of time that's many times greater, I can run it 864 times a day per system with absolutely no scalability issues whatsoever and simply send the relatively small complete result sets back. With 100 systems, if each one can run a distinct task from start to finish, I'd be expecting pretty much dead on 100 times the total number crunching as there are absolutely no issues with task division, synchronization or network lag.
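The arithmetic in the paragraph above can be sketched in a few lines of Python. The function names are hypothetical, and the model assumes fully independent tasks with no scheduling or network overhead, which is the idealized case described here rather than a measurement of any real grid.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def runs_per_day(task_seconds: int) -> int:
    """Whole runs one system completes per day when tasks run back to back."""
    return SECONDS_PER_DAY // task_seconds


def total_runs_per_day(task_seconds: int, systems: int) -> int:
    """Idealized aggregate throughput: independent tasks scale linearly.

    Assumption: no task division, synchronization, or network cost.
    """
    return runs_per_day(task_seconds) * systems


print(runs_per_day(100))             # 864
print(total_runs_per_day(100, 100))  # 86400
```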
In this case, they ran 5,000 computers over 4 months. Assuming a single task is solvable in under 4 months by a single system, they should have had no difficult task division problems to solve, absolutely minimal synchronization issues and next to no lag issues to address. In short, even a pretty inefficient programmer should be able to approach 1:1 scalability in that easy of a scenario.
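For reference, the 25% in the subject line can be reproduced from the summary's own numbers. This is only a rough sketch: it assumes "420 years" means single-machine compute time, that four months is a third of a year, and that all 5,000 machines were available for the whole period, none of which the summary states outright. The function name is hypothetical.

```python
def parallel_efficiency(work_years: float, machines: int, wall_months: float) -> float:
    """Ratio of useful single-machine work to machine time made available."""
    machine_years = machines * wall_months / 12  # total machine-years on the grid
    return work_years / machine_years


eff = parallel_efficiency(work_years=420, machines=5000, wall_months=4)
print(f"{eff:.1%}")  # 25.2%
```

If the machines were only partly dedicated to the project, the real per-machine efficiency would be higher than this figure suggests.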
Efficiency of algorithms is a challenge when you want a single result fast. When you want many results and are prepared to wait so long as you're getting very many of them, that's an incredibly easy distributed computing problem.
Grid computing vs distributed computing projects (Score:2, Informative)
Re:Malaria? (Score:2, Informative)
Re:Malaria? (Score:3, Informative)
Re:Wikipedia? (Score:2, Informative)
Fallacy of the One Biggest Problem (Score:3, Informative)
There seems to be a widespread fallacy that all human resources should be applied to the One Biggest Problem facing humanity at any given moment. Overlooking for a moment the obvious problems inherent in trying to choose the One Biggest Problem, and assuming we could actually rank all human problems in a well-defined order, there are still two huge problems with this approach:
1. Diminishing returns. Putting twice as many people on a problem doesn't solve it twice as quickly. The extra people could well be more productive working on a separate problem. This is the well-known fallacy of the Mythical Man-Month.
2. Misplaced priorities. The majority of people in the world do not have cancer. If all the resources of humanity were spent on cancer, where would that leave the rest of us that don't have cancer? "Sorry, we've stopped making antibiotics, insulin, toothpaste, books, and clothing so we can focus on fighting cancer."
In addition, there's an implicit assumption in the parent poster's position that the researchers who are looking for a cure for malaria have been wasting their time. I'd like to ask, what has *he* been doing during this time? I hope he has been looking for a cancer cure, or else he's nothing but a hypocrite.