Folding@Home - Yet Another Distributed Client
braind writes: "The Stanford group has developed a new way to simulate protein folding ("distributed dynamics") which should remove the previous barriers to simulating protein folding. However, this method is extremely computationally demanding and we need your help.
You can read more on the site." It's interesting seeing all these projects coming out - just a reminder that distributed is still around and we can always use more on our team. *grin* [addendum from timothy:] Note that the SDK used for this project was discussed here a few days ago, so you can even roll -- err, fold -- your own.
Lead on... (Score:1)
Rolling proteins? (Score:1)
A. Fold proteins and get nothing
B. Roll weed and get high
A & B: Fold proteins that make up cannabis!
Weird Pics [64.65.12.73]
Any spare cycles needs to be used (Score:2)
Distributed computing for cash (Score:3)
Sites such as ProcessTree [processtree.com], and others, have been talking of paying for your computer time [slashdot.org] with micropayments, but so far nothing seems to have got off the ground.
Presumably with the added incentive of cash, the number of computers taking part will rocket. Does anyone have any firm information on the progress of these schemes?
Do your own laundry! (Score:1)
(Humorous, folks. Really.)
Re:Distributed computing for cash (Score:1)
I'm eagerly wondering about these, as well. I've got a collection of 'scrounged' computers at home on my network, a copy of Mosix [mosix.org], and the ability to write download scripts for packets...I'm just itching to find out how big my home cluster will have to get to finance a broadband connection for me... :-)
Joe Sixpack is dead!
protein folding is VERY hard to predict (Score:5)
(1) proteins are not static structures, they tend to change conformations in response to stimuli like binding to a ligand, or changes in the electrostatic microenvironment around them.
(2) many proteins don't like to fold in isolation, they require the presence of other proteins that they naturally interact with.
(3) protein sequence is linear (so-called primary structure); while local structural details may be predictable with some reliability (the so-called secondary structure, things like alpha helices and beta sheets), ultimately it is the final 3D fold with long range interactions (tertiary and higher structures) that form the final structure. You can imagine that the longer the protein, the harder it is to fold, due to the increased number of potential tertiary interactions.
Determining the structure of a protein, or even of a relatively large protein complex, is not as technically challenging for biophysicists as it used to be. Tom Steitz's [yale.edu] group at Yale has managed to crystallize and solve the structure of the large ribosomal subunit (a **HUGE** molecule as far as the average biological molecular complex goes) at 2.4 angstrom resolution, which in itself is a monumental feat. I would not be surprised if Steitz is in contention for the Nobel prize for this work.
The holy grail is eventually being able to reverse engineer a protein or ligand that is able to bind to part of a particular protein, using rational design. This is much harder than solving a structure. Pharmaceutical companies would love to be able to design this type of molecule for use as designer drugs, since it would take away much of the cost of R&D through trial and error. Big companies such as Merck basically screen for drugs the way Thomas Edison used to test materials: by having a warehouse full of stuff and testing it all.
That being said, it's still a cool project :)
No Source Code - Should I be Paranoid? (Score:2)
My only problem with it is... (Score:3)
Does anyone know exactly what models they will be using? Because there are only a few ways to actually go about this:
1 - Use a known protein structure that is similar to the one under study, but slightly different. You can also look for common motifs in a structure / sequence to compare the two. Basically you look at the sequences, and say, "Hey, those two proteins have similar sequences, so they probably look the same too."
2 - Good old ab initio methods where you reduce the conformational energy to the optimal folding pattern. This is basically looking only at the sequence and saying "If I were a protein, what would I look like."
Both are relatively time consuming, but I'm not sure how suited distribution is to this task. The first method requires a great deal of database lookups, and the second requires a lot of computing power under the hood. With distribution, you don't have the database backend to work with, so it must be the brute force method. But I have yet to see any studies where ab initio methods have been anywhere near a 95% level of accuracy (compared to x-ray crystal structures). The best I've seen is around 75%. This isn't quite as helpful as it might sound. You can get some good results and working models this way, but you can't do a great deal with drug design with an inaccurate model.
They had links to the papers citing their algorithms, but the links were not yet active... If they have a better way to do this, I'll be quite impressed, but for now, I think that a machine like IBM's Blue Gene [ibm.com] has a better chance.
And neither of these methods really takes into account post-translational modifications, phosphorylations, cleavage, activation, etc... (basically all the extra stuff your cells do to proteins before they are "activated").
Re:Distributed computing for cash (Score:5)
Just think about it... you owe how many people three cents? Is that 0.03$US or 0.03$CDN? What about the unscrupulous people SETI and DCTI already have to deal with? These problems (and many others) aren't simple and a handful of MBAs with fists full of seed-money aren't competent to deal with them.
Most of the clones are the ideas of business types. They have little or no computer science or engineering background. To these people, all numbers are preceded by a dollar sign. Most of them point to SETI as the basis for their business: SETI has zillions of... blah, blah, freakin' blah. They don't understand what SETI is, how it works, or why thousands of people contribute entire offices of machines to the cause. They see that big number and want to plant a `$'!
A few years ago everyone wanted to be an ISP. A year ago everyone wanted to be a "dot-com". A few months ago everyone was chanting IPO -- Redhat stock is where now? Now everyone wants to be an "ASP" and "distributed network"s are all the rage. (Technically, they are all client-server not distributed. They form an easily splintered tree; the clients do not talk to each other. However, like profitability, no one cares.)
How long until quantum medicine? (Score:1)
Be sceptical of computational chemistry (Score:4)
Re:No Source Code - Should I be Paranoid? (Score:1)
The risk is the same when running ANY program that is downloaded, regardless of purpose.
The only way to be mostly sure is to audit the source of every program you run, then compile and build it yourself. Even then, you have to worry about the compiler, as the infamous compiler trojan [acm.org] described by Ken Thompson illustrates.
Re:protein folding is VERY hard to predict (Score:1)
This took them a long time to do... and this isn't exactly a good model to work from. True, this is a huge model, and is great work. I personally think that the odds are good that a Nobel prize is in the picture. Simply obtaining the protein took a long time, and the crystallization was the hard part.
But this ribosome is not the same as that of a human. I'm not sure which one this is, probably related to T. aquaticus or something like that (some thermophile), so it is going to be very different than anything coming out of the Human Genome project.
Re:No Source Code - Should I be Paranoid? (Score:1)
Re:Be sceptical of computational chemistry (Score:2)
Re:My only problem with it is... (Score:1)
Finally... (Score:1)
Re:Be sceptical of computational chemistry (Score:2)
Re:Peer Review (Score:3)
From their site: [stanford.edu]
Presumably if you volunteer to port to system x they'll have to let you see the source code. They might even let you see it if you ask nicely for all I know.
As for SETI, I don't know if their code is available at all (I think not, at least officially); but I know they do not want any unofficial versions around and that they've even refused assistance to produce versions optimized for the 3DNow extensions in AMD chips (none exist now AFAIK).
What they really need to find out (Score:1)
Re:Finally... (Score:4)
yea, you're not alone and we do have one (for linux and windows): check out the Folding@home site [stanford.edu] and go to the download page [stanford.edu], sign up, and then download.
Re: (Score:2)
brute-force computing (Score:1)
IBM has a protein folding initiative called Blue Gene that was reported on back in Dec. 1999.
CNet's article is here [cnet.com], and IBM's is here [ibm.com].
Re:What they really need to find out (Score:1)
Re:client problems (Score:2)
Re:How long until quantum medicine? (Score:1)
Right. It's imperative that we go straight to the source -- and heal those sick, neglected elementary particles! There's nothing more dangerous than an electron with an advanced case of malaria.
Linux client works under FreeBSD (Score:1)
What a Distributed Net SHOULD be. (Score:1)
There should be a way to measure and estimate the amount of time it would take to do a calculation, or the number of cycles it takes to complete. If it's, say, 10 million cycles or more, send off a request for spare cycles!
Obviously there are some real problems with my dream world idea here: network latency and bandwidth problems. Imagine if every computer in the world was this way, and they used the internet to chatter the information... Wow, the bandwidth would be sucked up pretty quickly!
Okay, enough of a weird idea. I've heard of selling cpu cycles, but I'm more interested in a common pool.
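For what it's worth, the "estimate the cycles, then ask for help" rule, together with the latency and bandwidth worry, can be written down as a rough sketch. Everything here is an assumption made up for illustration: the function name, the 500 MHz local clock, the remote speedup, and the link figures are hypothetical, and only the 10 million cycle threshold comes from the comment.

```python
def should_offload(estimated_cycles, payload_bytes, *,
                   cycle_threshold=10_000_000,    # threshold suggested in the comment
                   local_hz=500e6,                # assumed local clock rate
                   remote_speedup=4.0,            # assumed speedup from pooled machines
                   bandwidth_bytes_per_s=5_000,   # assumed modem-class link
                   latency_s=0.2):                # assumed round-trip latency
    """Return True if shipping the job out is expected to finish sooner."""
    # Small jobs are never worth a network round trip.
    if estimated_cycles < cycle_threshold:
        return False
    local_time = estimated_cycles / local_hz
    # Remote cost = latency + time to move the data + the (faster) remote compute.
    transfer_time = payload_bytes / bandwidth_bytes_per_s
    remote_time = latency_s + transfer_time + local_time / remote_speedup
    return remote_time < local_time


if __name__ == "__main__":
    print(should_offload(5_000_000, 100))            # below the threshold
    print(should_offload(10**12, 1_000))             # big job, tiny payload
    print(should_offload(20_000_000, 1_000_000))     # transfer cost dominates
```

With these made-up numbers, only the long-running job with a tiny payload is worth sending out, which is the bandwidth problem in miniature: the pool helps when compute dwarfs the data that has to move.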
Re:Distributed computing for cash (Score:1)
well, it would be kinda cool if the military turned to distributed.net to simulate their new weapons (and see the images as they develop)...
i have no morals...
Is it worth my CPU cycles? (Score:1)
This project sounds awesome, but my concern is how do I know that they are not wasting my CPU cycles for nothing?
Having a lot of computer power is not going to help them solve anything if the theory underlying their simulation, or the algorithm being used for folding, is incorrect.
Re:My only problem with it is... (Score:1)
I usually see this process as requiring a great deal of feedback before proceeding to the next step. For example, if R23 moves by 2.5 angstroms, how does this affect the torsional strain placed upon Q102? This seems more problematic since you are essentially trying to optimize an entire model from scratch.
I guess the big concern is how you modularize the algorithm. The work of one process is directly related to the work of another, and I find it difficult to see how you can proceed without direct feedback from one process to another. Again, I think a citation would be helpful, especially to help understand how the algorithm is compartmentalized.
I can see how you could have one process calculating the thermodynamic energy of one model versus another, but then the question is, how do you choose to modify the model? Is it a matter of checking all possibilities and choosing the one with the lowest energy? This could be distributed easily, but how efficient is it?
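To put a rough shape on the "check all possibilities" question, here is a toy sketch, not anything the Folding@home client is known to do. It assumes each residue can take one of a few discrete states, uses a made-up energy function (`toy_energy` is purely hypothetical, not a real force field), and splits the search space into independent work units that need no feedback from each other.

```python
def toy_energy(conformation):
    """Hypothetical stand-in for a force field; lowest when every state is 1."""
    neighbour_term = sum((a - b) ** 2 for a, b in zip(conformation, conformation[1:]))
    local_term = sum((s - 1) ** 2 for s in conformation)
    return neighbour_term + local_term


def index_to_conformation(index, n_residues, n_states):
    """Decode an integer into one state per residue (base-n_states digits)."""
    states = []
    for _ in range(n_residues):
        index, state = divmod(index, n_states)
        states.append(state)
    return tuple(states)


def best_in_work_unit(start, stop, n_residues, n_states):
    """What one client would do: scan its slice and report the lowest energy."""
    best = None
    for i in range(start, stop):
        conf = index_to_conformation(i, n_residues, n_states)
        energy = toy_energy(conf)
        if best is None or energy < best[0]:
            best = (energy, conf)
    return best


def exhaustive_search(n_residues, n_states, n_units):
    """What the server would do: hand out slices, keep the overall minimum."""
    total = n_states ** n_residues
    step = -(-total // n_units)  # ceiling division
    results = [best_in_work_unit(s, min(s + step, total), n_residues, n_states)
               for s in range(0, total, step)]
    return min(results), total


if __name__ == "__main__":
    (energy, conf), total = exhaustive_search(n_residues=6, n_states=3, n_units=4)
    print(total, energy, conf)
```

The distribution part really is easy, since each slice is independent. The efficiency part is the catch: the number of conformations is `n_states ** n_residues`, so even 3 states per residue over 100 residues is about 5 x 10^47 candidates, far beyond what any number of volunteers could enumerate.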
Re:Be sceptical of computational chemistry (Score:1)
Folderol (Score:1)
Actually not a screensaver (Score:1)
Re:Peer Review (Score:1)
Re:protein folding is VERY hard to predict (Score:2)
There is hope in some algorithms (such as DPMTA [duke.edu]) which intelligently partition large groups of particles to simplify the computation of long-range forces:
Hopefully the folding@home folks are aware of such algorithms, and are using them to reduce the need for inter-client communication. By farming out as much of that computation as possible to the clients, they minimize the reliance on their non-scalable server CPU, and they also effectively slow down the clients a little, postponing the day when they find themselves hopelessly bandwidth-bound.
Re:Be sceptical of computational chemistry (Score:1)
I don't think these people [psc.edu] are wasting CPU time.
Doing this [psc.edu] distributed over the Internet, however, is unlikely.
Re:Be sceptical of computational chemistry (Score:1)
Proteins are built up out of twenty standard amino acids, and it turns out that if you can make a model which describes the behaviour of individual amino acids well (with reference to quantum calculations, or experimental data), then you can describe their collective behaviour in proteins quite well too.
The Amber [ucsf.edu], CHARMM [harvard.edu], and GROMOS [c4.ethz.ch] parameter sets for doing this are quite refined, and simulations using these parameters appear to agree pretty well with reality.
The big problem is that, as the project pages mention, computer simulations of proteins have only recently hit the 1 microsecond range. What they don't tell you is that many common-or-garden proteins fold on a millisecond, second, or longer timescale. That's a factor of a million you have to brute-force your way through. A simulation also deals with one protein molecule at a time, while nature tends to fold a couple of billion of them at once, so it doesn't matter if a few don't quite make it to the correct fold in a reasonable time.
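The size of that gap is easy to check with back-of-the-envelope arithmetic. The sketch below assumes a 2 femtosecond integration timestep, which is a typical molecular dynamics choice but is not stated on the project pages; the function names are just for illustration.

```python
TIMESTEP_S = 2e-15   # assumed integration step: 2 femtoseconds
REACHED_S = 1e-6     # roughly what simulations have reached: 1 microsecond


def steps_needed(sim_time_s, timestep_s=TIMESTEP_S):
    """Number of integration steps needed to cover sim_time_s of simulated time."""
    return round(sim_time_s / timestep_s)


def gap_factor(target_s, reached_s=REACHED_S):
    """How many times longer the target timescale is than what has been reached."""
    return target_s / reached_s


if __name__ == "__main__":
    for label, target in [("millisecond", 1e-3), ("second", 1.0)]:
        print(label, f"{gap_factor(target):.0e}x", f"{steps_needed(target):.1e} steps")
```

Under that assumption a single microsecond is already about 5 x 10^8 steps. A millisecond fold is a thousand times further, and the "factor of a million" is what you get for a protein that takes a full second.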
Re:No Source Code - Should I be Paranoid? (Score:1)
Protein Folding May Be Heritable Via Prions (Score:2)
Re:No Source Code - Should I be Paranoid? (Score:1)
It is for this reason that they only release the source to their core, and not to their network interface. This is an attempt to prevent a script kiddie from tweaking the source to boost their stats. In order to do this you actually have to reverse engineer the client/server communications protocol.
Re:How long until quantum medicine? (Score:1)
Re:I was just wondering... (Score:1)
Who "owns" the results? What will happen to them?
Unlike other distributed computing projects, Folding@home is run by an academic institution (specifically the Pande Group, at Stanford University's Chemistry Department), which is a non-profit institution dedicated to science research and education. The results from Folding@home will be made available on several levels. First, we put movies and images of all folding runs on the web for everyone to see...