In a Major Scientific Breakthrough, AI Predicts the Exact Shape of Proteins (fortune.com)
Researchers have made a major breakthrough using artificial intelligence that could revolutionize the hunt for new medicines. The scientists have created A.I. software that uses a protein's DNA sequence to predict its three-dimensional structure to within an atom's width of accuracy. weiserfireman shares a report: The achievement, which solves a 50-year-old challenge in molecular biology, was accomplished by a team from DeepMind, the London-based artificial intelligence company that is part of Google parent Alphabet. Until now, DeepMind was best known for creating A.I. that could beat the best human players at the strategy game Go, a major milestone in computer science. DeepMind achieved the protein shape breakthrough in a biennial competition for algorithms that can be used to predict protein structures. The competition asks participants to take a protein's DNA sequence and then use it to determine the protein's three-dimensional shape. Across more than 100 proteins, DeepMind's A.I. software, which it called AlphaFold 2, was able to predict the structure to within about an atom's width of accuracy in two-thirds of cases and was highly accurate in most of the remaining one-third of cases, according to John Moult, a molecular biologist at the University of Maryland who is director of the competition, called the Critical Assessment of Structure Prediction, or CASP. It was far better than any other method in the competition, he said.
Google? That makes me wonder (Score:2)
how they cheated.
Re:Google? That makes me wonder (Score:4, Insightful)
One very google-y way to cheat on this though - especially as RCSB [rcsb.org] continues to grow at a very rapid clip - is to start with homology. Find the closest protein to the one you have been given - after all you're starting with primary sequence - and then map it to there and refine after that. This is a widely accepted way to go about it - hence not really cheating - but also something that google would be expected to be really good at.
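As a rough illustration of the "find the closest protein" step described above, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `KNOWN` dictionary and its sequences are made up, and `difflib.SequenceMatcher` is only a crude stand-in for the dedicated sequence alignment tools a real homology search would use. It is not a description of what DeepMind did.

```python
from difflib import SequenceMatcher

# Hypothetical toy "structure database": ID -> amino-acid sequence.
# Real entries would come from a resource such as RCSB; these are invented.
KNOWN = {
    "tmplA": "MKTAYIAKQRQISFVKSHFSRQ",
    "tmplB": "GSHMLEDPVDAFKLLQEKG",
    "tmplC": "MKTAYIAKQRNISFVKSHFSRQLE",
}


def similarity(a, b):
    """Crude similarity in [0, 1]; a stand-in for a real alignment score."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def closest_template(query, db):
    """Return (id, score) of the known sequence most similar to the query."""
    best_id = max(db, key=lambda key: similarity(query, db[key]))
    return best_id, similarity(query, db[best_id])


if __name__ == "__main__":
    # A query that differs from every stored sequence by a few residues.
    print(closest_template("MKTAYIAKQRQISFVKSHFSRQL", KNOWN))
```

The structure of the best match would then serve as the starting template to map onto and refine, which is the part this sketch leaves out entirely.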
Soon ... (Score:5, Funny)
Re: (Score:2)
Comment removed (Score:5, Funny)
Re: (Score:2)
Get off my lawn.
Re: (Score:2)
An offline version of Google Maps for the time when you're stuck in the middle of nowhere without a data connection.
Or until a few years ago, when traveling (because data roaming charges were extremely high). Sure you could get the map part, but actual routing and turn by turn requires data access.
That's one reason standalone GPS units still exist: the routing and turn-by-turn take place on the device. (But of course, they lack the construction zone routing and frequent updates that hopefully try to avoid put
Re: (Score:2)
Offline version [google.com] of Google Maps? Wow, Google's AI was predictive enough to know we would be posting about it! I'm pretty sure it has the turn-by-turn directions now too.
If only they could have predicted that people might actually want to use it widely and added a MicroSD slot to their phones and tablets to store those (decently-sized) maps.
Re: (Score:2)
Re: (Score:2)
What about it?
That was fine and all but now my phone folds.... wth!!
Re: (Score:2)
Re: (Score:2)
You know when you print out a Google map? Then you print out separate sections and tape them together?
Something like that
Re:Soon ... (Score:4, Funny)
By driving the car while you do it.
Re: (Score:2)
Re: (Score:1)
I just stepped on it. Worked every time! Downside is my dates would look at me funny, but I got used to that from other things.
Re: (Score:2)
It's been a while since I had to do that. Brings back memories. GPS has really changed the world. Do gas stations even carry road maps anymore?
Re: (Score:2)
Very few do, mostly truck stops. AAA is one of the few places to get actual paper maps any more.
Really need more information here (Score:5, Interesting)
Re:Really need more information here (Score:4, Insightful)
This was CASP14, it's legit. It includes transmembrane proteins.
Re:Really need more information here (Score:4, Interesting)
This was CASP14, it's legit. It includes transmembrane proteins.
CASP14 is a good data set, for sure. However, we don't know how the Google AI fared on the transmembrane proteins in the data set, because we don't know which proteins it did really well on and which ones it did not. They're patting themselves on the back for what they did, and they certainly did well, but they aren't saying what is in each set.
Here you go (Score:5, Informative)
The CASP competition doesn't classify proteins into various types. They only categorize based on the type of prediction (e.g. ab initio versus homology modeling). But they do list the actual proteins that were the targets for prediction. See the list here.
https://www.predictioncenter.o... [predictioncenter.org]
Re: (Score:2)
I'm more concerned that you can hide most anything behind "AI" in this context. You could take the best "non-AI" method, put an optimized arbitrary algorithm (= trained machine = AI) on top of it with some data added in, and you'd get something "better". But I don't think it is really more than the sum of its parts. That is, it's not a predictive model, it's just a rule of thumb for adjusting the output of an actual predictive model, with real physics and chemistry in it, to get the observed structures in nat
We found ... (Score:2)
... the egg!!
Re: (Score:1)
I'll donate proteins! I make plenty while...um
Folding at Home? (Score:3)
Geek minds want to know. How does this impact Folding @ Home [foldingathome.org]?
Re: (Score:2)
I expect all the work units will soon be GPU/NPU-accelerated neural net programs, and progress will be far faster.
So... (Score:3)
Folding@home is dead?
Re: (Score:2)
No. Assuming DeepMind shares, Folding@home will become even better and more useful, making it even more critical.
Re: (Score:2)
I wouldn't say dead. But they can focus on other challenges more amenable to distributed computing and first-principles-based algorithms (e.g. docking, complex prediction). I would think DeepMind could go after protein-protein complexes as well, actually.
Comment removed (Score:3)
Re:My question is..., (Score:5, Informative)
That shouldn't be particularly difficult. In fact, it's likely that the websites you visit can identify you as you, individually.
The reason you have to do stupid captchas isn't technological. It's Google getting some work units out of you, increasing the size of their training set.
Re: (Score:2)
Re:My question is..., (Score:4, Funny)
This will be good data for the emotion engine.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Ha ha, thanks for the setup. I'll add Wayne and Shuster because I'm Canadian and old enough that their end of career specials were on when I was a kid. Sadly, there's not much "on the road" happening right now.
Re: (Score:2)
Re: (Score:2)
Reunion tour in 10?
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Lol. It's been a ride.
Re: (Score:2)
Beware... (Score:4, Insightful)
Re: (Score:2)
Your experience very much reflects the theory. ML theory guarantees generalization capabilities under some assumptions on the training process that we cannot have in practice. For example, the training examples have to be independent and identically distributed, which is never the case in practice because we tend to collect data in clumps, that is, in a non-random way. That alone tends to create an imbalance between areas where we have many observations (w.r.t. the natural distribution) and areas where we d
First Principles (Score:2)
Re: (Score:3)
You are correct. It is not practical. The current energy functions are not perfect. Even if you assume that they are good enough, the compute power needed doesn't exist yet. When I was in grad school I was of the feeling that only quantum computing can solve this problem (because in reality, protein folding is a kind of quantum computational problem, which is just collapsed into a good enough solution - or so I thought).
Background (Score:2)
Some background is missing here. The shape of a protein is very relevant, since it determines how it interacts with other elements, i.e. its behavior and function. The shape depends on the specific sequence of amino acids that make up the protein. However, simulating the folding of the molecules to get the final shape is a very complex and resource-intensive problem.
IBM developed the Blue Gene [wikipedia.org] supercomputers two decades ago, motivated in part by these complex simulations (the other moti
Re: (Score:3)
Blue Gene etc. were being applied to fold proteins based on first principles (i.e. physics and numerical methods). DeepMind, however, has side-stepped that whole process of solving through fundamental understanding and gone straight to the solution. The good things, however, are:
(1) It does use some of our fundamental learnings about protein structure.
(2) We get to solve more applied problems, leaving the physics based methods to continue to develop, which will probably have other applications (like de novo design of cat
Re: (Score:1)
It seems to me it should be relatively simple, what am I missing? You'd have a two-column "rule list" where each amino acid joint (pair) produces "left turn 30 degrees", "right turn 62 degrees", etc. (i.e. vectors.) I imagine sometimes the structure would "bump into" itself, but handling that is just part of the simulation. What's an example of the complexity bottleneck(s)?
Re: (Score:1)
Correction, 3 columns: 1) Amino-acid type "A"; 2) Amino-acid type "B"; 3) 3D "turn" vector relative to "A" (maybe with an offset distance).
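To make the three-column proposal above concrete, here is a minimal Python sketch of what such a lookup simulation would compute. The `RULES` table and its vectors are invented for illustration, not measured values, and the sketch simplifies the idea by using fixed steps in a global frame rather than turns relative to the previous bond. Whether any fixed pair table of this kind matches real proteins is exactly what the question is asking, so treat this as a picture of the proposal, not of how folding works.

```python
# Hypothetical rule list: (residue A, residue B) -> 3D step from A to B.
# The residue letters and vectors are made up for illustration only.
RULES = {
    ("A", "G"): (1, 0, 0),
    ("G", "A"): (0, 1, 0),
    ("A", "A"): (0, 0, 1),
    ("G", "G"): (-1, 0, 0),
}


def trace_chain(sequence, rules):
    """Place residues one after another using the pair -> step table."""
    positions = [(0, 0, 0)]
    for a, b in zip(sequence, sequence[1:]):
        dx, dy, dz = rules[(a, b)]
        x, y, z = positions[-1]
        positions.append((x + dx, y + dy, z + dz))
    return positions


def self_collisions(positions):
    """Return indices of residues that land on an already occupied point."""
    seen = set()
    clashes = []
    for index, point in enumerate(positions):
        if point in seen:
            clashes.append(index)
        seen.add(point)
    return clashes


if __name__ == "__main__":
    chain = trace_chain("AGGA", RULES)
    print(chain)
    print(self_collisions(chain))
```

In this toy table the chain "AGGA" steps forward and then straight back onto its starting point, which is the "bump into itself" case the comment mentions; a pure lookup has no way to resolve it without extra rules.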
Re: (Score:3)
Re: (Score:1)
I'm not clear on where the uncertainty is. The sequences of the proteins are all known, aren't they? The resulting angles between acid pairs are all known, correct? If reality doesn't match the look-up model, why?
Re: (Score:2)
Spaghetti on the wall (Score:2)
I have to imagine anyone with the resources is throwing neural networks at every remotely promising problem, simultaneously. I think we're just seeing the early trickle, the remaining low-hanging fruit that had somehow been missed. It remains to be seen how much of that there is to pick off before the problems escalate in difficulty to another "can't touch this" level, requiring still more hardware advances.
External links? (Score:2)
Is it some kind of Fortune policy where *every link* only points to other Fortune articles? How about, IDK, a link to the actual paper, or the competition in which this occurred, or,....anything?
A little Google-fu turns up this link on Google's AI blog: https://deepmind.com/blog/arti... [deepmind.com].
And here are the CASP14 competition results just released: https://predictioncenter.org/c... [predictioncenter.org]
Soon... (Score:3)
Next AI folding challenge (Score:4, Funny)
Laundry!
old news (Score:1)