Google Turns AlphaFold Loose On the Entire Human Genome (arstechnica.com) 20
An anonymous reader quotes a report from Ars Technica: Just one week after Google's DeepMind AI group finally described its biology efforts in detail, the company is releasing a paper that explains how it analyzed nearly every protein encoded in the human genome and predicted its likely three-dimensional structure -- a structure that can be critical for understanding disease and designing treatments. In the very near future, all of these structures will be released under a Creative Commons license via the European Bioinformatics Institute, which already hosts a major database of protein structures. In a press conference associated with the paper's release, DeepMind's Demis Hassabis made clear that the company isn't stopping there. In addition to the work described in the paper, the company will release structural predictions for the genomes of 20 major research organisms, from yeast to fruit flies to mice. In total, the database launch will include roughly 350,000 protein structures.
[...]
At some point in the near future (possibly by the time you read this), all this data will be available on a dedicated website hosted by the European Bioinformatics Institute, a European Union-funded organization that describes itself in part as follows: "We make the world's public biological data freely available to the scientific community via a range of services and tools." The AlphaFold data will be no exception; once the above link is live, anyone can use it to download information on the human protein of their choice. Or, as mentioned above, the mouse, yeast, or fruit fly version. The 20 organisms that will see their data released are also just a start. DeepMind's Demis Hassabis said that over the next few months, the team will target every gene sequence available in DNA databases. By the time this work is done, over 100 million proteins should have predicted structures. Hassabis wrapped up his part of the announcement by saying, "We think this is the most significant contribution AI has made to science to date." It would be difficult to argue otherwise. Further reading: Google details its protein-folding software, academics offer an alternative (Ars Technica)
Their results are not on the entire human genome (Score:4, Interesting)
Two of the largest proteins, titin and nebulin, are not represented in their current data set.
Titin is around 30,000 amino acids long, and the current database has a small titin-like sequence from the rat genome. Apparently their method only works with small globular proteins.
So....? (Score:2)
Does that mean we're getting close to real-life kawaii nekomimi?
Re: (Score:2)
> Does that mean we're getting close to real-life kawaii nekomimi?
I misread the program's name as AlphaFood, and started to worry about the age of Long Pig.
AI and Google (Score:1)
Re: Better viruses in the future through bioengine (Score:1)
As do cretins
Prediction != Ground Truth (Score:4, Informative)
IIUC, these are high-confidence predictions, but not the ground truth. It's a step up from exhaustive searches that have been used for decades.
It's highly likely to reduce the problem space by an amazing amount, which is awesome, but we're still operating in a best-guess situation. They are good guesses though.
Re: (Score:3, Interesting)
> IIUC, these are high-confidence predictions, but not the ground truth. It's a step up from exhaustive searches that have been used for decades.
> It's highly likely to reduce the problem space by an amazing amount, which is awesome, but we're still operating in a best-guess situation. They are good guesses though.
But that is a very good guess. In the recent CASP competition (a competition for predicting protein structures from their sequences), for more than two-thirds of the competition entries the accuracy of the prediction was similar to experimental accuracy (X-ray crystallography). While it is just a guess, it's in many cases a really good guess. https://fortune.com/2020/11/30... [fortune.com]
Just saying, but... (Score:4, Funny)
...can we not have stories titled "Google Turns xyz Loose On the Entire Human Genome" in the future?
It's been a rough 18 months.
Thanks.
Re: (Score:3, Funny)
Re: (Score:2, Interesting)
In this particular case, by buying the British company concerned (DeepMind).
Trending (Score:3)
The last century was the century of electronics. This century will be one of microbiology and genetics. Though America might be about to get left behind if Rand Paul has his way.