Meet Evo, the DNA-trained AI That Creates Genomes From Scratch (science.org)
sciencehabit shares a report from Science Magazine: What if, rather than scouring the internet, ChatGPT could search all of the DNA on Earth? That future just got a bit closer with Evo, an AI model reported today in Science. The program -- trained on billions of lines of genetic sequences -- can design new proteins and even whole genomes. Previous AIs could only interpret and predict relatively short sections of DNA, and they could only work with groups of nucleotides -- the A, C, G, T alphabet of DNA -- not individual nucleotides. To take things to the next level, researchers trained Evo on 300 billion nucleotides of sequence information.
In a first test, Evo bested other AI models on predicting the impact of mutations on protein performance. The team then had Evo design new versions of the CRISPR genome editor; the best designs were as good at cutting DNA as a commercial version. And in what study author Brian Hie, a computational biologist at Stanford University, calls the "most futuristic and crazy" part of the study, the researchers asked Evo to generate DNA sequences that are long enough to serve as genomes for bacteria -- a step toward AI-designed synthetic genomes.
Much of the work on AI occurs in secret at companies. But the researchers have released Evo publicly so that other researchers can use it, and Hie says the team has no plans to commercialize its creation. "For now, I see this as a research project."
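The summary's contrast between working with groups of nucleotides and working with individual nucleotides can be illustrated with a small sketch. This is only an illustration of the general idea, not Evo's actual tokenizer: the function names, the group size k=3, and the example sequences are all hypothetical. It shows that a single-base substitution stays localized to one token at single-nucleotide resolution, while with grouped tokens it changes a whole group.

```python
def single_nucleotide_tokens(seq):
    """One token per base (A, C, G, T). Hypothetical helper for illustration."""
    return list(seq.upper())


def kmer_tokens(seq, k=3):
    """Non-overlapping groups of k bases; a trailing partial group is kept.

    Hypothetical helper; k=3 is an arbitrary choice for this sketch.
    """
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq), k)]


def changed_positions(tokens_a, tokens_b):
    """Indices at which two equal-length token lists differ."""
    return [i for i, (a, b) in enumerate(zip(tokens_a, tokens_b)) if a != b]


# Made-up sequences that differ by one base (C -> A at 0-based position 4).
reference = "ATGGCCTGA"
mutant = "ATGGACTGA"

per_base = changed_positions(
    single_nucleotide_tokens(reference), single_nucleotide_tokens(mutant)
)
per_group = changed_positions(kmer_tokens(reference), kmer_tokens(mutant))

print("single-nucleotide tokens that changed:", per_base)
print("3-base group tokens that changed:", per_group)
```

With per-base tokens the model sees exactly which base changed; with grouped tokens the mutation turns one whole group into a different symbol, so the position of the change inside the group is not directly visible.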
Like AI itself (Score:2)
We can see the promise, we can visualize what *could* be done. But making it reliably do what humans intend for it to do...that's going to be a long, long, hard road.
A synthetic genome is nothing more than a spreadsheet, until there is a way to create the actual DNA, and until we see that DNA come alive.
Very cool accomplishment, lots of steps to go.
Next up (Score:2)
Coming soon (Score:3)
The next version.... (Score:1)
https://www.imdb.com/title/tt0... [imdb.com]
Life creation? (Score:2)
All that life DNA information is an almost limitlessly 'fertile' dataset for AI to consume if this company is able to tailor it for training appropriately.
Able to "generate DNA sequences that are long enough to serve as genomes" is pretty vague, but it seems like they could eventually be able to generate sequences that actually do behave as genomes. Some form of virus might be relatively simple to start off with since they hijack the cellular replication machinery.
I have to say though that it gives me the s
Re: (Score:2)
>Some bad nerds could wreak havoc.
Why bother worrying about artificial wombs or cloning vats when you can put a droplet of instructions in a vat of goo and let it go through a few billion years of pre-programed evolution over the course of days?
The only real question is whether my first vat army will be soldiers or sex toys.
This may be foundational for novel bioweapons (Score:1)
Not a criticism of the work — this is a reasonable development of current technology, but on the lines of "Oh. We may well remember this project."
In the article, the authors state "For safety considerations, we excluded viral genomes that infect eukaryotic hosts." That's nice of them — it makes it more difficult to use their published model to create bioweapons. This is, of course, a (more or less) publicly-funded research project, whose code and data are fully available and can be adapted furth