Machine Learning Reveals Genetic Controls
An anonymous reader writes with this quote from Quanta Magazine:
Most genetic research to date has focused on just 1 percent of the genome — the areas that code for proteins. But new research, published today in Science, provides an initial map for the sections of the genome that orchestrate this protein-building process. "It's one thing to have the book — the big question is how you read the book," said Brendan Frey, a computational biologist at the University of Toronto who led the new research (abstract).
For example, researchers can use the model to predict what will happen to a protein when there’s a mistake in part of the regulatory code. Mutations in splicing instructions have already been linked to diseases such as spinal muscular atrophy, a leading cause of infant death, and some forms of colorectal cancer. In the new study, researchers used the trained model to analyze genetic data from people afflicted with some of those diseases. The scientists identified some known mutations linked to these maladies, verifying that the model works. They picked out some new candidate mutations as well, most notably for autism.
One of the benefits of the model, Frey said, is that it wasn’t trained using disease data, so it should work on any disease or trait of interest. The researchers plan to make the system publicly available, which means that scientists will be able to apply it to many more diseases.
cis and mi regulation is not "bad" code (Score:3, Interesting)
See, the problem is many of you don't get that what you think of as "noise" in the DNA is actually code. Shifted code. The internal mechanisms use cis regulation and miRNA, mRNA, cRNA to adapt to things going on in the environment.
It's not noise code, or broken code.
It's designed to do that.
If anyone had taken assembler and machine coding back in the old days of computing, they'd get it. You only have so much to code with, so you make it do multiple things.
Re: (Score:2)
Bioinformaticians are very much math orientated, and almost all of them code. Their focus has been driven by commercial interest; however, genetic scientists have been saying for decades that non-coding DNA is functional and regulatory, just as epigenetic effects exist.
Re: (Score:3)
For small genomes, yes, but for large genomes, there is a lot of "unused" material.
Only about 6-10% of the human genome is transcribed into RNA, either the protein-coding kind or the non-coding types used in regulation. (Small genomes are almost always entirely coding and even include overlapping coding regions; large genomes are the ones that have "junk" DNA in them.)
Transcription is most closely related to a processor reading machine code and doing something with it. In a computer program, we know that we can
Re: (Score:1)
There's a good paper on how a lot more DNA is used than we think, in the next issue of Cell.
Re: (Score:2)
I thought that was what histones were for. DNA that's wrapped can't be read, so you control what is wrapped to decide what is available for expression. And epigenetic tags freeze or thaw the wrapping. This requires sections of DNA that function as labels, but those labels don't directly control the folding (more accurately, rolling into a cylinder); that's handled by the histones, and when they roll it up is decided by what tags are attached to the labels.
Re: (Score:1)
If anyone had taken assembler and machine coding back in the old days of computing, they'd get it. You only have so much to code with, so you make it do multiple things.
A better analogy would be a huge bloated computer program that evolved over many decades - where changing (or removing) one little thing in one place can break things in dozens of other unexpected places - but where if you were to rewrite the entire thing from scratch you could reduce the size of the code base by a factor of a hundred while still preserving all the functionality (and also eliminating lots of bugs).
Very few biologists would imagine that you could go through the human genome and excise all th
Re: (Score:2)
If anyone had taken assembler and machine coding back in the old days of computing, they'd get it. You only have so much to code with, so you make it do multiple things.
A better analogy would be a huge bloated computer program that evolved over many decades - where changing (or removing) one little thing in one place can break things in dozens of other unexpected places - but where if you were to rewrite the entire thing from scratch you could reduce the size of the code base by a factor of a hundred while still preserving all the functionality (and also eliminating lots of bugs).
Very few biologists would imagine that you could go through the human genome and excise all the "junk" regions and still end up with a healthy human. But many would agree that some hyper-intelligent entity could almost certainly design a new species that looked and acted human but with a genome that was a hundred times smaller.
No doubt we were intelligently designed to appear to have been the result of thousands and thousands of years of trial and error for some mysterious reason that is beyond the comprehension abilities of us mere mortals.
Or maybe we were intelligently designed with all that extra "code" so as to be able to evolve should it become necessary.
I have an unshakeable, almost religious faith in the ID proponents' ability to come up with some sort of explanation of how evolution never happened because pocketwatches.
Re: (Score:2)
And it's nothing unusual in the animal world. The difference is even more glaring in plants - a good old Arabidopsis is just 135 Mbp and Paris japonica is 150 Gbp. That's a difference of three orders of magnitude between plants that have no really speci
WhatCouldPossiblyGoWrong (Score:2)
We let the machines reverse engineer the Homo sapiens genome. Next step, a vaccine to eradicate the infection from the planet.
Junk DNA (Score:3, Insightful)
Um, I'm autistic, and... (Score:1)
I don't really feel like I have any sort of disease.
I love how everybody but autistic people wants to cure autism.
I mean we very seldom lie or use subterfuge, and yet WE'RE the ones that need to be cured? And what about the awesome abilities some of us have with math and various things? For all anyone knows, we're the next step on the evolutionary ladder and the only cost is a little special attention when we're children.
Re: (Score:2)
"WE'RE the ones that need to be cured?"
No, probably not you. You do realize that autism is a spectrum disorder, right? I have a couple of very close friends who are diagnosed with a mild form and a few who are not diagnosed but probably could be. They surely do NOT need "cured". I also have friends who have worked as caretakers for people who had it so bad they were unable to communicate, dress themselves, etc., and will never know a day of independence.
"For all anyone knows, we're the next step on the evolut