Biotech Science

New Method To Revolutionize DNA Sequencing 239

Posted by ScuttleMonkey
from the start-saving-up-to-buy-a-clone dept.
An anonymous reader writes "A new method of DNA sequencing published this week in Science identifies incorporation of single bases by fluorescence. This has been shown to increase read lengths from 20 bases (454 sequencing) to >4000 bases, with a 99.3% accuracy. Single molecule reading can reduce costs and increase the rate at which reads can be performed. 'So far, the team has built a chip housing 3000 ZMWs [waveguides], which the company hopes will hit the market in 2010. By 2013, it aims to squeeze a million ZMWs [waveguides] onto a single chip and observe DNA being assembled in each simultaneously. Company founder Stephen Turner estimates that such a chip would be able to sequence an entire human genome in under half an hour to 99.999 per cent accuracy for under $1000.'"
This discussion has been archived. No new comments can be posted.


Comments Filter:
  • 99.3% accurate? (Score:5, Insightful)

    by Valdrax (32670) on Monday January 05, 2009 @02:28PM (#26332679)

    That's, what, 28 incorrect base pairs out of 4000? I'm not a biologist, but is this considered an acceptable error rate? Even the hoped-for 99.999% accuracy seems really awful when there are about 3 billion base pairs in a human genome.

    I realize that we aren't going to be trying to make a cloned copy from this data, but what uses is this "good enough" for?

    • Re:99.3% accurate? (Score:5, Insightful)

      by imamac (1083405) on Monday January 05, 2009 @02:31PM (#26332727)

      I realize that we aren't going to be trying to make a cloned copy from this data...

      What makes you so sure? Who knows where this will lead?

    • It's not too bad. I don't think the human version of the polymerase has a better error rate. However, inside a living cell, DNA replication also has other integrity checks.

    • Re: (Score:2, Insightful)

      by aoeusnth (101740)

      That's, what, 28 incorrect base pairs out of 4000? I'm not a biologist, but is this considered an acceptable error rate? Even the hoped-for 99.999% accuracy seems really awful when there are about 3 billion base pairs in a human genome.

      I realize that we aren't going to be trying to make a cloned copy from this data, but what uses is this "good enough" for?

      More than good enough for forensic work at least, I'd wager.

      • At the very least, this method can be a cheap way to acquit suspects. Those that come up positive can ask for the more accurate test.
    • Re:99.3% accurate? (Score:5, Interesting)

      by Maximum Prophet (716608) on Monday January 05, 2009 @02:34PM (#26332789)
      How many errors are introduced during normal human reproduction? The dogs they've cloned so far are less than 99.999% identical.
      • Re:99.3% accurate? (Score:5, Informative)

        by peter303 (12292) on Monday January 05, 2009 @03:22PM (#26333547)
        The DNA base-pair copy error rate is about one in 10^8. Even so, that's around 60 errors when a sperm meets an egg. There are many more when a trillion somatic cells divide, on average, 50 times each over a human lifetime. The vast majority of errors are neutral, but accumulating ten or so specifically unlucky ones in a single cell may lead to a cancer.
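        The arithmetic behind that ~60 figure checks out. A back-of-the-envelope sketch (it treats the one-in-10^8 rate as a per-base, per-replication error rate, which is an assumption on my part):

        ```python
        genome_size = 3_000_000_000  # base pairs in a haploid human genome

        # About one replication error per 10^8 base pairs copied:
        errors_per_copy = genome_size / 10**8    # new errors per genome copy
        errors_per_zygote = 2 * errors_per_copy  # sperm and egg each contribute a copy

        print(errors_per_copy)    # 30.0
        print(errors_per_zygote)  # 60.0
        ```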
        • by Fluffeh (1273756)
          While finding these error rates and posts like the parent are cool, what made me really LAUGH in this article wasn't the error count, or the lack of it, or the timing, but that even projecting out to 2013, they knew exactly how much they were going to charge for this.

          Talk about medicine for healing rather than profiteering eh?
          • If the article had stated that the cost were $1,000,000 to do the sequence, then the potential applications of the technology would be severely limited. Getting the cost (not the charge for the service) down creates the opportunity for more studies to be performed, more financial accessibility for patients, and less reason for insurance companies or Medicare to deny covering the study when it's indicated.

            In medicine, the cost of a study, as well as its reliability, availability, and predictive val

      • by aliquis (678370)

        Little I hope, the girl who wants to mate with me has to be really retarded and I don't want that to affect our kids :D

    • Re:99.3% accurate? (Score:4, Informative)

      by Anonymous Coward on Monday January 05, 2009 @02:35PM (#26332813)

      It's common practice in bioinformatics to measure the same data repeatedly in an effort to reduce the error. While 0.993 isn't very good, (0.993)^3 is pretty awesome. In practice, the errors might be correlated (as in a flaw in the measuring system), so the benefit of re-measuring might not be exponential...however it should be darn close.

      • That formula isn't quite accurate. Let's say on the first sequencing you got A and on the second you got B. You don't know which one was the error, so you'd have to test enough times that you felt confident about what the real value was. A minimum of 3 tests would be necessary to even be fairly sure about the result, but more than that would be needed to have really high accuracy.
        • Re: (Score:3, Funny)

          by evilNomad (807119)

          If you got B on the second run you'd be pretty sure it was incorrect.. ;-)

          • The point is, which pieces are incorrect? It's highly likely to be slightly incorrect, but you want it to be highly likely to be completely correct.
          • Ha! Should have stuck to actual base pairs in my example...
        • Re:99.3% accurate? (Score:5, Insightful)

          by scorp1us (235526) on Monday January 05, 2009 @02:53PM (#26333061) Journal

          There is a saying from the old sailing days: "Never set sail with two compasses." One is OK, three is better, but never two. The paralysis from not knowing which is right is far worse than being wrong and correcting later.

        • by shaitand (626655) on Monday January 05, 2009 @03:13PM (#26333395) Journal

          If you were sequencing DNA and got a B then you'd seriously need to recheck the equipment (or the competence of the operator). Perhaps a T or a G, or even a C but never a B.

        • Re: (Score:3, Interesting)

          This assumes that the method simply has a random chance of getting each data point wrong. What if it is something systematic with the method that causes it to read one gene wrong? In other words, it reads the gene as a 'T' every time despite it really being an 'A'. No matter how many tests you run, it will still result in a wrong answer.

          • Re: (Score:2, Informative)

            by Anonymous Coward

            If you RTFP (subscription required), no systematic errors were detected:
            http://www.sciencemag.org/cgi/content/full/323/5910/133

          • nitpick (Score:3, Informative)

            by Zenaku (821866)

            One base-pair does not a gene make.

            • Re: (Score:3, Insightful)

              by TheMeuge (645043)

              One base-pair does not a gene make.

              But a one base-pair change can unmake the gene pretty well.

              Tons of major debilitating mutations are due to a point mutation.

          • Re: (Score:2, Interesting)

            by amalyn (1405799)
            Systematic errors could be identified by correlating results with other DNA sequencing results.

            Using a large sample, like the proposed Personal Genome Project [unsure if they have gotten in touch with any of those who expressed interest in participating] could be useful in showing any systematic mis-reads, as long as the Personal Genome Project is using another method to sequence the participant's DNA.
      • Re: (Score:2, Informative)

        by prograde (1425683)

        It's common practice in bioinformatics to measure the same data repetitively in an effort to reduce the error.

        It's common practice on Slashdot to read the article before posting. From the abstract of the Science article:

        Consensus sequences were generated from the single-molecule reads at 15-fold coverage, showing a median accuracy of 99.3%, with no systematic error beyond fluorophore-dependent error rates.

        So that's 99.3% after averaging 15 reads. Not exactly replicating the same read 15 times..more like taking random starting points and aligning the results where they overlap, so that each base is covered in 15 different reads.

        Don't get me wrong - this is really cool, and a massive speed-up over current "next-gen" sequencing. And I'm sure that it will get better.

        To answer the GP - yes, this i
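        The consensus step prograde quotes can be pictured as a per-position majority vote over aligned reads. A toy illustration (not the paper's actual pipeline; real assemblers also have to handle alignment gaps and per-base quality scores):

        ```python
        from collections import Counter

        def consensus(aligned_reads):
            """Majority vote at each position across equal-length aligned reads."""
            return "".join(
                Counter(column).most_common(1)[0][0]
                for column in zip(*aligned_reads)
            )

        # Five reads of the same template; two of them carry single-base errors.
        reads = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTAC", "ACGTTC"]
        print(consensus(reads))  # ACGTAC
        ```

        With 15-fold coverage, as in the paper, a per-read 0.7% error has to hit the same position in many reads before it survives the vote.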

        • It's common practice on Slashdot to read the article before posting.

          In what alternate universe are you reading slashdot?

          prograde (1425683)

          Oh. You'll figure it out, given time.

      • by timeOday (582209)

        While 0.993 isn't very good, (0.993)^3 is pretty awesome.

        No, 0.993^3 is only 97.9%; how about 1-(1-0.993)^3 :)

      • by sorak (246725)

        It's common practice in bioinformatics to measure the same data repeatedly in an effort to reduce the error. While 0.993 isn't very good, (0.993)^3 is pretty awesome. In practice, the errors might be correlated (as in a flaw in the measuring system), so the benefit of re-measuring might not be exponential...however it should be darn close.

        I don't mean to trample your point, but three iterations wouldn't give (0.993)^3 accuracy (which would equal 97.9%). The probability of a correct call would be 1-(0.007)^3, which is actually 99.9999657%.

      • by aliquis (678370)

        lol, you got moderated 5 informative but your math is flawed.

        Do it three times to get 97.9% accuracy instead of 99.3? Great! :D

        Guess it should be something like:
        1 - ((1 - 0.993)^3) = 0.999999657 ? =P
        Actually I have no idea, too tired to think about it, but not 0.993^3 at least :)
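      For what it's worth, the three numbers bouncing around this subthread are easy to check in a few lines (a sketch assuming per-base errors are independent across reads, which a flaw in the instrument would break; "majority of 3" is the procedure you can actually apply, since you don't know a priori which read is right):

      ```python
      p = 0.993  # single-read per-base accuracy
      e = 1 - p  # single-read per-base error rate

      three_in_a_row = p**3                 # all three reads correct: the GP's figure
      at_least_one   = 1 - e**3             # at least one read correct (but which one?)
      majority_of_3  = p**3 + 3 * p**2 * e  # at least 2 of 3 reads agree on the right base

      print(round(three_in_a_row, 4))  # 0.9791
      print(round(at_least_one, 9))    # 0.999999657
      print(round(majority_of_3, 6))   # 0.999854
      ```

      So voting over 3 reads gets you to roughly 99.985% per base, not the 99.99997% figure; that one is the chance that at least one read is right, which you can't exploit without knowing which read to trust.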

    • by jbeaupre (752124)

      Given the expense of doing an entire genome, the alternative is a 25% accuracy rate (a blind guess among four bases). What 99.3% does is let you do a bulk scan looking for interesting areas. Prospecting. Now you can adjust therapies to match likely genome sequences. "Ah, Ms. X, I see you likely have gene XYZ. Medications A, B, and C won't work for you, so let's try D."

      In other words, you don't need perfect results to bias the odds in your favor.

    • Re:99.3% accurate? (Score:5, Insightful)

      by ccguy (1116865) on Monday January 05, 2009 @02:37PM (#26332825) Homepage
      Well, depends if those 28/4000 errors are the same in each run or not.

      If they can sequence the whole thing in less than 30 minutes one time with a 0.001% "read" error rate, my guess is that they can get it probabilistically near 100% correct in 2 hours or so.

      By the way, what's the current error rate? Is it 0? (just asking)
      • Re:99.3% accurate? (Score:5, Interesting)

        by Adriax (746043) on Monday January 05, 2009 @03:24PM (#26333599)

        Or you could run a parallel processing setup, 3-5 sequencing chips all given the same sample at the same time. More expensive, but you'd get that effective 100% rate in the half hour time.

        $5k for a genetic sequencer that could give effectively 100% accuracy in half an hour would be a pittance for pretty much every hospital in the US.
        Hell, the first malpractice lawsuit it prevents (detect a disorder that would make a commonly used treatment crippling or fatal to the patient) would pay for the machine 1000 times over.

      • Don't know - but it doesn't need to be 0 to be reliable. The genome is quite resistant to random mutation, having been subjected to it for all its existence.

    • by morgan_greywolf (835522) on Monday January 05, 2009 @02:37PM (#26332831) Homepage Journal

      1 Hour Genome Sequencing: 30,000 errors or less or YOUR MONEY BACK!

    • by Stile 65 (722451)

      Do it several times over with different cells and "vote" on the inconsistencies between trials. If 5 out of 7 copies of the DNA look like the base at position X is thymine, then it's most likely thymine.

    • by msh104 (620136)

      I suppose that running it twice or thrice will increase the accuracy a lot more, which makes it still blazing fast.

    • by Hatta (162192)

      When I have sequences done the conventional way, I get less than 1000 base pair reads back. Generally 2 or 3 are ambiguous enough that the machine reads them incorrectly or not at all. 28 out of 4000 is the same as 7 out of 1000, so this is roughly the same magnitude of error. Less accurate than what we use now, but more economical to do really large sequences.

      I don't know how the method works (the site is slashdotted; anyone got a DOI for the paper?) so it's hard to tell whether repeating the reads would

    • Re: (Score:2, Insightful)

      by Anonymous Coward

      Re: mistakes and inaccuracies...

      You run two or three trials and do a checksum, à la RAID interleaving... errors stand out and are discarded.

    • That's, what, 28 incorrect base pairs out of 4000? I'm not a biologist, but is this considered an acceptable error rate? Even the hopes of 99.999% accuracy seems really awful when there are about 3 billion base pairs in a human genome.

      That's a very good question, but consider that 100% is impossible. Even the cell's own machinery, under development for millions of years, makes mistakes at a frequency that would be lethal if that's all there was.

      In this case, the error rate seems in the neighborhood of rival technologies. The way to deal with it is the same one the cell uses: redundancy. Sequence segments, or the whole thing, more than once and the likelihood of erroneous bases is significantly decreased. If you run 3 sequencings, there's ev

      • ... and to prove that last point I just realized that I was redundant with the non-coding DNA and introns. I think. No wait, I meant to do that, this way if you misread "introns" it will still be covered by the "non-coding DNA" bit. And that's the last biochemistry joke out of me today.

    • There's an easy and obvious way around this - just run 3 simultaneous instances and error-check by consensus. Still able to run the whole thing in under a half hour and still pretty cheap at ~$3000.

    • The idea is to sequence each portion of the genome many times over. With enough redundancy you can detect these errors, so your final annotated copy would have a much higher accuracy.

    • I suspect, given that this system is fast, massively parallel, and not terribly accurate, that (should it reach production use) its users will end up relying on multiple runs and clever error correction techniques. As long as the .7% errors are random rather than systematic, you should be able to reduce the effective error rate by using "best x out of y", with X and Y chosen for your budget and risk tolerance.

      It probably also will find a niche, in spite of its error rate; because there is just so damn muc
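    The random-vs-systematic distinction above is the crux for repeated reads. A quick Monte Carlo sketch (all parameters and the `noisy_read`/`vote` helpers are mine for illustration, not the paper's method; the systematic case hypothetically misreads every A as T):

    ```python
    import random
    from collections import Counter

    BASES = "ACGT"

    def noisy_read(seq, p_err, systematic=None):
        """One read of seq. Random errors occur with probability p_err per base;
        systematic=('A', 'T') means every A is always misread as T."""
        out = []
        for b in seq:
            if systematic and b == systematic[0]:
                out.append(systematic[1])
            elif random.random() < p_err:
                out.append(random.choice([x for x in BASES if x != b]))
            else:
                out.append(b)
        return "".join(out)

    def vote(reads):
        """Per-position majority vote across reads."""
        return "".join(Counter(col).most_common(1)[0][0] for col in zip(*reads))

    random.seed(1)
    truth = "".join(random.choice(BASES) for _ in range(4000))

    random_only = vote([noisy_read(truth, 0.007) for _ in range(15)])
    systematic  = vote([noisy_read(truth, 0.007, systematic=("A", "T")) for _ in range(15)])

    rand_errors = sum(a != b for a, b in zip(truth, random_only))
    sys_errors  = sum(a != b for a, b in zip(truth, systematic))

    print(rand_errors)  # independent errors get voted away
    print(sys_errors)   # every A in the template is still wrong
    ```

    With 15 independent reads the random 0.7% errors essentially vanish, while the systematic misread survives every rerun, which is why the "no systematic error" claim in the paper matters so much.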
    • by cat_jesus (525334)
      Do the sequencing three times. So you spend 1.5 hours instead of .5 hours. If you're really worried do it 7 times. That should be enough to weed out the errors.
    • So, instead of $1000, you must spend $3000 and get in one hour the same result that took 10 years and several million dollars to get in the '90s... I see, that is useless.

    • You can repeat the analysis several times to increase accuracy, I would think. Do the analysis 3 times and it might increase to 99.9999997%, for example.
    • Re: (Score:2, Insightful)

      by m93 (684512)

      I realize that we aren't going to be trying to make a cloned copy from this data, but what uses is this "good enough" for?

      It's most likely good enough to deny you health coverage. Pre-existing condition? Now risk can be assessed on pre-existing genes.

    • The final estimate is 99.999%, presumably including some redundancy or error checking.

      This is still one error in 100,000 base pairs, which is tens of thousands of errors in the 3 billion pairs of the human genome.

      Of course, we're not talking about computer code here - this code has withstood (or rather adapted to) millions of years of inaccurate copying. Most of the errors would probably hit the vast majority of inactive areas, or be compensated by the natural redundancy mechanisms.

    • Run the data among multiple chips for verification. Run amongst enough chips, the errors should be detectable/correctable.

    • by aled (228417)

      That's, what, 28 incorrect base pairs out of 4000? I'm not a biologist, but is this considered an acceptable error rate?... but what uses is this "good enough" for?

      three-eyed fish?

    • by AmigaMMC (1103025)
      99.3% accuracy might not seem like much when looking at 3 billion pairs, but remember that a lot of those pairs are dormant or useless (AFAWK).
  • by morgan_greywolf (835522) on Monday January 05, 2009 @02:30PM (#26332713) Homepage Journal

    Sub-$1000 genome sequencing will put the creation of 'designer' kids into the realm of the affordable for much of the middle class. Scary stuff. Now we just need to combine that with cheap and reliable cloning techniques and my plans for world domination will be complete!

  • Abstract:

    We present single-molecule, real-time sequencing data obtained from a DNA polymerase performing uninterrupted template-directed synthesis using four distinguishable fluorescently labeled deoxyribonucleoside triphosphates (dNTPs). We detected the temporal order of their enzymatic incorporation into a growing DNA strand with zero-mode waveguide nanostructure arrays, which provide optical observation volume confinement and enable parallel, simultaneous detection of thousands of single-molecule sequenc

  • by djupedal (584558) on Monday January 05, 2009 @02:35PM (#26332809)
    > Company founder Stephen Turner estimates that such a chip would be able to sequence an entire human genome in under half an hour to 99.999 per cent accuracy for under $1000.

    I think this qualifies as a true 'technological singularity' [wired.com] :)
    • Shotgun sequencing depends heavily on supercomputers. That's a thousand-fold increase every 15 years right there. Multiply that by more intelligent software, a better understanding of genetics, and better sequencing hardware, and you may be squaring that rate.
    • by 4D6963 (933028)
      What does genome sequencing have to do with Moore's law or the technological singularity?
  • error correction (Score:3, Insightful)

    by bugs2squash (1132591) on Monday January 05, 2009 @02:39PM (#26332859)
    Is there not some form of error correction in the sequence itself that could be exploited?

    Something like the error correction on an audio compact disc?
  • Article in Science (Score:2, Informative)

    by prograde (1425683)

    I assume that the hardware at Science can withstand a slashdotting better than the crappy blog linked in the summary:

    http://www.sciencemag.org/cgi/content/abstract/323/5910/133 [sciencemag.org]

  • 1/2 hour for $1000, eh? And in another 5-10 years we'll cut that in half or more, both time and cost. It looks like the instant gene sequencing tech from GATTACA will be with us in most of our lifetimes. But even with this announced breakthrough it'll be functionally the same.

    • Forensic genetic identification currently uses about 60 important genetic markers. That's good enough to convict in a court of law, since the chance of a duplicate may be less than one in a billion, depending on the marker combination.

      Although humans differ from one another in about 0.1% of base pairs, for a total of 3 million, the number of differences that describe human variability may be vastly smaller than this. First you discard non-coding DNA, which gets you down to 30,000.
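      The grandparent's "billion to one" figure is just a product of independent per-marker match frequencies. A toy sketch (the 0.1 frequency and the 9-marker count are illustrative numbers I picked, not real forensic statistics):

      ```python
      per_marker_freq = 0.1  # assumed fraction of people sharing a given marker value
      n_markers = 9          # illustrative count; real panels use more markers

      # With independent markers, random-match probability is the product:
      random_match_prob = per_marker_freq ** n_markers
      print(random_match_prob)  # about 1e-09: one in a billion
      ```

      Real marker frequencies vary by locus and population, so forensic labs multiply measured allele frequencies rather than a flat rate, but the compounding logic is the same.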
      • Re: (Score:3, Insightful)

        by nwf (25607)

        Although humans differ from one another in about 0.1% of base pairs, for a total of 3 million, the number of differences that describe human variability may be vastly smaller than this. First you discard non-coding DNA, which gets you down to 30,000.

        Except that when our differences are so small, the non-coding regions are even more important. They control what genes are active and to what degree. That's nearly as important as the genes themselves.

        Genes are only part of the puzzle. You need to know what to do with them, and non-coding regions provide some of that along with the cellular machinery.

        Scientists used to call them "junk" DNA where junk == "I can't figure it out". Why would cells spend all that energy maintaining something useless? Not very li

  • This technique should be a shoo-in for the Archon X Prize [xprize.org].
  • I guess they didn't have the foresight [foresight.org] to use a real host.

    I'll be here all night, folks; try the steak.

  • I believe the true benefit of this technology will not be for cloning, but for general medicine. For example, you would go to the doctor with a lump, and instead of the usual sequence of biopsy, cancer diagnosis, chemo, invasive surgery, etc., they would first take your DNA, sequence it, then take the biopsy and identify the origin of the cancer (is the lump actually metastasized from your pancreas?). Then they could work on a cure based on your genetic makeup, rather than a shotgun approach. Additionally medicines f
  • by chihowa (366380) on Monday January 05, 2009 @03:23PM (#26333585)
    It looks to be inaccessible. Here are the abstract [sciencemag.org] and fulltext [sciencemag.org] links.
  • This is just a historical accident; 99.9% was what could be done with what people judged "reasonable" effort and cost a few years ago. Unless you know what you are going to use the sequence for, you don't know what error rate is acceptable.

    There are medical tests that rely on DNA sequence; e.g., Myriad makes a fortune from sequencing the gene that gives women hereditary breast cancer. I don't know what their claimed error rate would be, but that would show you what is acceptable in today's clinical marketplace.

    A

  • Grammar ambiguity (Score:3, Interesting)

    by nsayer (86181) * <nsayer AT kfu DOT com> on Monday January 05, 2009 @03:56PM (#26334065) Homepage

    Company founder Stephen Turner estimates that such a chip would be able to sequence an entire human genome in under half an hour to 99.999 per cent accuracy for under $1000

    Does that mean that the chip costs $1000 or that each human genome processed costs $1000?

  • http://www.sciencemag.org/cgi/content/abstract/323/5910/133 [sciencemag.org]

    Abstract:

    We present single-molecule, real-time sequencing data obtained from a DNA polymerase performing uninterrupted template-directed synthesis using four distinguishable fluorescently labeled deoxyribonucleoside triphosphates (dNTPs). We detected the temporal order of their enzymatic incorporation into a growing DNA strand with zero-mode waveguide nanostructure arrays, which provide optical observation volume confinement and enable parallel, sim

  • by Fnord666 (889225) on Monday January 05, 2009 @04:28PM (#26334497) Journal
    Here [newscientist.com] is an article in New Scientist about the new process. It explains it fairly well and even defines what a ZMW is.
  • Error Correction (Score:2, Insightful)

    by LUH 3418 (1429407)
    Many people seem concerned about the reading error rate. However, as it's been pointed out, it should be easy enough to read a DNA sequence multiple times (or read the whole genome multiple times) to decrease the error rate significantly. If you have one chip that can read the entire human genome in 30 mins, you can have the same chip read it twice in an hour, or four chips reading four copies in 30 mins.

    Furthermore, if you're using a technique like this to map a person's genome, you can be clever about
  • The 454 pyro-sequencer currently produces 400bp reads, not 20bp. Granted, that's still a fair bit shorter than this experimental tech claims, but it's also a commercialized product you can actually buy right now. I think it would only be fair to quote current performance figures.
    • by lbbros (900904)
      We have one of these beasts where I work. It's tremendously expensive to run (~$5K for a single run, although you sequence 40 million bases in 200-400 bp reads), and the most daunting part is the data analysis. So far there are three people who work just on what that sequencer crunches, and it took quite a while to develop efficient workflows.
