The 1000 Genomes Project

Posted by kdawson on Wed Jan 23, 2008 03:18 AM
from the reaching-for-statistical-significance dept.
jd writes "An international consortium of specialists in genetics has announced the 1000 Genomes Project, in which at least 1,000 people from around the world will have their genomes fully sequenced as part of an effort to discover the relationship between genetics and disease. At present, over 100 regions of DNA are known to be related to illnesses, but the maps that exist are vague and are drawn from an extremely small population pool. According to the article, this results in the need for slow, expensive, and laborious studies to pinpoint causes, especially for rarer conditions. This project aims to find conditions that might only appear once in every 2,000 people (though how they intend to do that with half that number is unclear). The researchers hope to massively speed up the diagnosis of genetically linked illnesses and to improve the reliability of such diagnoses."
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • Chinese (Score:2, Informative)

    by Anonymous Coward
    I wonder why there's so much funding coming from China for this project.
    You can see the list of all participants (including funders) here [1000genomes.org].
    • Re: (Score:3, Informative)

      The three countries with groups funding this are the US, China, and the UK, although there is no indication of how much each is contributing. In each case the funding comes from organisations that exist to further science, in some cases specifically genome research. If you look at the other elements of the study you will see that pattern repeated, so I guess it is a case of (in the words of Jim Hacker) "great nations working together to answer the great questions of our age."
  • Selection (Score:5, Insightful)

    by mastershake_phd (1050150) on Wednesday January 23 2008, @03:24AM (#22150358) Homepage
    This project aims to find conditions that might only appear once in every 2,000 people (though how they intend to do that with half that number is unclear).
     
    Well, they could sequence the DNA of people known to have rare diseases.
    • Unfortunately, I think their sensitivity to people's privacy will prevent them from doing just what you suggested.

      From TFA:

      These people will be anonymous and will not have any medical information collected on them, because the project is developing a basic resource to provide information on genetic variation. The catalog that is developed will be used by researchers in many future studies of people with particular diseases.
    • RTFA (Score:4, Insightful)

      by RML (135014) on Wednesday January 23 2008, @04:51AM (#22150752)
      There are other projects that sequence the DNA of people known to have rare diseases such as cystic fibrosis, and there are projects that sequence the DNA of people with common diseases like heart disease, but we don't know much about the variants in the middle that are neither very common nor very rare. This is an attempt to fill in that gap in our knowledge.
    • Rare Conditions (Score:4, Informative)

      by MassiveForces (991813) on Wednesday January 23 2008, @04:58AM (#22150776)
      Finding diseases that eventuate in 1 in 2000 people with a genomic study of 1000 people is entirely possible: with one thousand people you have two thousand sets of genes. Since many genetic diseases require two copies of the same recessive allele (usually the result of a broken gene), there would be lots of carriers: people with a single disease allele, which could be spotted as, for example, a major deletion relative to the genomic reference sequence.
  • 1 in 2000 people (Score:5, Informative)

    by rsidd (6328) on Wednesday January 23 2008, @03:29AM (#22150392)

    This project aims to find conditions that might only appear once in every 2,000 people (though how they intend to do that with half that number is unclear)

    Let's try to make it clearer, then.

    The probability that a given condition appears in an individual is 1 in 2000, or 0.0005. The probability that it does not appear in that individual is 0.9995. The probability that it does not appear in any of 1000 individuals is 0.9995^1000 = 0.6 approximately; and the probability that at least one of the 1000 individuals has it is 0.4. Not bad at all. (If you used 2000 people, the probability that at least one of them would have it would improve to about 0.6.)

    Suppose you aren't interested in just one condition, but in lots of conditions -- say, ten of them, each with the same 1 in 2000 frequency. The probability that at least one individual would have at least one of those conditions is 1 - 0.9995^(1000*10) = 0.993, i.e., practically certain.

    They really ought to teach basic probability theory in schools...
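
    The arithmetic in this comment can be checked with a short Python sketch. The function name and parameters are made up for illustration, and it assumes, as the comment does, that every condition has the same prevalence and that individuals and conditions are independent.

```python
def prob_at_least_one(prevalence: float, sample_size: int, conditions: int = 1) -> float:
    """Chance that at least one person in the sample has at least one condition.

    Assumes equal prevalence for every condition and full independence
    between people and between conditions (a simplification).
    """
    # Probability that nobody has any of the conditions, then take the complement.
    return 1.0 - (1.0 - prevalence) ** (sample_size * conditions)


one_condition_1000 = prob_at_least_one(1 / 2000, 1000)
one_condition_2000 = prob_at_least_one(1 / 2000, 2000)
ten_conditions_1000 = prob_at_least_one(1 / 2000, 1000, conditions=10)

print(f"1 condition, 1000 people:   {one_condition_1000:.3f}")
print(f"1 condition, 2000 people:   {one_condition_2000:.3f}")
print(f"10 conditions, 1000 people: {ten_conditions_1000:.3f}")
```

    The three printed values should land close to the 0.4, 0.6, and 0.993 quoted above.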

    • by nacturation (646836) <nacturation AT gmail DOT com> on Wednesday January 23 2008, @03:57AM (#22150562) Journal

      This project aims to find conditions that might only appear once in every 2,000 people (though how they intend to do that with half that number is unclear)

      Let's try to make it clearer, then.

      The probability that a given condition appears in an individual is 1 in 2000, or 0.0005. The probability that it does not appear in that individual is 0.9995. The probability that it does not appear in any of 1000 individuals is 0.9995^1000 = 0.6 approximately; and the probability that at least one of the 1000 individuals has it is 0.4. Not bad at all. (If you used 2000 people, the probability that at least one of them would have it would improve to about 0.6.)

      Suppose you aren't interested in just one condition, but in lots of conditions -- say, ten of them. The probability that at least one individual would have at least one of those conditions is 1 - 0.9995^(1000*10) = 0.993, i.e., practically certain.

      They really ought to teach basic probability theory in schools...

      Your post is like that scene in Indiana Jones with the guy making some really impressive sword moves, looking all menacing... while Indy just pulls out his revolver and shoots him. You could get a whole room full of geeks cranking numbers and arguing over how many people they would need to find in order to exceed a particular probability that any one participant has Lou Gehrig's disease, while a simpler person would leave the room, come back the next day, and say "Hey guys, meet my neighbor Bob... he has Lou Gehrig's."
       
        • Re: (Score:3, Funny)

          Or, in other words, Indy's companions are always arguing and therefore geeks.
    • Re: (Score:3, Insightful)

      they really ought to teach basic genetics in schools.

      you neglect the fact that each person has two sets of genes, one inherited from their mother, the other from their father. that brings the total number of genes to 2000 sets. and it's also likely they're interested in many more than ten conditions. so you should think more in terms of a probability density function of conditions found versus their rarity.

      • I thought this too (two sets of genes) - but it's useless if they find a gene for a rare disease in a person if it's not expressed (and hence not detected by the researchers). Hence having two sets of genes does nothing but complicate things further (as they now have to find which particular gene out of the two is the one causing the problem).

        Furthermore another issue is that the genome is one huge causality network - for all but the simplest disorders you'll need to have a cascade of genes to get a p
        • well, that's what computers are for, sifting haystacks. and surely they're interested in far more than just rare diseases. most all of us end up taking a handful of pills by the time we're 65. cardiovascular disease, cancers, and dementia are where the money's at.
            • Re: (Score:3, Interesting)

              I'm not convinced it's coincidental that Google's research space was announced shortly before this project. I suspect Google is/was thinking along very similar lines. BLAST may be adequate for many things, but GoogleBLAST would be about what it would take to crunch any significant collection of entire human genomes.
    • It's sort of right. Usually the phenotype will be recessive - so two bad copies need to exist for the condition to be seen, but only one bad copy needs to exist for it to be a useful sequence. For example, the frequency of cystic fibrosis in Caucasians is 1/400, but the allele frequency is 1/20. So you need to look at the square root, which gives you a much higher probability of a hit. (BTW, the frequency in Asians is I believe on the order of 1/500,000 so CF could be cured simply by outbreeding - and no
    • Also, don't forget that each person has two haplotypes, one from each parent, so
      when one sequences a person, one captures the variation on two human genomes at once.

      Of course, this all relies on the coverage you sequence at, and one option for
      the 1,000 genomes project is doing this at low (2x?) coverage, using pretty sophisticated
      methods to combine statistical power between sample datasets.

      The "1,000" though is more a round number that is in the right range. It might well be
      1346 people or something like th
    • They really ought to teach basic probability theory in schools...

      Or maybe basic biology? The Hardy-Weinberg equation plus a little basic algebra solves the problem:

      p + q = 1

      p^2 + 2pq + q^2 = 1

      p and q are the frequencies of the two variants of a specific gene (assuming there are only two variants, but let's KISS). Each organism has two copies of a given gene. They can be pp, pq, or qq. So the frequencies of the p genes and the q genes must add up to 100%. And the fractions of people who are pp, pq, or qq must add up to 100%, hence the two equations.
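
      As a rough illustration of the two equations, here is a minimal Python sketch. It assumes a simple autosomal recessive condition in Hardy-Weinberg equilibrium, and it borrows the 1 in 2000 figure from the story summary purely as an example; the function names are hypothetical.

```python
import math


def allele_freq_from_recessive_prevalence(prevalence: float) -> float:
    """For a recessive condition, affected people are qq, so q^2 = prevalence."""
    return math.sqrt(prevalence)


def hardy_weinberg(q: float) -> dict:
    """Genotype frequencies p^2, 2pq, q^2 for allele frequencies p and q = 1 - p."""
    p = 1.0 - q
    return {"pp": p * p, "pq": 2 * p * q, "qq": q * q}


q = allele_freq_from_recessive_prevalence(1 / 2000)
freqs = hardy_weinberg(q)

print(f"disease allele frequency q: {q:.4f}")
print(f"unaffected non-carriers (pp): {freqs['pp']:.4f}")
print(f"carriers (pq):                {freqs['pq']:.4f}")
print(f"affected (qq):                {freqs['qq']:.4f}")
print(f"total:                        {sum(freqs.values()):.4f}")
```

      Under these assumptions carriers come out far more common than affected people, which is why a sample of about 1000 people can still contain copies of an allele behind a 1 in 2000 condition.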

      In the case of a simple autosomal recessive gene, t

  • by Aereus (1042228) on Wednesday January 23 2008, @03:34AM (#22150426)
    I have no idea what they plan to do with 1000 gnomes, but I can only guess that whatever it is will end in a giant explosion.
  • This is really awesome. For too long, the "human genome" has been what we know of a few guys who ran the HGP. Since then, many more have been sequenced but not systematically, and not for the sole purpose of cataloging the countless variations present. This sort of database is the first giant leap towards effectively creating a solid understanding of human variation, allowing us to perfect everything from medical treatment to diet supplements (the GATTACA option in the poll is so relevant). Really, this
  • by Biotech9 (704202) on Wednesday January 23 2008, @04:09AM (#22150600) Homepage
    Anyone reading up on the progress in genomics over the last decade has seen the huge leaps in speed and accuracy and the insane cuts in cost to work with nucleic acids.

    At the lab level, what used to be a week's work with lots of chemicals and processing is now usually a 20-minute protocol with a kit from Qiagen. What used to be massive amounts of work with hundreds of gels, digestions, and labeling steps to analyse nucleic acid sequences is now a few days with an Affymetrix kit, giving far more accurate and usable results. Across every step this progress has been rapid.

    And in the near future, say within a decade, all these methods will become outdated and be replaced with near-realtime analysis and diagnosis. The best point in all of this is that no matter how advanced medical tech has become, the limiting factor has been that it's necessary to actually BRING your disease-ridden body to the hospital or doctor. The rise of companies like www.decodeme.com [decodeme.com] is what I expect DNA assessment to be like in the future. You send off some samples you scrape off your cheek yourself, and within a few days you get a full diagnosis on any known predisposition to disease or genetic problems.

    Which is why a lot more attention should be put into the debate on morality and genetic profiling. It's going to be here before you can blink, so it might be nice to know what you think about using embryo selection to wipe out CF before it becomes a possibility.
  • I've got Marfan syndrome. I am really eager to have my genome sampled so that this condition is better understood.

    www.marfan.org
  • Not 1 in 2000 (Score:4, Informative)

    by RML (135014) on Wednesday January 23 2008, @04:35AM (#22150678)

    The scientific goals of the 1000 Genomes Project are to produce a catalog of variants that are present at 1 percent or greater frequency in the human population across most of the genome, and down to 0.5 percent or lower within genes.
    A frequency of 0.5% is 1 in 200, not 1 in 2000. That's much easier to find with a thousand genomes, especially since they're not trying to figure out what the variations do, just that they exist.
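
    To put a number on "much easier", here is a small Python sketch. It assumes each of the 1000 people contributes two independently sampled chromosomes and that any copy present in the sample is actually detected, which real low-coverage sequencing would not guarantee. The function names are illustrative only.

```python
def expected_copies(allele_freq: float, people: int, ploidy: int = 2) -> float:
    """Expected number of copies of a variant among all sampled chromosomes."""
    return allele_freq * people * ploidy


def prob_variant_sampled(allele_freq: float, people: int, ploidy: int = 2) -> float:
    """Chance that at least one sampled chromosome carries the variant."""
    chromosomes = people * ploidy
    return 1.0 - (1.0 - allele_freq) ** chromosomes


# 1% and 0.5% are the project's stated targets; 0.05% is the story's "1 in 2000".
for freq in (0.01, 0.005, 0.0005):
    copies = expected_copies(freq, 1000)
    chance = prob_variant_sampled(freq, 1000)
    print(f"frequency {freq:.2%}: about {copies:.0f} expected copies, "
          f"P(at least one) = {chance:.5f}")
```

    At 0.5% the sample is expected to hold around ten copies of the variant, so missing it entirely would be very unlikely under these assumptions.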
    • Does an individual's DNA structure change at all throughout one's lifetime?

      Not in the sense you probably mean: your DNA does not adapt or "change" during your lifetime. Some cells have some changes to their DNA, either by accident or on purpose, but that generally amounts to inactivating or removing genetic material that a specialized cell won't be needing anymore before its death.
    • Some individual cells' DNA may change (usually for the worse - that's how you get cancer), but those mutations are probably much less common than the errors generated by the DNA sequencing machines. Both sorts of errors are filtered out because each piece of the genome is sequenced several times for each individual, and a computer combines the results.
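
      The idea of sequencing each piece several times and letting a computer combine the results can be sketched as a per-position majority vote. This is a toy illustration, not the project's actual pipeline: it assumes the reads are already aligned and of equal length, and the function name and sample reads are invented.

```python
from collections import Counter


def consensus(reads: list[str]) -> str:
    """Return the most common base at each position across aligned reads."""
    if not reads:
        return ""
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("all reads must be aligned to the same length")
    # zip(*reads) walks the reads column by column, one position at a time.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))


# Five hypothetical reads of the same region; two contain a single-base error.
reads = ["GATTACA", "GATTACA", "GACTACA", "GATTACA", "GATTAGA"]
print(consensus(reads))
```

      Each isolated error is outvoted by the other reads at that position, which is the sense in which rare per-read errors get filtered out.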