Science

Celera Completes Human Genome. Sorta.

kovacsp was the first to write to us about the announcement from Celera that they had completed mapping of the human genome. Note: This is /not/ the be-all, end-all. They have finished *mapping* one person's genes. With Celera's approach, this means that they now need to begin assembling the information they've gathered. All in all, Celera plans to do the same process with four other people. The Human Genome Project, using a more traditional approach, is still a couple of years away, but the race is still pretty close.
  • The real concern is privacy. Sooner or later we'll start seeing huge gov't or corporate databases of people's DNA. (e.g., Military (or even DMV) requires DNA sample for potential later ID use (only under court orders, of course. Yah right. And SSN is "Not for ID purposes". Not anymore.)) Or DNA sample required on healthcare application. Maybe healthcare providers will decide to deny you coverage or charge you lots more because your DNA shows you to be at higher risk for [condition].

    Or new laws (like current blood test laws) will now check DNA and say you can't marry [SO] because your children will be at risk for [condition].

    Where does it end?

  • by Anonymous Coward
    Looks like that hell-spawned Clinton better make his move soon, the clock's ticking on his reign as "God?". Or is Al Gore, father of the internet, truly the antichrist? Isn't it sad what living under power lines can do to a person?
  • I am not a biologist...yet (bio undergrad though)

    The problem is that we ALL have recessive detrimental genes. It has been shown experimentally (see Dobzhansky's work with D. melanogaster) that it only takes a relatively small amount of inbreeding to significantly raise the chances of offspring inheriting lethal genotypes and having severe viability problems. That's why inbreeding is so avoided in much of the animal kingdom (well, that's not true either, as many species keep SOME level of inbreeding to ensure the inheritance of evolved "blocks" of genes that work well when inherited & expressed together, but for most mammals significant inbreeding is avoided).

    Sincerely,
    Kevin Christie
    kwchri@wm.edu
  • by Anonymous Coward
    Yep. Celera's approach is somewhat akin to the following:
    1) Take 5 copies of a HUGE book.
    2) Cut out random sections of 500 words each.
    3) Thoroughly mix the pieces.
    3b) Lose several of the pieces.
    4) Add mistakes to the pieces.
    5) Attempt to reassemble the book based on the overlap between the random pieces of 500 words each.

    Notice that the missing pieces and mistakes will make it quite difficult to reassemble the book, even though you have redundant pieces.

    What I described above is also a far cry from the billions of base pairs in the human DNA "book".

    The exact details are above, of course. Celera is slick, but they are just boasting at the moment.

    another good link:

    BBC report on Bill & Tony requesting the human genome be in the public domain:
    http://news2.thls.bbc.co.uk/hi/english/sci/tech/newsid%5F677000/677815.stm
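The cut-up-book analogy in the comment above can be tried out on a toy scale. What follows is only a minimal sketch, assuming exact (error-free) overlaps and a naive greedy merge; the function names `overlap` and `assemble` are made up for illustration, and nothing here is claimed to be Celera's actual assembler.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that is also a prefix of b.

    Returns 0 if the longest such match is shorter than min_len.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def assemble(fragments, min_len=3):
    """Greedily merge the pair of fragments with the largest overlap.

    Stops when one piece is left or when no remaining pair overlaps,
    which is what happens when pieces have been lost (step 3b).
    """
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates
    # Drop fragments that are fully contained in another fragment.
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while len(frags) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # gaps remain: nothing left to join
        merged = frags[best_i] + frags[best_j][best_n:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags


# All pieces present: the "book" comes back in one piece.
print(assemble(["HIJKLM", "ABCDEF", "KLMNOP", "DEFGHIJ"]))
# One piece lost: two disconnected stretches are left over.
print(assemble(["ABCDEF", "DEFGHIJ", "KLMNOP"]))
```

Repeated text and typos in the pieces (step 4) would defeat this exact-match sketch quickly, which is the point the comment makes about how hard the real problem is.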

  • Celera will provide data from its own databases, via tools it sells for the job.

    Other companies can then patent particular sequences as they apply to a particular discovery. For instance, Incyte could patent the knowledge that a given sequence causes colon cancer, or AIDS.

    Nobody can (yet) patent a sequence itself - nobody is even lobbying for this.

    Incyte is, OTOH, still going ahead with their own sequencing of the human genome, AFAIK.

  • Celera may be able to circumvent that clause temporarily.

    You (as so many others in this forum) misunderstand the business model of Celera - they are not there to patent stuff. They are not even there to "own" information, or hold on to it.

    Celera was created by my own company, PE Biosystems, in order to sequence the human genome (using our 3700 DNA Analyzer), and then sell the tools that people could access this data with.

    -tor

  • I'm responding to something marked "Funny", but actually there are real applications for DNA compression. For example, exons are less compressible than introns. This could have applications in gene finding. (Sequencing genomes just gives you a series of bases; gene finding programs help interpret the sequence by finding areas that encode proteins.)
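To make the compressibility idea concrete, here is a rough sketch using Python's standard `zlib` module. The two sequences are synthetic stand-ins made up for illustration (a random-looking string versus a highly repetitive one), not real exon or intron data, and `compression_ratio` is a hypothetical helper name.

```python
import random
import zlib


def compression_ratio(seq):
    """Compressed size divided by raw size; lower means more compressible."""
    raw = seq.encode("ascii")
    return len(zlib.compress(raw, 9)) / len(raw)


rng = random.Random(42)
# Stand-in for a low-redundancy region: bases drawn uniformly at random.
random_like = "".join(rng.choice("ACGT") for _ in range(10_000))
# Stand-in for a repeat-rich region: one short motif repeated many times.
repeat_rich = "ACGTTGCA" * 1_250

print("random-like:", compression_ratio(random_like))
print("repeat-rich:", compression_ratio(repeat_rich))
```

A gene finder built on this idea would presumably slide a window along the sequence and look for changes in the ratio; whether that works well on real genomes is not something this toy shows.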
  • According to a friend of mine at the Whitehead Institute at MIT, which is part of the HGP, Celera has sequenced the data from one person, not assembled it. To have all the data sequenced is the equivalent of having every character from the source files for the linux kernel, but not knowing what order they are in.

    HGP as of mid March had completely sequenced and assembled 2/3 of the genome, based on 6 individuals. Current amount complete is closer to 75%.

    This really trashes Celera's business plan, which as far as I know is to sell the sequenced data to companies like Amgen for them to assemble and profit from. At least 2/3 of the information is already available gratis, already assembled.

    Of course, this also makes Celera's work easier, since it can use the 2/3 of assembled data as a backbone against which to assemble the remaining third. But by the time they finish that, how much more will the HGP have published?

  • Wouldn't that cease and desist letter come from the Law Firm of Mephisto, Asmodeus, and Beelzebub instead? :^)
  • > Your dad could be Michael Jordan, but if all you do is sit at your computer and eat junk food you won't make the NBA

    No, but if you were a clone of Michael Jordan, it's unlikely you would be inclined to sit at your computer and eat junk food all day. 'course a butterfly flapping its wings in zimbabwe might cause a tree to fall on you sometime, leaving you in a wheelchair, unable to get in the NBA. pesky butterflies, always causing disasters ;)

  • Doubt it. Couples nowadays don't (usually) give birth to identical children, despite the chromosomes of the mom and dad remaining unchanged. Of course, there is always The Milkman theory of species diversification, but I tend to think that every sperm or egg contain unique genetic information even within a single individual (or however that should best be said, you know what I mean...)
  • 90 percent of our genes are similar,
    probably because we shared 90% of our evolutionary
    history (first 3.5 of 4 billion years).
    Most of these similarities are basic proteins
    all land animals share in their metabolism.

  • Celera is using five, while the NIH/DOE project
    is using ten. 995 out of a 1000 base pairs
    will be the same between humans. The fifteen
    humans will provide error-cross checking,
    plus the locations where humans vary.

  • The published genetic codes are at
    this government site (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Genome).
    Submission of data to this site is REQUIRED
    before a scientific journal will publish an
    article about your results.
    However, these may not appear on the web site
    until the day of publication (e.g. the fly genome in March 24, 2000 Science).
    So I wouldn't look for the Full Human until
    sometime in year 2001 here. (The shortest chromosome #22 is all there already.)

  • Human relatives differ by 0.1% or less.
    Humans differ by 0.5% among themselves.
    Chimps differ from humans 2.0%.
    Fruit flies differ by 10-15%.

    Most of the human cancer gene defects have been
    found in fruit flies according to March 24 Science article.
  • In the Fly sequencing description, they split
    each chromosome into three different size pieces:
    about 2, 10 and 100 kilobases.
    Then they do mutual comparisons of ends to find
    overlaps, sort of like a "super grep".
    The pairwise comparisons run into the trillions,
    hence supercomputing.
    Matching "junk DNA" pieces is difficult,
    because junk DNA tends to be very repetitive.
    They've been getting a 98% match rate.
    These assembly tricks are described in detail
    in the March 24, 2000 issue of Science.
    The human genome is 15 times larger than the fly genome.
    This method has been tested on several smaller
    organisms with surprising success.
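A quick back-of-the-envelope check on "into the trillions": comparing every fragment against every other one takes n(n-1)/2 comparisons. The fragment count below (3 million) is an assumed round number for illustration only; the comment does not say how many pieces there were.

```python
def pair_count(n_fragments):
    """Number of unordered pairs among n fragments: n * (n - 1) / 2."""
    return n_fragments * (n_fragments - 1) // 2


ASSUMED_FRAGMENTS = 3_000_000  # hypothetical figure, not from the comment
print(f"{pair_count(ASSUMED_FRAGMENTS):,} pairwise comparisons")
```

With that assumed count the total is about 4.5 trillion, and it grows with the square of the number of pieces, so a genome 15 times larger would need roughly 225 times as many comparisons if done naively.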

  • But weren't as resourceful as the private sector.
    They were planning to take 15 years and $3 billion.
    Mr. Venter figured out how to do it in three years
    for 1/10th the price.

  • I read somewhere last year about President Clinton
    donating DNA ....

    :-) :-) :-)

  • by zCyl ( 14362 )
    This story has been up for a few hours, and nobody has said anything stupid about processing the genome on distributed.net yet. :)
  • This brings up the whole issue of patenting and the Intellectual property ownership of part/all of the human genome.

    Patents are for inventions, not discoveries. If you patent something that you didn't really invent (e.g. a genome, even the one that you happen to use) and try to enforce it, you should lose.

    Hmm.. but can a pair of parents copyright their child's genome? It's most likely a unique expression, and an argument could be made that they did really create it, even if they had no conscious control over the details. Hm.

    Ah, but it's not a completely new expression; it's a derivative work of the genomes created by their parents, which is a derivative work of...

    Contacting all the original owners of the work (most of whom are not even human or alive) in order to purchase all the contributors' rights would be extremely difficult. You can't just assume that you have inherited the rights from the dead ones either, since they have other descendants besides you. Orthodox evolutionary theory states that all life forms on Earth are descended from a single living organism (which is probably dead by now), so they all would have a claim to the inheritance.

    Every living being on the planet is a part owner of every genome on the planet. That's about as close to public domain as you can get. So no, I'm not worried about anyone patenting or copyrighting my genome. Any living thing's genome is a derivative work of something that everyone owns the rights to.


    ---
  • >5 other people have yet to be sequenced, not 4. I should know. I'm one of em.

    (chuckle)

    Bowie J. Poag
    Project Founder, PROPAGANDA For Linux (http://metalab.unc.edu/propaganda [unc.edu])
  • The 500 MHz G4 is capable of sustaining One Billion Floating-point Operations per Second - one gigaflops. Its peak performance is claimed to be around 8 gigaflops. A "megaflop", it stands to reason, is either a venture which failed miserably, or just a guy with a big, limp penis.

    The point is: if you're going to try to use facts to support half-baked analyses, at least get the facts straight.

  • Yo Chris - thanks for the tag. I always feel that
    the signal to noise discussions on slashdot
    are pretty skewed. Who knows how this is all going to
    pan out.

    I have to admit I think we have done pretty
    well with the latest bioperl. Kudos for you
    as well chris...
  • That's what Celera said this morning. No month but "by the end of this year."
  • I was at the hearing at the House of Reps this morning on this topic and Dr. Venter of Celera said they use 64 bit alphas. No specifics though.
  • Ewan Birney will probably chime in shortly :)

    Ewan is heading up just such an open source project you mention. Check out www.ensembl.org [ensembl.org].

    In a more general way we are also working on tools over at bio.perl.org [perl.org]

  • Very Well written Sean !!!
    See you in Helsingör
  • And on a slightly different note, if the offspring were a Boy, what are the chances that he will be a clone of you? :)


  • So, just how much patenting can Celera really do with this shotgun data? Especially if the public folks are looking at their half of the shotgun data, making inferences about function, and putting it in the public domain in parallel?

    The big worry seems to be that they could be a tollkeeper for a large fraction of all genomic research via this patent route -- but it's unclear from the press reports just how realistic this fear is ...

  • uh ... it's April 2000, aren't you supposed to be in a bomb shelter somewhere?
  • I recently was at a presentation by a human genome project researcher (doing C14 work) given to us geek types so we would better appreciate the gigabytes of info they're accumulating on the servers at alarming rates.

    Anyhow, the scientist indicated that genome variation between individuals is something like 1 pair in 1000; e.g. down at the genomic level we really aren't that different.

    But what a difference that 1 in 1000 makes on the final product!

    Also, at least here [washington.edu] (where I work), the genes are donated anonymously, so there's no way (yet) to attach a particular sample to a particular person or ethnic group.

  • On one level, I figure it makes sense that a private company would be able to beat some government initiative, no matter how well staffed and funded.

    On another level, I'm worried too, and I can see them trying to patent certain things because they got there first, rather than (I don't have a major problem with) them patenting their techniques for acquiring this intel.

    Should there be a 'GPLing' of the genome? If so, just how could you?

    -pbk
  • I get it.

    It's one of those situations where waiting to start something can make the result come sooner. Of course, the success of the late starter (with the newer technology, like Celera) is predicated on the innovations of the first mover (like HGP.)

    In this scenario, you can't have a Celera without an HGP.

    Question: Are they working on these 5 people in parallel? Wouldn't that make sense?

    -pbk
  • I think that the Human Genome is a prior work. If anyone holds a patent, it would be God, but she forgot to file it.

  • Well, they do!


    FunOne
  • Given the recent appeals court ruling on programming code, it only seems reasonable.

    I know that Celera is patenting the genes it finds by the thousands, but won't I be able to use those genes as a way of expressing myself?

    For example, I could use them to present a specific eye or hair-color, or a specific pattern of baldness. Would my rights to my own genetic code be protected now?

  • Actually the first verse and chorus of Clone of My Own [gills.net] were originally written for FSF [sfsite.com] by Randall Garrett [sfsite.com]. Isaac Asimov, Robert Silverberg, and several others did contribute a number of verses; I don't think RAH was one of these others, but I could be mistaken.

    Rev Neh
  • What if Newton had patented gravity?


    Grades, Social Life, Sleep....Pick Two.

  • I'm not a biologist, but aren't everybody's genes different?

    To ensure that the map is correct, won't they need to look at more than 4 other people?
  • The government classified anything that could perform a megaflop as a supercomputer. With the AltiVec engine, the G4 was capable of a megaflop, and thus classified a supercomputer. Apple just took advantage of an aging definition of what a supercomputer is and made the ads.
  • The individual sperms do not have to contain different chromosomes. The differentiation occurs in the way in which the 23 chromosomes from the dad combine with the 23 chromosomes of the mom. The number of combinations, given the number of genes is horrendous which is why the chances of somebody giving birth to identical kids is negligible unless they are monozygotic twins. In which case they have the same genome. Farhat
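For a sense of how "horrendous" the number of combinations is, here is the standard textbook count, sketched under the assumption that each parent passes on one of the two copies of each of the 23 chromosome pairs independently. It ignores crossing-over, which pushes the real number far higher.

```python
CHROMOSOME_PAIRS = 23

# Each sperm or egg carries one of two copies of every chromosome pair.
gametes_per_parent = 2 ** CHROMOSOME_PAIRS

# A child combines one gamete from each parent.
combinations_per_couple = gametes_per_parent ** 2

print(f"{gametes_per_parent:,} possible gametes per parent")
print(f"{combinations_per_couple:,} possible combinations per couple")
```

That is about 8.4 million per parent and about 70 trillion per couple before recombination is even considered.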
  • The fact that these two organizations are "neck and neck" should be a giant flag waved in the face of any patent examiners looking at gene patents. Part of getting a patent is that the new "invention" should be non-obvious to a practitioner knowledgeable in the field. It seems that what Celera is doing is rather obvious to a practitioner in the field, and therefore not patentable. Patents aren't a gold rush. It's not supposed to be about who gets there first; it's supposed to be about innovation. If I manage to count the five thousand fenceposts surrounding a farm, it doesn't mean that I should get a patent on fenceposts. Anybody can count fenceposts.
  • Heinlein remains one of the sickest bastards in american literature. -D
  • All of Celera's research will be at an ENORMOUS cost to the company. Should they make all the info free?

    Craig Venter has promised this (on June 17, 1998 to the U.S. Congress [house.gov]):

    A fact that has often been overlooked or questioned in the press accounts of this venture is that an essential feature of the new company's business plan is to provide public availability of the sequence data.
    [snip]
    It is our plan to release data into the public domain at least every 3 months including the complete human genome sequence at the end of the project.

    Celera has never made a release of its human sequence data, so they must have started their work less than three months ago. The media [bbc.co.uk] are obviously mistaken when they report that Celera started sequencing in September last year.

    BTW the human genome project got a bit of support yesterday from the U.S. Senate [loc.gov],

  • Bioethics are really going to have to race to keep up with research like this.

    So, say you could change something cosmetic about yourself genetically for a reasonable price. For example, what if a virus were available that triggered a whole-body genetic mutation, and the end result was a change in your genetic hair color?

    Assuming you wanted that hair color, would you do it? Would the use of genetics for such things be unethical? How sacred is that individual genome that each one of us possesses?

    What are the ethical uses of genetic information and modification? To cure disease? To select offspring attributes? There are LOTS of interesting questions that are going to be coming up in the next decade or two.

  • Get over it.
    I'm worried a WHOLE lot less that a private company has this than I would be if my Government had it!
  • Comment removed based on user account deletion
  • Comment removed based on user account deletion
  • > Nobody can (yet) patent a sequence itself - nobody is even lobbying for this.

    Actually, companies like Incyte and Human Genome Sciences have been submitting patents for years on what amounts to little more than raw or nearly-raw mRNA sequences. Some of them have even been awarded. F'r instance, Human Genome Sciences patented the mRNA sequence of a human gene that they /suspected/ of being a cell-surface receptor based on little more than some computer models... heck, they didn't even get the sequence 100% correct. Several years later, groups working on HIV discovered this gene completely independently and named it CCR5... only after they figured that it is a key coreceptor involved in the entry of the HIV virus into a T-cell.

    Who do /you/ think the patent should be awarded to, the researchers who figured out specifically what the gene does in the context of a T-cell and HIV infection, or the sequencer jocks who knew almost nothing about it & couldn't even be bothered to get their sequence right? Patents based on sequence data alone are obscene, especially given the relative ease with which any decently funded biotech company, using standard techniques, can study a gene of interest. I hope this one goes to court, and I hope Incyte gets sent home with its tail between its legs.
  • Sorry, that's a pile of crap -- one of the Big Lies that Celera is telling the public -- and I'll tell you why.

    If I told you Gateway was a more efficient PC manufacturer than IBM because I took their total budgets and divided by the number of PCs they sell, you'd tell me I was being simplistic and stupid; IBM has a number of other corporate focuses.

    The public genome project involves much more than raw human sequencing. It also funds technology development, physical mapping, genetic mapping, and model organism genome sequencing, amongst other things.

    Likewise, Celera is being disingenuous in comparing budgets and timelines. In actuality, we are all using the same basic strategy and the same equipment, so the rate and cost determining factors are identical.

    Celera intends to reduce their costs in two simple ways.

    First, half their data will be taken from the public domain. Automated scripts from our friends at Celera download data nightly from our anonymous FTP server, a source of great continuing amusement to us, considering the corporate press releases that boldly say that "Celera has never relied on any public resources".

    Second, they will not attempt to finish the genome to the high quality that we are aiming for, and it is that high-quality finishing stage that consumes expensive labor.

    The combination of these might reduce their costs to about $0.10/base, so they could get away at $300M for the genome, compared to our $1000M. There is no way they can get under 1/10 of our costs; they've already spent ~$200 million or so just in one year of salaries and capital costs, and I have no idea what their supply costs are (but our *major* expense here is supply costs). Their slightly greater speed comes at a substantially greater incremental cost. Don't bullshit people about what a small efficient company they are: they are a big-ass biotech company with about a $6 billion market cap.

    And if their business model excites you, and convinces you that Celera is so cool, hey, here's some insider info: this November I plan to start Sean Genomics, Inc., and I will sequence the human genome in 1 day for $0, by downloading the data by FTP from WashU and Sanger, and I'll start issuing my own press releases about how I'm a zillion times more efficient than the Human Genome Project. Watch for my IPO!
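The cost figures in the comment above can be checked with a few lines of arithmetic. This is just a sketch: the genome size of roughly 3 billion bases is an assumption (it is what the quoted $0.10/base and $300M imply), and the dollar amounts are the commenter's estimates, not audited numbers.

```python
GENOME_BASES = 3_000_000_000        # assumed: ~3 billion bases
COST_PER_BASE_CENTS = 10            # commenter's estimate: about $0.10/base
PUBLIC_COST_DOLLARS = 1_000_000_000 # commenter's figure: $1000M

# Work in whole cents to avoid floating-point rounding on the big product.
celera_cost_dollars = GENOME_BASES * COST_PER_BASE_CENTS // 100
ratio = celera_cost_dollars / PUBLIC_COST_DOLLARS

print(f"Estimated private cost: ${celera_cost_dollars:,}")
print(f"Fraction of public cost: {ratio:.2f}")
```

On these numbers the private effort comes to about three tenths of the public cost, which is the basis for the comment's objection to the "1/10th the price" claim.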

  • Please do this soon. And put it on a live webcam. Most of us want to watch.

    Just another dead troll in a Baggie(tm).

  • No - since we have 2 copies of each chromosome the child could get a gene from, e.g., the left copy in both parents. So the child would have a different chromosome pair as a result (actually the genes in the chromosomes are mixed too IIRC). And having two identical copies of a gene causes many of the problems of inbreeding if I understand correctly...
  • I remember browsing through Gutenberg.net [gutenberg.net] and coming across a partial sequence of the human genome (correct me if I got that wrong). Check it out! "Chromosome 01" clear up through "Chromosome Y number 24". Utterly useless to me, but still pretty cool! (Note: I think I recall it being just ascii text... so using 8 bits instead of 2.. doh!)

    Anyhoo. Just thought maybe people (particularly those ranting about us not having any info available to the public). Hopefully.
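On the "8 bits instead of 2" aside: with only four bases, each one fits in 2 bits, so four bases fit in a byte. Below is a minimal sketch of such packing. The names `pack` and `unpack` and the A=0, C=1, G=2, T=3 encoding are arbitrary choices for illustration, and the sketch assumes the sequence contains only those four letters (no N or other ambiguity codes).

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq):
    """Pack a string of A/C/G/T into bytes, four bases per byte.

    Returns (packed_bytes, base_count); the count is needed to strip
    the padding in the last byte when unpacking.
    """
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        out.append(byte)
    return bytes(out), len(seq)


def unpack(data, n):
    """Reverse of pack: expand bytes back into the first n bases."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:n])


packed, count = pack("GATTACA")
print(len("GATTACA"), "characters ->", len(packed), "bytes")
print(unpack(packed, count))
```

This gives a fixed 4x saving over one-character-per-byte ASCII before any real compression is applied.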

  • So, say you could change something cosmetic about yourself genetically for a reasonable price. For example, what if a virus were available that triggered a whole-body genetic mutation, and the end result was a change in your genetic hair color?

    So this raises an interesting question that I've wanted an answer to for some time. [side note: I am not a biologist, nor do I play one on the Net, so please excuse me if this is a dumb question].

    One of the much touted advantages of genetic engineering is the ability to cure genetic problems in living humans. This is distinct from altering the genetic code in cells that will go on to form a viable human foetus.

    So, say I have some genetic disease caused by an unfortunate sequence in my DNA. Assume we know what replacement sequence would cure this problem. On an engineering level, how would I go about making the change in every cell in my body? This is what I would have to do, right? Is this an area where nanotechnology and genetic engineering meet? Or could genetically-modified viruses really perform this task?

    I assume that now we are closing in on getting detailed genetic information about humans, people are starting to think about how gene therapy might be applied in practice. Does anyone have anything they can share with us on this subject?

    It is not a dumb question by any sense of the word.

    A genetic disease can have two major effects.

    • Over production of a protein that is harmful
    • Under production of a protein (or production of a flawed protein) that is needed (eg haemophilia, which is a defect in blood factor VIII or factor IX)
      • The first is very hard to treat as you do have to hit every cell in the body.
      • The second is much easier and fortunately much more the common scenario. You do not need to correct every cell, just persuade enough cells to produce the right protein (usually by putting in some extra DNA that has the correct coding sequence) that the balance is restored.

        Interestingly the same technique can be used to immunise people. Instead of injecting protein, inject DNA that then produces the protein where it can raise a bigger immune response.

        Does it work?

        Unfortunately the technology is not trivial and the understanding of how this works is far from complete, which is a scientist's way of saying yes and no.

        There has been limited success in some trials though humans don't seem to respond as well as the animal models.

        I should mention that I am somewhat of an outsider in this field, I am not as up to date as once I was so a more up to date biologist can probably give you a better answer.

        ..d

  • Ewan is one of the outstanding people in the field

    I was thinking more along the lines of EMBOSS [sanger.ac.uk] as well. I have been hacking a few bits on that and trying to get things going. Generally people like it a lot despite the command line.

    Bioperl is cool. I put together a database indexing and retrieval script for my DBs with a non standard header. Works a charm (after tidying up a few wrinkles in the way my hack worked.)

    We are looking to get ENSEMBL up and running shortly. It looks exciting and we want to build on it for our own purposes too.

    Academic genomics has certainly produced a lot of good open source tools. ENSEMBL, EMBOSS, Bioperl, Biojava, Sean's HMMER and so on. Maybe we should start a new site? adopt_a_genome_scientist.org for those that can write good code but don't know what to write to meet those that know what they want doing but can't write code.

    ..d

  • The way to succeed in biological sciences is to study what the media is interested in and not necessarily what's practical. Gene sequencing today has the media attention yet the foundation of gene sequencing, bioinformatics and protein modeling is impossible to get grants in. Since the DNA sequence is useless unless you can process the data we see how important media coverage is in the world of biology even though none of the data can be used for anything.
  • Inbreeding is only an issue if there are bad recessive genes. Heinlein himself addressed this at the end of _Time Enough for Love_, where Lazarus Long was finally jumped by his twin female clones.

    Now THAT is what i call a rich fantasy life. :}

    __
    (oO)
    /||\
  • According to this week's Time Magazine, the Human Genome Project is now estimating that they will finish in November of this year, unless I grossly misunderstood the article.
  • > It seems counterintuitive, but it's not.

    > Wake up and smell the coffee.

    I think your tin foil cap is slipping. Better adjust it - FEMA is trying to take control of your mind so that they can use your genes to create a race of superhuman zombie soldiers.

    It seems unlikely, but it isn't.

  • However, Venter also defended the company's plan to patent up to 500 genes.

    This is what the outcry is over. It's also why their (and many other 'biotech' firms') stock valuation is soaring through the roof.

    After the HGP had run for a few years a group splintered off to pursue it purely for profit. That's Celera. They've promised to allow access to the information for researchers, but have never detailed to what extent. They obviously aren't going to allow outside research to be done with any genes they claim a patent to, and that's been the sticking point in most of the past cooperation talks.

    Also remember that the announcement made by Clinton & Blair a few months ago guaranteeing the freedom of the genome only applied to the HGP. They would have had to have completed the first map in order for it to mean anything. Celera's rather amazing accomplishment now will mean a real big headache for everyone who's not one of their investors.

    A possible scenario is that you fully develop a way to clone yourself, but can't because certain genes giving you immunity to some diseases are protected by a patent. It's really horrifying. Medicine is about to get as nasty as computer software. Everyone is going to sue everyone over everything, all trying to get any slight advantage they can. And people will be dying so stock prices can rise a few tenths of a point...

  • The most interesting article in the March 24, 2000 Science issue that described the recent
    fly sequencing is comparative protein complexity.
    Once you have the genome, you can start deducing better the mix of proteins in organisms. Proteins do most of work of life and are harder to analyze than DNA. Only a few percent in humans are understood.

    These three organisms: worm, fly and yeast
    were the first three complex organisms to be
    fully sequenced. (Mouse, human, dog, corn, rice, and tobacco are in the works.)
    It turns out that the worm is slightly more complex
    than the fly, and both are about twice as complex
    as yeast. It is expected humans will come in
    about twice as complex as a worm. We'll know in
    a few months.

    Protein complexity is not necessarily the same
    thing as organism complexity.
    All organisms on earth have been evolving for
    four billion years, so have the same chance
    at complexity.
    Genetic mechanisms for managing complexity have
    been evolving too.

    It may be humbling to find that from a genetic
    measure, humans are simpler than some other
    plants and animals.


  • "And today's big story...

    "New Virus Alert - and this time it's not computers.

    "After a legal misunderstanding over a copyright notice, Celera Inc. released a virus that destroys all copies of the human genome.

    [Shot of embarrassed-looking laboratory tech]
    Well, we've been pretty busy lately, what
    with being right on the verge of being the first to sequence the human genome.
    [Shot of blackboard with word GATTACA on it]
    It's a lot shorter than I thought it would
    be -must be all the repetitive sequences-
    but we double checked it with our new
    supercomputer, and management says it's
    the most powerful private computer in the world
    [Shot of Amiga 4000]
    We were so busy that I haven't even had
    time to read Slashdot in about a year.
    [turns to colleague]
    Hey, what's this about 'grits'? Is it some
    new distribution or something. Sounds Hot
    [Colleague]
    yes... hot... grits... (falls over dead)

    [Reporter looking worried]
    [Cut to Anchor]

    We interviewed Celera's last survivors,
    a management team that decided to celebrate
    their victory by going to Disney world

    [Cut to man in mickey hat]

    Well, the Chief scientist called and said
    we were infringing on some copyright or
    something, and that we had to destroy all
    our human DNA.

    So I start getting suspicious, and ask him
    if he had any hackers or geeks in his lab.
    And the man outright admits it! No spin, no
    apologies, nothing.

    So I knew we were in bad shape -- I mean
    geeks and hackers, you KNOW they have to
    be in the wrong.

    I told him to destroy it all. All human DNA
    every speck in the entire building. I told
    him if I found even a single base pair when
    I got back, he could kiss his stock options
    goodbye.

    He acted like I was crazy. He refused to
    do it. Fortunately, with the advanced
    voice capability of our new computer...
    [shot of Amiga 4000, now labelled "WOPR"]
    the Work Oriented Peptide Resequencer,
    I was able to give the command directly.
    (looks at reporter conspiratorially) You
    know, I think our chief Scientist was one
    of *them*, you know, I mean a hacker geek.

    [Cut to anchor]
    The military says the first containment
    cordon at fifteen miles was breached, but
    they think they can contain the virus at a
    radius of 25 miles if they launch a nuclear
    strike to eliminate all wildlife in the
    zone. Animals, it seems, are unaffected by
    the virus, but provide a vector to cross
    barricades.
    [soldier: I tot I taw a puddy tat! (opens fire)]

    __________

  • 1) It depends on the condition...

    For example, for diabetes, it might be enough to get the gene (under a proper control sequence)
    into a fraction of your pancreatic beta cells.

    For other diseases, like phenylketonuria (the inability to process certain amino acids like phenylalanine), it might be enough to get the enzyme gene (suitably activated) into a relatively small number of cells, anywhere in the body. Here the goal is just to break down enough of the a.a. by a harmless pathway to keep the toxicity down.

    Sometimes, you don't even need to change the cells in the body. For example, a permeable container of genetically engineered cells implanted in the body would work for some diseases.

    2) It isn't going to be easy. A test subject for a genetic modification died last month of an unexplained liver failure after being exposed to a usually harmless virus loaded with a human gene. The other test subjects were fine. No one knows why.

    __________

  • I am not a molecular biologist now, but I was.

    You have the intron argument exactly backwards. When you read those statistics about 98% similarity, it includes the total genome (introns, exons, non-coding 'junk DNA', telomeric tails and other repeating sequences).

    How could we compile a 98% index of similarity in introns? We don't even know a full 98% of the human genes yet, much less their introns? Much less the monkey genes/introns to compare them with?

    Even after we have the genome sequenced, it will be many years before we find all the sequences that act as genes, much less the methods of their processing and expression (like introns).

    These much-bandied numbers came (years ago) from random sampling techniques, and the sequences of (then) known genes. Predictably, we sequence the important and easily located genes first. These numbers are inaccurate, and should be shot on sight, because, as I will explain, there will never be a single accurate meaningful percentage number for "how much like the chimps are we." Never.

    Important proteins are usually more highly conserved (don't change much) because changing them is often life-threatening. Most changes adversely impact the organism. [Histones, for example, are so highly conserved that they differ by only a few base pairs between man, cow, and pea. Such conservation is rare, however.]

    99.9999999 the same? Give me a break. If that were true, human individuals would only vary by four base pairs on average. Watch your numbers, willya?

    In fact (for reasons I will cite below), any two random cells in your own body are probably not 99.99999999% identical.

    So how much is the difference between humans?

    There are roughly 10,000 genes in the human genome (an estimate widely used in the field). Since I can name, off the top of my head, a few dozen common variable alleles (e.g. AB blood type, minor blood types, eye color, etc) I'd be very surprised if there weren't hundreds of lesser-known common variable alleles (100/10,000 = 1%). So I doubt most humans are 98% identical on an ALLELE level, and 95% may be pushing it. (ALLELES are 'different gene forms' like blue vs brown eyes, or Rh+ vs Rh-.)

    But you're talking on a BASE PAIR level, and that's purely a philosophical question, not a matter of strict numbers as you suggest. If you drop a single base pair, all the subsequent amino acids will be TOTALLY different (this is called a frame-shift mutation, and in fact the gene will usually become nonfunctional because an accidental 'stop codon' [3 of the 64 codons are stop codons] will likely be created within a short distance of the change).

    One could argue that this is a one base pair change in the gene, but it wipes the gene out entirely.
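
    For the coders reading along, here is a minimal Python sketch of the frame-shift point. The codon table is the standard genetic code; the toy gene and the helper names (translate, drop_base) are made up purely for illustration and do not correspond to any real human gene.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering ('*' = stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Read codons from position 0 and stop at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def drop_base(dna: str, index: int) -> str:
    """Delete a single base, shifting the reading frame of everything after it."""
    return dna[:index] + dna[index + 1:]


# Hypothetical toy gene: ATG GCC ATT GTA ATG GGC CGC TGA
GENE = "ATGGCCATTGTAATGGGCCGCTGA"

print(translate(GENE))                # MAIVMGR
print(translate(drop_base(GENE, 3)))  # MPL (a premature TAA stop appears)
```

    One deleted base out of 24 changes every downstream amino acid and truncates the product, so "percent of base pairs changed" says very little about how much function was lost.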

    Another type of mutation is "conversion" where an A becomes a C, etc. You almost certainly carry thousands of base pair conversions compared to your ancestors, but they have little or no effect on your genes, their products, or the effectiveness of the proteins' functions.

    And how do you count transversion? If a big chunk of a monkey liver enzyme gene is now used in a human brain gene? Is that a match or not? Or if the entire monkey enzyme is now never used in the human liver, but only in the human brain, is that a match? Or what if an enzyme splits into two forms that are used in different tissues and are very similar, and perhaps sometimes even combined (e.g. creatine kinase)? What's the frequency, Kenneth?

    Therefore, counting random base pair homology (similarity) is an irrelevant exercise in today's science. If we need to count (and what would we do with that info, except supply ignorant science writers with sound bites?) we need to specify the proper comparative index: functional allele differences, marker mutations for genealogy, population divergences, identifying founder effect gene fixations, etc.

    In fact, even counting alleles is a matter of philosophy. Is 'redhead' really a different gene from blonde or brunette if the base pairs turn out to be 99% the same (they aren't)? On the other hand, your immune system may run on HLA27, while your brother's runs on HLA8 -- entirely different genes serving the same function, and it won't matter unless one of you needs a transplant (may the Gods of Immunology forgive me for that oversimplification!).

    Basically, an 'allele' (different gene form) is whatever we say it is, whatever is important for the specific question we are investigating.

    The 95% (98%, 99.999%) number is useless and will always be useless except to hack science writers -- though the underlying principle of the commonality of genomes is useful. I've come to believe that the *number* is downright harmful to readers of hack science writers.

    Suffice it to say that the human DNA copying mechanism has an error rate of roughly one per billion base pairs, and the human genome is roughly 3 billion base pairs. Every time a human cell divides, the daughter cells probably are a few base pairs different. The cells in your body now are typically dozens of generations away from your embryonic state and are not exactly identical -- but their divergent mutations are probably less than 1 part per million.
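
    The arithmetic behind that last paragraph can be sketched in a few lines of Python. The two constants are the round figures quoted above; the model itself is an assumption made for illustration (two cell lineages that split at the embryo and pick up copying errors independently, counted against a single 3-billion-base genome), not a measured result.

```python
GENOME_BP = 3_000_000_000  # "roughly 3 billion base pairs" (figure from the comment)
ERROR_RATE = 1e-9          # "one per billion base pairs" per copy (figure from the comment)


def errors_per_division(genome_bp=GENOME_BP, error_rate=ERROR_RATE):
    """Expected number of copying errors each time the genome is duplicated."""
    return genome_bp * error_rate


def divergence_fraction(generations, genome_bp=GENOME_BP, error_rate=ERROR_RATE):
    """Expected fraction of bases that differ between two cells, each
    `generations` divisions away from a common ancestor cell (assumed model)."""
    errors_per_lineage = generations * errors_per_division(genome_bp, error_rate)
    differing_bases = 2 * errors_per_lineage  # both lineages drift independently
    return differing_bases / genome_bp


print(errors_per_division())  # about 3 errors per division
for generations in (12, 24, 48):
    ppm = divergence_fraction(generations) * 1e6
    print(generations, "generations:", ppm, "parts per million")
```

    Under these assumptions, even 48 generations leaves the two cells differing by roughly a tenth of a part per million, which fits the "less than 1 part per million" estimate.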

    Even most genes you'd die without ('important' genes) only have a few critical regions, and can mutate to varying degrees in the rest of the gene. Think of it this way: binding sites may have to be very precisely conserved, but the 'bricks' that hold them the right distance apart, and at the right orientation aren't so important.

    In fact there are entire families that are hypervariable: immunoglobulin genes (antibodies) are different, even between identical twins -- so are olfactory receptor genes (though there may be some fixed 'common' olfactory receptors).

    A lot of the confusion arises when people learn that (for example) a single tiny change in the B chain for hemoglobin can cause sickle cell. But that change alters the geometry at a 'corner' of the protein that throws the entire protein off.

    [It has been suggested that sickle cell hemoglobin is so widespread because it protects against malaria, and therefore served a valuable function. Malaria has been one of the biggest killers of humans since pre-history, possibly *the* biggest.]

    'Variability of expression' is not the primary reason for the large differences. Subtle differences in genes (and the interactions between their products) can produce significant effects.

    __________

  • Check out this Globe and Mail story [globeandmail.com] about Sick Children's Hospital in Toronto sorting the map - with a supercomputer. I've seen pictures - SGI Origins [sgi.com] all over the place. Cool hardware - now let's hope they "Do no harm" with any knowledge they gain.

  • Celera is using an obscene amount of AlphaServers from Compaq. They are doing large scale clustering as well as compute farms and "farms of farms". Lots of their process involves big compute but the biggies seem to be the actual assembly process plus the analysis of the finished data.

    I did hear at one point that they were going to gang together 400 or so 4-processor AlphaServer ES40s specifically to handle either the assembly or analysis portion. The 400 servers x 4 600 MHz EV6 Alpha chips would give you about 1,600 CPUs in the cluster... the final version of this system is what they claim will be the 2nd fastest civilian-owned supercomputer on earth.

    I don't work for Celera so mistakes made above are my own. I'm just a bioinformatics hardware geek and a big supporter of Alpha-for-life-science-research type projects. From an infrastructure geek's perspective what Celera is doing is just amazing...

  • The primary reason for the success of Celera's mapping effort was the simply incredible technological advancements in the DNA Sequencing hardware and procedures that made the process faster and cheaper than anyone thought was possible.

    10 years ago the state of the art was pretty poor. The HGP estimates were based on that technology.

    Celera's relationship with PE allowed them to get their hands on tons of the new 3700 series DNA sequencers. Without them Celera's effort would have been impossible.

    So -- Venter does deserve some credit -- he was smart enough to realize that the revolution in sequencing (plus a cozy relationship with PE) had changed things enough to make a large-scale private effort possible.

    just my $.02

  • Let's say Celera does finish the mapping before the Genome project can. Celera then sells data from the mapping to researchers. What happens when the Genome project completes and gives the information away? Can Celera call it theft of intellectual property?
  • ... what's the "largest private supercomputer" Celera claims to have used? (quote is from the Wired article). Anybody got any info?

    engineers never lie; we just approximate the truth.
  • Since they're the first to do it, how can anyone be sure of the veracity of their claims?

    Should I RTFM? Is there a FAQ on this somewhere?

    -pbk
  • Well, now they think they can produce a carbon copy of a human being at will -- or worse yet, a modified copy which has exactly the characteristics they want: Obedience to secular authorities, low IQ, lack of imagination, lack of faith, etc.

    This Black Helicopter moment has been brought to you by Genetic Engineering. At GE, we bring good things To Life.
  • I've always wondered exactly whose genome is being sequenced .... I always assumed there was one guy or gal somewhere that had been chosen and they were doing them first - I guess they do a few more later to compare to get an idea about intra species variation .... then a few chimps and gorillas to help figure out what makes us human.....

    But somewhere out there there's a person who's about to become the benchmark human that we're all going to be measured against .....

  • Nobody can "own" data. Only derived knowledge.

    The problem is, Celera and its friends are going to be the only companies with the full genome available for several months. Patent law says you can't patent something obvious to someone experienced in the field, but while the genome isn't widely available, Celera may be able to circumvent that clause temporarily.

    What I'm sure many people are wondering right now is: once the HGP completes its sequence, will these patents on medical knowledge derived from Celera's work be revoked on the grounds that the method has become obvious through independent public research?

    --

  • This brings up the whole issue of patenting and the intellectual property ownership of part/all of the human genome. It is my understanding that parts of the human genome have been patented already. Does this mean we no longer own ourselves?

    If I were to marry and have a child, would I be violating their patent by using the patented parts of the human genome? OK, so you could argue that a natural process can't be patented and the patents are for secondary uses, but what about medical cloning? If experiments continue into therapeutic cloning for the production of stem cells and potentially organs, would this non-natural process not be a patent violation?

    If by chance God does turn up for his thousand-year reign could he not claim some kind of prior art? Ok so that's probably not an issue the patent courts are going to have to deal with in the near future but the fact that patents can be granted on existing genetic material does raise some interesting questions, by granting patent in this manner the patent office is dismissing the idea that the genetic makeup of us/animals/plants may have been designed and is thereby encroaching on an area of belief held dear to a lot of people.
  • Celera's business plan, as I understand it, is to create their own map of the human genome. This map will be different from the Public project's map in that
    1. They are sequencing five people, different from the ones the public project is using (presumably).
    2. The actual sequencing was done in a different manner (shotgun sequencing, which sequences millions of fragments of the genome and uses computing power to look for overlapping sequences to order the fragments).
    3. The form that their map takes will be different, and maybe proprietary. I don't know what this form will look like, since I haven't seen how they've presented the (freely released) Drosophila genome, but I know that that came as a CD-ROM. Perhaps their map will have certain programs, links to "pages" that tell what currently known loci (genes) do, etc.

      Basically they are betting that at least some scientists will pay for their map because of the way in which it is rendered: maybe it will be easier to use, look prettier, run sequence search algorithms faster, or something similar. But two independent copies of the genomic map (done in two scientifically proven methods) can only be better than one, so many scientists will probably end up using both maps for some percentage of their work/research/experimentation.

      You will just have to pay for the bells and whistles (and speed of release) of Celera's map.
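
      To make the "look for overlapping sequences to order the fragments" idea in point 2 concrete, here is a toy greedy overlap merge in Python. This is only a sketch of the general idea and is not Celera's assembler; the function names, the min_len cutoff, and the example reads are all invented for illustration, and it assumes error-free reads from one strand.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`
    (0 if no overlap of at least `min_len` bases exists)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments, min_len: int = 3):
    """Repeatedly merge the pair of fragments with the largest overlap.
    Returns the remaining contigs (one contig if everything joined up)."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # nothing overlaps any more: these are the gaps
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


reads = ["GATTACAG", "ACAGGCTT", "GCTTAACG"]  # hypothetical reads
print(greedy_assemble(reads))  # ['GATTACAGGCTTAACG']
```

      Real genomes are full of repeats, so an overlap can match in more than one place, and a naive greedy merge like this one will happily join the wrong pieces. That is a large part of why the assembly step needs so much computing power.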

  • Comment removed based on user account deletion
  • Celera mapped the genetic structure of the fruit fly [exosci.com] recently. According to an article at CNN [cnn.com], they claim that they will have the sequenced genes of their human subject assembled in three to four weeks.
  • Nice to see a coherent account from an expert in the field.

    Last time I checked the human 'rough draft' from the public project had about 80% of the sequence complete in draft form and in the public domain. The Celera project has nothing in the public domain except a few press releases.

    I update my databases every night from the HGP. It is doubling in data volume approximately every 7 months and the doubling time is getting shorter.

    Moore's law, eat your heart out.

    The HGP is providing us with data faster than we can analyse it, and really opening up a whole new level of understanding of how things work. One of my colleagues complained to me after I had given a seminar on Genome analysis that his lab's old laborious techniques of analysing family pedigrees and carefully selecting regions to look for genes were being blown apart by the public sequencing projects.

    We are entering a new era of biology, one in which a biologist will need to be as handy with a keyboard as with a pipette. If you want to be a successful molecular biologist you will either need to be very, very good or have good data analysis skills.

    Enough of a whinge. Any open source programmers out there fancy getting involved in writing code to help with the human genome analysis? Plenty of odd tasks to go round.

    Dr. David Martin European Molecular Biology Network [embnet.org] node manager.

  • Celera has promised to release the entire sequence as soon as they are finished (much like they did with the fruit fly genome this month). They have always said that they will make the actual genome public. The press, for some reason, likes to ignore this.

    Celera is actually doing two things here:

    1. Getting the raw sequence of the human genome and marking off all the genes we already know, as well as some "best guess" genes that are similar to other organisms that have been sequenced. This will be available from them for free to everyone.

    2. They then plan to go after genes we don't really know. A little explanation:

    Genes are how the body stores the information to make proteins (which get made into enzymes, cell signalling molecules, whatever...). They also make other things, but I don't want to complicate this. Largely, it's proteins that scientists are interested in because they are the machinery through which the body works. Cancer, for example, is caused largely by proteins that misbehave and refuse to do their jobs.

    Just knowing the sequence of the human genome tells you little about the functions of the genes. The proteins made from those genes must be studied and characterized. This is where Celera's business model kicks in. They plan to identify and characterize as many proteins as possible. This is a non-trivial task, given that some molecular biologists spend their entire lives working on one protein. Celera plans to look at the protein-protein interactions as well as their locations within the cell to get an overview of what all the genes in the human body are actually doing. It's real "big picture" stuff, meant to serve as a starting point for future research. It is likely that many of these proteins will have value as targets for drugs, and I think Celera plans to patent these genes to make money. They will at the very least charge a subscription fee to look at all the protein data they have collected. I am fairly certain that other companies have already patented human genes... without the patents, there is not a whole lot to protect a drug from being stolen by competitors.

    All of Celera's research will be at an ENORMOUS cost to the company. Should they make all the info free? The bottom line is that realistically, you and I are not going to develop the cure for cancer because we ran a perl script on the human genome. It takes a pharmaceutical company with deep pockets to pay for all the FDA trials and get the drug ready for "prime time". Celera knows this, and they know these companies will shell out wads of cash to get info about as many proteins as possible. It is possible that university researchers will not have the money to pay for this information. But there is so much research to be done, the big pharmas will likely fund projects at universities to look into some of these genes more closely, so many of them will get what they want anyway.

  • I _believe_ that Celera recently stated that they would "share" their results. Exactly what this meant is unclear. See

    http://www.pecorporation.com/press/prccorp011000.html

    [excerpt] "Celera's mission is to become the definitive source of genomic and related agricultural and medical information. Celera's information will be available on a subscription basis to academic and commercial institutions who will have access to tools for viewing, browsing, analyzing, and integrating data in a way that will assist scientists in accelerating their understanding of the human genetic code."

    And then there's the courts and governments. Both the UK and the US govs. have indicated that they might not be too happy about a company attempting to patent the human genome. It certainly isn't too clear what it means to patent a gene sequence; that simpler issue is yet to be sorted out.

  • It should be remembered, however, that genes are not an exact blueprint that we will follow. They allow for the expression of a trait; they do not guarantee that it will be expressed. Your dad could be Michael Jordan, but if all you do is sit at your computer and eat junk food you won't make the NBA. Too often, it seems that genes are portrayed as concrete instructions on who we will be, sometimes leading them to be used as excuses for personality traits. It is the classic nature/nurture debate, I suppose. It is worth remembering, though, that genes, while not irrelevant, are not the sole determinant of one's self. I am in no way doubting the medical significance of this for curing certain genetic diseases, but I am wary of the way I see genes being portrayed by the general public in terms of their effect on who we are as people. ---Lane
  • by CaseyB ( 1105 ) on Thursday April 06, 2000 @09:46AM (#1147682)
    ...at 2 bits per base pair (4 possible cases), you could fit a complete human genome in under a gigabyte.

    I'm curious: how well does gene sequence information compress? Is this effectively random data, or are there patterns?

    I can see the disclaimers now:

    The GeneStor PeopleBackup(tm) device can now store the complete* contents of your genetic makeup!!

    (* Storage assumes 2x data compression. Results may vary.)

  • I think, rather, that the patents should be limited to a company's own database that they've created, not the original stock that they created the database from. I don't see this as being much different from copyrighting a dictionary; both are privately-compiled lists of public domain information. In one case, it's a human language, and in the other, it's the human DNA sequence, but the principle remains the same.

    In short, then, I believe that companies should have the right to sell any information that they've collected. They should not, however, have any ability to prevent anyone else from collecting the same information, much as Webster cannot prevent Oxford from cataloging essentially the same information.

  • by Spud Zeppelin ( 13403 ) on Thursday April 06, 2000 @07:25AM (#1147684)
    At least someone now has the technology to do offsite-backups of people... granted, there'll be a certain amount of data loss since the backup fileset was created (birth), but now, at least, there are the beginnings of real disaster recovery technology.

    Imagine: Our friends at Legato could license Celera's technology and produce "WetWorker" -- with the ability to put your genetic data on CD-ROM for easy transport to offsite storage. Then, when your friendly, egocentric ocean liner captain decides to go "All Ahead Full" on a foggy night in the North Atlantic AFTER receiving an iceberg warning, you can rest confident that your family can always recover you from archival backup.

    I'm aware that there are shortcomings (especially the part about "loss of all data accumulated since birth"), but after all, the centerpiece of any backup software isn't ease of recovery, it's ease of deployment. The data can always be reconstructed from "incrementals" (it pays to take good notes...).



    This is my opinion and my opinion only. Incidentally, IANAL.
  • by Ralph Wiggam ( 22354 ) on Thursday April 06, 2000 @07:44AM (#1147685) Homepage
    I don't think you could just "change" a Y chromosome to an X chromosome. I believe they're entirely different. Think about it, Y chromosomes give you the ability to drive well, while X chromosomes just give you the sense not to wear the same socks two days in a row.

    -B
  • by Robert Link ( 42853 ) on Thursday April 06, 2000 @07:53AM (#1147686) Homepage
    According to Top500 [top500.org], the fastest supercomputer that was not at a government installation was a Hitachi SR8000/128 at the University of Tokyo, which weighed in at number 5 overall. If you want to discount academe, the fastest owned by a business appears to be at Charles Schwab. It's a 2000 processor IBM SP PC604e, and it rates number 12 overall. So, either Celera got a very big machine since those statistics were compiled, or they are using a different standard of "bigness" than the LINPACK benchmarks used in the list, or they were playing a little fast and loose with the truth. I would tend to bet on option number 2, myself.


    -rpl

  • by Monte ( 48723 ) on Thursday April 06, 2000 @07:31AM (#1147687)
    Literacy is in short supply amongs most around here.

    Q.E.D.
  • by Phrogman ( 80473 ) on Thursday April 06, 2000 @07:19AM (#1147688)

    The human gene sequence is in the public domain and will remain there - anything else would be ludicrous (although I agree that when it comes to the law ludicrous seems to be perfectly acceptable. Witness patenting software algorithms). What I believe they will get the patent on is their process for deriving the gene sequences - which is perfectly acceptable. They will also have the rights to their database of human gene information, which they can license the access rights to. The Human Genome Project will be making its results publicly available, so it might become a matter of whose database provides the most ancillary information.

  • Unfortunately, the whole gene patent scandal is because of this point exactly... they *are* being awarded exclusive rights because they got there first. Bad, bad, bad, bad, bad.

    I have every right to do whatever I want with any gene I've sequenced myself, damnit! I shouldn't have to pay royalties to someone because they sequenced it first!

    The analogy is the Spanish and Portuguese "claims" to the Americas, not a translation of Vergil.
  • by gwernol ( 167574 ) on Thursday April 06, 2000 @08:07AM (#1147690)

    So, say you could change something cosmetic about yourself genetically for a reasonable price. For example, what if a virus were available that triggered a whole-body genetic mutation, and the end result was a change in your genetic hair color?

    So this raises an interesting question that I've wanted an answer to for some time. [side note: I am not a biologist, nor do I play one on the Net, so please excuse me if this is a dumb question].

    One of the much touted advantages of genetic engineering is the ability to cure genetic problems in living humans. This is distinct from altering the genetic code in cells that will go on to form a viable human foetus.

    So, say I have some genetic disease caused by an unfortunate sequence in my DNA. Assume we know what replacement sequence would cure this problem. On an engineering level, how would I go about making the change in every cell in my body? This is what I would have to do, right? Is this an area where nanotechnology and genetic engineering meet? Or could genetically-modified viri really perform this task?

    I assume that now we are closing in on getting detailed genetic information about humans, people are starting to think about how gene therapy might be applied in practice. Does anyone have anything they can share with us on this subject?

  • by VAXGeek ( 3443 ) on Thursday April 06, 2000 @06:56AM (#1147691) Homepage
    Great, now how much longer until a public beta release?
    ------------
    a funny comment: 1 karma
    an insightful comment: 1 karma
    a good old-fashioned flame: priceless
  • by raygundan ( 16760 ) on Thursday April 06, 2000 @07:01AM (#1147692) Homepage
    Is anyone else bothered by the fact that the first group to have a complete sequencing of the human genome is a private company? If anything ought to be in the public domain, all other arguments about software, music, etc... aside, it is the human genome. After all, everybody already has their very own. Celera deserves to reap the benefits of getting there first, but only until somebody else can get there as well. If another group finishes the sequencing, they have just as much right to use it as Celera. It's not like Celera has created an original work-- they've just finished reading through the genome first.

    I really hope that the HGP places this information in the public domain as soon as possible, and refrains from signing any exclusionary deals with Celera that would prevent this information from being free.
  • by zavyman ( 32136 ) on Thursday April 06, 2000 @07:12AM (#1147693)
    What would you think if you spent a great deal of time and money on sequencing the genes, only to give it away? Of course they are in it for money, and they can keep what they find to themselves. It's not like by being the first to sequence them they get the exclusive rights to the genes.

    Liken it to a translation of an old text such as Vergil's Aeneid. Anyone can translate it and sell the translation -- it is still an original work. There are also many translations, some more correct than others.

    In the same way, they are translating the genome and retaining the rights to that translation. Nothing is preventing someone else from spending the time to get their own. In addition, how do you know that what they present is correct? If you need accuracy, do it yourself. The genes are in the public domain: you have your own copy!

    Information like this should be free, but it doesn't have to be.
  • by wowbagger ( 69688 ) on Thursday April 06, 2000 @08:49AM (#1147694) Homepage Journal
    Contents of a writ delivered to Celera. I won't say how they were leaked...

    To: Celera, Inc.

    From: God

    Your attempts to reverse-engineer my closed source project "man" are in violation of federal law. The data encoded in the media "DNA" are encrypted, and in circumventing that encryption you are in violation of the Digital Millennium Copyright Act.

    Cease and desist all further attempts to decode this information, and destroy all copies. Do not disseminate this information, and notify all mirror sites to do the same.


  • by Frank Sullivan ( 2391 ) on Thursday April 06, 2000 @07:02AM (#1147695) Homepage
    Oh give me a clone
    Of my own flesh and bone
    With her Y chromosome changed to X
    And when she is grown
    My very own clone
    She will be of the opposite sex (hurray!)

    Clone, clone of my own
    With her Y chromosome changed to X
    And when she is grown
    Since her mind is my own
    She'll be thinking of nothing but sex!

    (written by Isaac Asimov)
    __
    (oO)
    /||\
  • by orpheus ( 14534 ) on Thursday April 06, 2000 @07:49AM (#1147696)
    It's very important to make the distinction between mapping and sequencing.

    SEQUENCING means creating a complete list of the nucleotides in order. If you had this information, you could actually synthesize the entire genome of the individual. [There are some sophisticated niceties like methylation that distinguish the synthesized version from one extracted from a human, but it's essentially complete.] There are other factors (like which regulatory binding sites are actually bound, by what proteins; exact state of histone supercoiling, etc.) that control gene expression enough to keep this from being a working human genome, but it's awfully close.

    MAPPING means determining distances between known genes. Using this information, you can deduce where the various genes are, the approximate location of specific unknown genes, and many other useful facts. A detailed map is a good starting place for hunting down a gene, so you can locate and sequence it; it also can tell you what traits are likely to be inherited together, etc.

    A "sequence" is a complete blueprint (though there are details that aren't covered by sequence alone). A map is like a geographical map that shows where all the cities and large towns are. There are still many factories, facilities, and industrial complexes off that map -- not to mention all the roads, rivers, mountains and utility lines, etc.

    A sequence is a lot more information, and a wonderfully compact database - at 2 bits per base pair (4 possible cases), you could fit a complete human genome in under a gigabyte. (That's only one human, however.)
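
    To make the 2-bits-per-base arithmetic concrete, here is a minimal sketch in Python. The particular bit assignment (A=0, C=1, G=2, T=3) and the function names are my own assumptions for illustration, not any standard file format; it also ignores ambiguous bases such as N.

```python
# Toy 2-bit packing of a DNA string. The code table below is an
# arbitrary assumption for illustration, not a standard format.
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq: str) -> bytes:
    """Pack a string of A/C/G/T into 2 bits per base, 4 bases per byte."""
    out = bytearray((len(seq) + 3) // 4)
    for i, base in enumerate(seq):
        out[i // 4] |= CODES[base] << (2 * (i % 4))
    return bytes(out)


def unpack(data: bytes, length: int) -> str:
    """Recover the first `length` bases from packed data."""
    return "".join(
        BASES[(data[i // 4] >> (2 * (i % 4))) & 3] for i in range(length)
    )


def packed_size_bytes(n_bases: int) -> int:
    """Bytes needed to store n_bases at 2 bits each (rounded up)."""
    return (n_bases + 3) // 4


if __name__ == "__main__":
    # 3x10^9 bases at 2 bits each: 750,000,000 bytes, under a gigabyte.
    print(packed_size_bytes(3 * 10**9))
```

    The length has to be stored separately, since the last byte may be only partly used.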

    Naturally, even once we had the genome (or preferably a few thousand individuals, to let us get a real handle on variations), we could still spend decades or centuries figuring out what it all meant. 3x10^9 bases is a lot of info. You thought it was hard trying to trace western civilization in the first million digits of pi.

    I am not a Molecular Biologist - anymore. But I was, about 10 years ago.

    __________

  • by seaneddy ( 121477 ) on Thursday April 06, 2000 @07:40AM (#1147697) Homepage
    If we use Celera's definition of "complete" then the public project is already done too.

    Any reasonable person would define "complete" as this: there are three billion bases of human DNA in 24 different linear chromosomes. The sequence is complete when you can give me a DVD with 24 files on it, each of which contains a contiguous sequence of a human chromosome.

    That may never happen for any large animal or plant genome. Too many regions of a genome sequence are an ungodly mess, repetitive and difficult to sequence.

    The public worm (C. elegans) project, at 98 million bases, defined "essentially complete" as "we've come as close as we can to complete using existing technology". We have 97 million bases sequenced and about 50-100 remaining gaps.

    The fly (Drosophila melanogaster) project, at 180 million bases in size, was recently declared "substantially complete" by Celera. They have 120 million bases of sequence, with several thousand gaps. The fly has more extensive regions of repetitive sequence than the worm.

    The human, at 3 billion bases in size, is nowhere near complete, either by the public (us) or by Celera, no matter what Celera press releases say.

    You need the following steps to get close:

    1. shotgun coverage. Technology limits us to reading ~500 bases of sequence at a time, so we have to blow the genome to bits, sequence millions of fragments, then assemble it all back (computationally) into a contiguous sequence. Because a successful assembly relies on deeply redundant overlap amongst the fragments, we need ~8-10x shotgun coverage (24 to 30 billion bases) to try to assemble the human genome. The fly genome was shotgunned to 12x coverage to achieve the results Celera reports.

    2. Assembly. Once you've got shotgun data, you can try to assemble the genome from those fragments.

    3. Finishing. The automated assembly (like the fly genome now) will have a great number of gaps. These must now be closed, more manually, by expert molecular biologists; the gaps represent regions that are biologically difficult to sequence.
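
    The three steps above can be sketched in toy form. This is a deliberately naive greedy overlap merge in Python, not the assembler Celera or the public project uses; the function names and the `min_len` parameter are hypothetical. It assumes error-free reads from one strand and no repeats, which is exactly what real genomes do not give you.

```python
import random


def shotgun(genome, read_len, coverage, rng):
    """Step 1 (toy): sample random fixed-length reads at the given coverage."""
    n_reads = coverage * len(genome) // read_len
    reads = set()
    for _ in range(n_reads):
        start = rng.randrange(len(genome) - read_len + 1)
        reads.add(genome[start:start + read_len])
    return sorted(reads)


def overlap(a, b, min_len):
    """Longest suffix of a equal to a prefix of b, or 0 if under min_len."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len):
    """Step 2 (toy): repeatedly merge the pair with the longest overlap.

    Returns the remaining contigs. More than one contig means there are
    gaps left for step 3 (finishing).
    """
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads
    # Drop reads wholly contained in another read.
    contigs = [r for r in contigs
               if not any(r != o and r in o for o in contigs)]
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j) and c not in merged]
        contigs.append(merged)


if __name__ == "__main__":
    reads = ["AAACCCG", "CCCGGGT", "GGGTTT"]
    print(greedy_assemble(reads, min_len=3))
```

    With too few overlapping reads the loop stops early and returns several contigs, which is the toy version of the gaps that finishing has to close.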

    The actual science behind the Celera press release is that they have partially completed phase 1. They currently have 4-5x shotgun coverage of the human genome, about half of what they need for a proper assembly. They intend to get the other 4-5x coverage from the public "rough draft", which is at about the same stage Celera's project is in.
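
    A back-of-the-envelope version of those coverage numbers, as a Python sketch. The 3 billion base genome and ~500 base reads come from the figures above; the uncovered-fraction estimate uses the idealized Poisson (Lander-Waterman) model, which is my assumption here and ignores repeats and cloning bias, so treat it as optimistic.

```python
import math

GENOME_BASES = 3_000_000_000  # human genome size used in this comment
READ_LEN = 500                # approximate bases per sequencing read


def bases_needed(coverage, genome=GENOME_BASES):
    """Total raw bases to sequence for a given fold coverage."""
    return coverage * genome


def reads_needed(coverage, read_len=READ_LEN, genome=GENOME_BASES):
    """Number of reads for a given fold coverage (rounded up)."""
    return -(-coverage * genome // read_len)


def expected_uncovered_fraction(coverage):
    """Idealized Poisson estimate of the fraction of bases hit by no read."""
    return math.exp(-coverage)


if __name__ == "__main__":
    for c in (4, 8, 10):
        print(c, bases_needed(c), reads_needed(c),
              expected_uncovered_fraction(c))
```

    At 8x to 10x this gives 24 to 30 billion bases, or 48 to 60 million reads; at 4x it is half that.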

    The two projects (Celera and public) are neck and neck in this "race". The difference is that we acknowledge that our sequence is a rough draft at this stage; whereas Celera claims that their sequence is complete. Celera has every right to spin their project to their investors any way they feel is appropriate, but scientifically, they are being rather disingenuous if not dishonest.

    conflicting oblig. disclaimers: I'm a co-PI on the public project, and I (accidentally, through an acquisition) also hold substantial stock in CRA.

  • by Signal 69 ( 159601 ) on Thursday April 06, 2000 @07:11AM (#1147698)
    Yeah, everybody's genes are "different." However, DNA is composed of "introns" (non-coding intervening sequences, which make up a majority of genes and don't code for anything) and "exons" (expressed sequences, which do the coding).

    Although DNA fingerprinting is mostly accurate, it is based on differences in the introns, which are highly variable. As far as exons are concerned... You've probably heard that chimps and humans are 98% or so genetically similar, and humans and hamsters are 95% genetically similar.

    If you compared the genes (exons) of any 2 people, you'd find them to be 99.99999999% or more similar. The differences are very slight. What makes people unique is not the genes so much as which ones are expressed.

