Biotech Science

Celera Opens Up DNA Database 181

greenplato writes "Thirty billion base pairs from the sequences of humans, mice, and rats that were available only by subscription to Celera's DNA database are being put into the public domain. Celera will donate this information to a 'federally run database,' presumably GenBank. Francis Collins, head of the National Human Genome Research Institute, notes that 'data just wants to be public.' Stories in BusinessWeek and The New York Times."
This discussion has been archived. No new comments can be posted.

  • Shouldn't that be (Score:5, Insightful)

    by Spetiam ( 671180 ) on Saturday April 30, 2005 @09:26PM (#12395365) Journal
    Shouldn't that be "data want to be free?" :)
    • "Shouldn't that be "data want to be free?" :)"

      Here in the USA, no. Yet another reason why it doesn't pay to be a grammar nazi.
    • Or actually, I was thinking something more along the lines of, "All your DNA are belong to us."
    • Shouldn't that be "data want to be free?" :)

      Okay, it's probably just me but when I read that I had a vision of Brent Spiner rattling the bars of a cage yelling "Picard, get your bald ass down here, Data want to be free!"

    • The problem is, the researchers spent too much time studying the "free as in beer" part, and were much too drunk. Besides, they didn't feel like inscribing the GPL into just four base pairs.

      (On the flip-side, this is excellent news. Researchers have a long history of putting things in the public domain - they have been the main driving force behind the idea - and it is most excellent that commercial researchers are beginning to realize that this isn't purely by chance.)

    • In British English, yes. In American English, no. Data is a group plural, not a plural.

      In British English, the populace want to be free. In American English, the populace wants to be free. Limeys think that a collection is a set; Yanks think it's a singular.
  • by Anonymous Coward on Saturday April 30, 2005 @09:33PM (#12395405)
    Francis Collins, head of the National Human Genome Research Institute, notes that 'data just wants to be public.'

    Data hates when you anthropomorphize it.
  • by chriswaclawik ( 859112 ) on Saturday April 30, 2005 @09:39PM (#12395446)
    Considering the millions of dollars that Celera invested in gene sequencing, it should at least have the opportunity to make back that money. Heaven forbid, they might even deserve to make a PROFIT. Profit is a leading motivation of many corporations, you know...
    • Besides a profit, they should at least recoup their costs.

      If that has already happened, then I can see why they are releasing the information.

      ....

      Okay, I just RTA and it turns out that the subscriptions just weren't profitable enough to keep going.

    • by h4rm0ny ( 722443 ) on Sunday May 01, 2005 @04:25AM (#12396955) Journal

      Considering the millions of dollars that Celera invested in gene sequencing, it should at least have the opportunity to make back that money.

      If he were creating something new then perhaps, but it was just a land grab. The DNA was there and they tried to patent as much of it as possible. It reminds me of the Eddie Izzard skit when the Europeans claim America and the Indians say, "but it's here, you know, we're using it, how can it be yours?" And the Europeans say, "but ah, have you got a flag?"

      Replace flag with patent. You might as well say that the Spaniards spent a lot of money colonizing Peru so they deserved all the gold. This is DNA! It belongs to no individual or corporation. I want access to my source code for whatever purposes I choose.
  • Oh No! (Score:5, Funny)

    by Anonymous Coward on Saturday April 30, 2005 @09:41PM (#12395453)
    They've open sourced me! Does this mean I have to call myself GNU/Steve?
  • by Krankheit ( 830769 ) on Saturday April 30, 2005 @09:41PM (#12395454)
    Hasn't much of the human genome been patented by greedy companies?
    • Hasn't much of the human genome been patented by greedy companies?

      In a word, no.

      You can't generally patent "found" sequences. You have to create or assemble something novel. The raw sequence of the human genome is not patentable. Inserting novel or transgenic genes into the human genome might be, but that's still science fiction.
      • by the gnat ( 153162 ) on Sunday May 01, 2005 @12:07AM (#12396147)
        You can't generally patent "found" sequences.

        I wish that were not the case. However, there are many gene patents in existence. The trick is that now you have to show a function for that gene - although bioinformatics is sophisticated (or rather, automated) enough that you can come up with a plausible-sounding function without ever doing benchwork.

        What's really being patented is the medical application of these sequences. For instance, Company X discovers that gene Y is overexpressed in cancer Z. They take out a patent on gene Y based on this discovery. That means that no one else can pursue gene Y as a therapeutic target. Moreover, in one case testing for a specific mutation to detect cancer was covered by a patent. This is a very simple piece of labwork being covered, which any competent cancer researcher could have figured out.

        The end result is that patents are being awarded for hard work, not for novelty and invention. Throw enough money at a subject, and you'll get data but not necessarily results. Since companies (or academics) can now patent just the data, if someone else gets "lucky" and comes up with an actual result the patent holders can sue the tar out of them if they try to make money off it. (Or even if they don't, as in the case of the breast cancer gene; the company wanted people to pay three times as much for its own testing kit.)

        You may soon be able to patent single-nucleotide polymorphisms (SNPs), which may be involved in differential drug responses. Back when I was in college we had a guest lecturer who was a biotech patent attorney, and he said he thought SNPs should definitely be patentable. In any case, there is a world of difference between patenting a cancer drug, and patenting a gene (or a FUCKING POINT MUTATION) that may, in the future, be a drug target.

        Since most of the human genome is noncoding, I suspect it will be harder to patent pieces of it. I also suspect that some asshole will try anyway.
        • The trick with patenting SNPs is that there are 10,000,000 common ones, so that's a lot of money to spend on patents to 'cover' your disease. The number of true positive, believable association studies using SNP data is still very low (we're up around ~10 or so now, which is far better than the 2-3 we had a few years ago.)

          So, that's 10 SNPs to patent - except most of them were published in papers coming out of academia, so they can't be patented. Now, if you can create a drug that acts to affect the changed
    • No. You cannot patent something you didn't create; the whole "patenting the human genome" thing is nonsense.

      Celera spent the time to categorize and sort the readings of thousands of human individuals into a comprehensive statistical analysis of the genome, and then sold the results.

      They're no more evil than the Encyclopedia Britannica.
  • Again? (Score:5, Insightful)

    by zappepcs ( 820751 ) on Saturday April 30, 2005 @09:44PM (#12395474) Journal
    FTA "DNA database are being put into the public domain" Again, we find information and data that SHOULD be in the public domain, yet the patent office, government, and kickbacks protect those that stand to make money? It's time that we, as a populace, stand and shout for the rights of the public to information. Sure, there are those that say that without protection, such innovation would be stifled, and I counter with this... "should such efforts be in the public sector?" Through eminent domain, they can take your property, but if you are a business, there seems to be no such thing. I hear of companies giving to this charity or that... but none are giving to the charity of mankind?

    Information is power, and in this information age, it is time for those with the information to take power from those that would use it to extort finance and power from those that do not know better. All such information should be in the public domain. Knowledge of the human genome, of anything that affects ALL of us, should be public information. For instance, any method of retrieving emergency information during an emergency should be in the public domain, not a subject of patent worthiness. The entire point of 911 service is to aid the community, not bilk them of dollars.

    The entire point of scientific discovery is to learn and advance humankind... when it becomes simply a method of making money, the advancement of humankind goes in the trash like yesterday's junk mail. At that point, what is the point of funding science? Think bigger than your new BMW. This might seem altruistic, but what is the point of discovery if your only reason to share is profit? When do you lose respect, when do you stop having authority? The ONLY method of advancing the human race is through sharing, through communal discovery. Perhaps this will advance that purpose, perhaps it won't.
    • I completely agree that unrestricted access is optimal. I am thrilled about the growing trend for (at least CS) researchers to put their work where anyone can access it.

      However, until people are willing to pay for research to be done for the common good things will not change. Given the severe underfunding of the NSF and other agencies it is clear that the public does not care about the current situation.

      So if the public is unwilling to fund research and there is no IP protection to encourage the priva
    • Re:Again? (Score:4, Insightful)

      by Saeed al-Sahaf ( 665390 ) on Saturday April 30, 2005 @10:20PM (#12395649) Homepage
      Yes, yes, yes... But who is going to fund all this discovery? If it's "the public", then of course "the public" should be able to access it (although I don't think most of us could make much use of it), but if on the other hand it is some private concern that is doing the research, then they have every right to obtain value from their investment. That the sequences are being put into the public domain is a great thing for Celera to do. If they want something out of it, I see no problem with that, I'm sure they spent a lot of $$$ to do the work.
      • It's kinda hard, isn't it?

        I think that on these things, companies should be given limited access - perhaps for a few years, so that they can capitalize on their investment. After about 5 years or so, they'd better make it public domain.

        Of course, in that case, companies will wait for a good while before making it public that they indeed do have the data.
        • I think that on these things, companies should be given limited access - perhaps for a few years, so that they can capitalize on their investment. After about 5 years or so, they'd better make it public domain.

          Great idea, isn't it? It's called "patents", and they thought of it a while ago. The problem is mostly with the current implementation.

          • Mr. Genius, you can't copyright "facts" - only methods for obtaining those facts, which these companies do anyway.
            • Just to be picky, you can't Copyright methods either.

              You can Copyright your writings about a fact, or your pictures of that fact, or your rantings about your discoveries of the fact, but not the fact itself.

              You could trademark a fact, or even patent a method for discovering a fact.

              However, most companies depend on trade secrets and licenses for these situations.
              • Ah, my bad - I meant you can't patent facts, only the methods. Didn't realize I had said you can't copyright.
    • Again, we find information and data that SHOULD be in the public domain...

      Are you sure you don't want to add "make love not war" to your rant?

      The data generated would not EXIST had not investors (read people) put millions of dollars into the company to hire the researchers, buy the equipment, and develop and analyze the data. Odd that, at some point, they'd hoped to get their money back.

      Some people, unlike most here it seems, understand that INFORMATION is not free, that it costs time and money and

      • Yup. If the parent wants the data to be free then he can pay for it or do it himself. This was a purely private effort, and the company has every right to keep the data private.
        And there WAS a public project to sequence the human genome which did rather well. If you want the data to be public then the public has to pay for it, or have some altruistic individual pay for it. Can't get something for nothing.
        The parent really should keep the following in mind: if the data wasn't private then there would be no Celera data (s
    • Again, we find information and data that SHOULD be in the public domain

      Why should it? They spent tremendous amounts of effort and money discovering and cataloguing that data. Should the Britannica be public domain?

      You could always sequence the genome yourself; nobody's stopping you.
  • One problem... (Score:1, Redundant)

    by symbolic ( 11752 )

    Who holds the patent for "viewing alpha sequences comprised of the letters G, A, T, and C, superimposed on a dual helix-shaped structure...on the internet"?
    • I don't know... but I do know the fellow who has the patent on viewing said data by means of a graphical or textual representation through a network consisting of interconnecting computational devices...
  • Curious (Score:4, Interesting)

    by Sparr0 ( 451780 ) <sparr0@gmail.com> on Saturday April 30, 2005 @09:48PM (#12395500) Homepage Journal
    I wonder why something like this isn't inherently unprotectable, like the contents of the phone book. A DNA sequence is, after all, simply a record of an existing state of things, NOT an original work (barring genetic engineering, which this isn't). If I take your phonenumber/basepair book and reproduce it... have I broken any laws (apparently the answers are no and yes, in that order)? The precedent for this has existed for decades.
    • Copyright has never protected ideas, only expressions of an idea in a fixed medium. So, yes, a phone book can be granted copyright protection but the phone numbers themselves cannot be copyrighted since they are not original ideas. Gene sequences may not be granted copyright protection (they can be patented in the US), but the database and the way it presents this information can be granted protection. The important thing to remember, and why reading any discussion about copyright on Slashdot is extremel
      • Yes, but the expression of an idea that is a movie also effectively copyrights the idea, because any future expressions of that same idea can be argued to be derivative works of the original movie. However, this same extension has been held to NOT apply to phone books, where the secondary works are most definitely derivatives of the original (insofar as the idea (database) only exists publicly in the form of the particular original expression).
    • Re:Curious (Score:3, Interesting)

      by John Hasler ( 414242 )
      > I wonder why something like this isn't inherently
      > unprotectable

      The data itself was never protected in any way: you've always been free to read your own DNA. The database that Celera owned was protected as a trade secret. You could only look at it after signing a contract in which you agreed not to disclose what you saw.
      • The database that Celera owned was protected as a trade secret.

        And under copyright. Anyone else is free to duplicate a private genome database if they're willing to spend millions of dollars on sequencing. However, you couldn't take someone else's proprietary database and redistribute it. I assume the trade secrets were any specific annotations that Celera had made - for instance, you couldn't subscribe and then start blabbing about their annotations, or re-annotating the public database based on their
    • It is fundamentally unprotectable. They're not stopping you from re-sequencing the genome. They're just saying that you have to pay for their copy. You can't copy the Yellow Pages either; you have to start from scratch.

      Please make the distinction between copyrighting the data and copyrighting one instance of the data. I can copyright a photograph of a magnetic field, if I want to. That doesn't stop you from making one, but it does stop you from copying mine.

      The only difference here is the tremendous
  • by Anonymous Coward on Saturday April 30, 2005 @09:51PM (#12395513)

    I work for a biotech company with a database which we've been trying to sell subscriptions to for a few years. The prevailing experience with trying to sell the database is that people are very reluctant to shell out the cash to access the data.

    I think this is a symptom of trying to sell data to academic institutions. The problems with selling to academic institutions are twofold. Firstly, the universities don't have the cold hard cash to spend on the databases, so any cost over free is too expensive. Secondly, there is the free/open culture within universities that almost punishes commercial ventures for trying to build a business around adding some kind of value to the data (such as convenience or quality of data).

    Because of the lack of sales for this database, we're considering handing the data over to a large government body so that they can maintain it, because the company simply can't afford to maintain the database - it costs a lot of money to hire talented people to do database curation.

    So when Celera say that "data wants to be free", I think they mean "We'd sell you this data to try and recoup our investment, but we're resigned to the fact that you're not going to buy it".

    • We'd sell you this data to try and recoup our investment, but we're resigned to the fact that you're not going to buy it


      It's a wonder that A) Celera hasn't started suing other parties with similar datasets or B) The **AA hasn't validated this line of reasoning and stopped suing filesharers.
    • by the gnat ( 153162 ) on Sunday May 01, 2005 @12:21AM (#12396218)
      Secondly, there is the free/open culture within universities that almost punishes commercial ventures

      I would not have stated it that way. The real reason is that academics hate to leave anything unpublished. If they're constrained by copyright law or some NDA, they can't tell everyone about the fabulous new work they've been doing - or at the very least, it becomes much more difficult.

      I worked in bioinformatics at a university for several years, and much of what we did was take existing databases and analyze them, then publish the results online as our own database of annotations. As part of this, we reproduced much of the original database in modified form - and all we had to do was cite the original authors and describe our methods/sources. If the databases we used had not been public, none of these projects would have happened. In some cases, we had to ignore private databases that we had limited access to because we were not allowed to reproduce any of their data.

      This is only cultural to the extent that academia thrives on publications. We're not out to punish anyone for trying to make an honest buck (lots of people here collaborate with or consult for companies), but we literally can't afford, professionally, to limit ourselves in accordance with restrictions on databases. So why pay money for something we can't legally use in the manner to which we're accustomed?
  • Sure the public can view the DNA but did Celera surrender the patents too??
  • Finally... (Score:2, Interesting)

    by nxtr ( 813179 )
    Now what do I do with it?
  • I wonder if SCO have threatened to sue them?

    Personally, I think the real reason is the companies can't make a profit by simply having the "standard definition" and it's effectively useless to them.

    To 99.99999% of the population, these base pair sequences could be random bits, and we wouldn't know a chromosome if it came up and bit us on the ass.
    They are holding a single sample of data, when in reality what's needed is the variation patterns based upon this starting point. We could start to see just how di
  • by MillionthMonkey ( 240664 ) on Saturday April 30, 2005 @10:13PM (#12395616)
    I hear what you're saying about academic institutions. They're incredibly whiny and expect everything to be free. We make very little money off of them, and they consume a large share of tech support, but we go out of our way to be nice to them because many of the same people later pop up in pharmaceutical companies in control of large quantities of cash.

    Celera saw the writing on the wall. Everyone is using the public reference assembly because it's free, and in terms of contents the two are merging toward a complete consensus as they approach total coverage. You can only make money selling this kind of information while vast portions of the genome remain unknown or unavailable, and that's not true anymore.

    Plus using a different assembly than other researchers cuts you off. When we import data from dbSNP, for example, we regularly drop references to positions specified in reference to Celera contigs. (Not much of a problem, since they're in the vast minority.) The Celera assembly has not been freely downloadable and redistributable, and we haven't been including a copy of it in our software (we always include a current public assembly build). Now that this has happened, I think the next build of the public assembly is going to be really good.
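    The import-time filtering described above could be sketched roughly like this. The record layout, the `assembly` labels and the example IDs are all made up for illustration; this is not the dbSNP format or anyone's actual import code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnpPosition:
    """One reported position for a SNP (hypothetical record layout)."""
    snp_id: str
    assembly: str  # hypothetical label, e.g. "public" or "celera"
    contig: str
    offset: int


def keep_public_positions(positions, accepted=("public",)):
    """Drop positions given relative to an assembly we do not ship."""
    return [p for p in positions if p.assembly in accepted]


if __name__ == "__main__":
    # Made-up records, for illustration only.
    records = [
        SnpPosition("snp_a", "public", "contig_1", 1042),
        SnpPosition("snp_a", "celera", "contig_x", 977),
        SnpPosition("snp_b", "public", "contig_2", 55),
    ]
    for record in keep_public_positions(records):
        print(record.snp_id, record.contig, record.offset)
```

    Once the other assembly is redistributable, the only change in a setup like this would be adding its label to `accepted`.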
  • No more security through obscurity... and if they do have security patches for me, I would rather not have to recompile.
  • Whose DNA is it?
  • I'll tell you exactly what it wants. Human genome data wants to be anthropomorphised.
  • by $exyNerdie ( 683214 ) on Saturday April 30, 2005 @10:29PM (#12395695) Homepage Journal

    Excellent PBS video on race between government and Celera to crack the human genome:

    http://www.pbs.org/wgbh/nova/genome/program.html

    Mirrors please..

  • Here we go... (Score:1, Interesting)

    by Anonymous Coward
    ..with the typical /. groupthink. Everyone around here would like to think that the genome sequence should be free to the public. And liken this to open source software. I don't disagree with this. However, we must remember that one can sell a service. An annotated database of the Genome sequence is a service. Although it doesn't contain unique "created" data, annotation and organization is a huge undertaking in itself. Yes, it's horrible that a company invested money and resources towards capitalizing on so
  • I spill^H^H^H^H^H^H open up my DNA database everyday!
  • or he'll write a bill preventing the data from being released.

    Oh wait, there's no corporation for him to whore himself out to. Maybe this will actually see daylight.

  • by MagicDude ( 727944 ) on Saturday April 30, 2005 @11:11PM (#12395905)
    Here's a copy of the data

    acgcggcgatgcgtacatagctagcgctgcatagatcgactatgacgatt atgactgatcggtagcatatattatgctatagctagcgtgtagctagtat cacatcagctactatgtagctacgatcgagcacactgactacgtagctag tagcggatcgatagctgatctgactgactatatatagcgcgcgatatata gcgcgtagatcgtagccgcgcgatgatatataaggagactgactagc...
  • Does anyone remember the story of the hacker that actually wrote the code that cracked the genome sequencing problem? He is the unsung hero of this whole private vs. public debacle. He wrote a 10,000 line C program to do the sequencing in "rafts" and "contigs" in the space of a few days -- and had to ice his wrists from all the work... it was because of his brilliant work that the race went from being a 20-year thing to a 3-year thing, and of course nobody knows his name. (And I've forgotten it.)
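    For anyone wondering what "contigs" means here: overlapping reads get merged into longer contiguous stretches. A toy sketch of that idea follows. It is a plain suffix/prefix overlap merge written for illustration, with made-up function names, and is not the program the post is talking about.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_reads(left, right, min_overlap=3):
    """Merge two reads into one contig, or return None if they do not overlap enough."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]


if __name__ == "__main__":
    print(merge_reads("ACGTTAGC", "TAGCGGA"))  # the reads share "TAGC"
    print(merge_reads("AAAA", "CCCC"))         # no overlap, so no merge
```

    Real assemblers also have to cope with sequencing errors, repeats and reverse-complemented reads, none of which this handles.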
    • I'm afraid that sounds awfully like a romantic myth. There was a multitude of reasons why the HGP went from 20 years to 3 years (and that depends on where you begin counting), but it was not because of any one person. It was first begun in England, and before most people considered it prudent to do so. The sequencing machines were fairly cumbersome to prepare, the analysis lacked many of the automatic programs available today, and the money was lacking.

      However the progenitors had the foresight (rightly s

    • by Anonymous Coward
      Not to discount Jim Kent, but your post is riddled with errors. The "race" you speak of was really just an ego thing, anyway. Neither the public nor private sequence is technically "done" yet even today. Don't believe me? Look at the sequence, the tens of thousands of N's you see aren't supposed to be there. If there ever was a race, it's still on - it's just not covered in the news.
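      Checking for those N's is easy enough to do yourself. A minimal sketch, assuming the sequence is in plain FASTA text (header lines start with ">"); the function name is just for illustration:

```python
import re


def count_gap_runs(fasta_text):
    """Return (number_of_N_runs, total_N_bases) for FASTA-formatted text."""
    # Join the sequence lines so a run of N's split across lines counts once.
    sequence = "".join(
        line.strip()
        for line in fasta_text.splitlines()
        if line and not line.startswith(">")
    )
    runs = [match.group(0) for match in re.finditer(r"[Nn]+", sequence)]
    return len(runs), sum(len(run) for run in runs)


if __name__ == "__main__":
    toy = ">toy_sequence\nACGTNNNNAC\nGTNNAC\n"
    run_count, n_bases = count_gap_runs(toy)
    print(f"{run_count} gap runs, {n_bases} unknown bases")
```

      For a whole chromosome you would want to stream the file rather than join it into one string, but the counting logic is the same.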

      Jim Kent did not sequence anything. Big machines run by lots of people around the world bought with your tax dollars did that
  • by glwtta ( 532858 ) on Saturday April 30, 2005 @11:39PM (#12396040) Homepage
    I supposedly do this crap for a living, and I find out about this from slashdot.

    Anyway, Celera seems to epitomize the way large projects like this become free: they sink billions upon billions of dollars into a project which is soon supplanted by a better free (though, of course, government funded) alternative, and after years of unsuccessfully trying to sell it, release it for free for a bit of good PR.

    But then again, they've made a huge contribution to the field overall; Craig Venter may be an arrogant prick, but he gets shit done, while Francis Collins mostly waxes poetic about the bright future of genomics.

    Well, that seems like enough venting about the sad state of research.

    • didn't it work the other way around though? The government project had had billions of dollars sunk into it and over fifteen years to completion when this private company came along and looked like it was going to finish in only a couple.. both projects finished ahead of schedule if I recall.
      • Celera could not have finished their project without directly using data from the government one (there are plenty of articles out there explaining the differences and complements in their methodologies far better than I could).

        More important are the sequencing techniques that were developed in that first decade. They are a far more important contribution to the field than the completion of the one genome (which is really just a lot of very tedious work).

  • Craig's sequence? (Score:1, Insightful)

    by Anonymous Coward
    Craig Venter better hope his health/life insurance company doesn't take a closer look at the sequence and drop him for "pre-existing" conditions.

    In all seriousness however, Celera's sequences essentially suck anyway. The public projects have handily beat them and their sequencing methods have been deemed inferior (see last October's issue of Nature). They are not adding any scientific value by releasing their versions of these three genomes.
  • by Sentriculus ( 880382 ) on Sunday May 01, 2005 @01:10AM (#12396405)
    Someone has probably already pointed out that human DNA contains 3 billion base pairs and not 30 billion. It is a sad shame that a company as renowned as Celera is overshadowed by blatant misinformation; even from former CEO Craig Venter who is known for calling archaea a type of bacteria in the December 2004 issue of SCIENCE magazine. Mishaps like this further alienate the real intellectuals who would normally be capable of over-running the Internet towards an information rapture in the scientific community.

    -Bio major/Nerd
    • ... humans, mice, and rats

      The information was most likely taken from a press release by Celera. Press releases tend to lean toward hyperbole so long as it remains technically truthful. Either there were a heck of a lot of mice and rat genomes, which alongside the human totaled 30Gbp, or much of the data is redundant.

    • Err from TFS - "Thirty billion base pairs from the sequences of humans, mice, and rats that were available". So it's 3 organisms not 1. So that's, let's say, 9 billion for good measure. Let's say they also deposit their reads, contigs etc. independently; they could well be hitting 30 billion base pairs, couldn't they?

      Bionerd indeed.
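      A back-of-envelope version of that arithmetic, assuming rounded haploid genome sizes of roughly 3.1 Gbp for human, 2.7 Gbp for mouse and 2.8 Gbp for rat. Those sizes are approximations and are not taken from the Celera announcement.

```python
# Approximate haploid genome sizes in base pairs (rounded assumptions,
# not figures from the Celera announcement).
GENOME_SIZES_BP = {
    "human": 3_100_000_000,
    "mouse": 2_700_000_000,
    "rat": 2_800_000_000,
}

ANNOUNCED_TOTAL_BP = 30_000_000_000  # "thirty billion base pairs" in the summary


def implied_redundancy(announced_bp, genome_sizes):
    """How many times over the announced total covers one copy of each genome."""
    unique_bp = sum(genome_sizes.values())
    return announced_bp / unique_bp


if __name__ == "__main__":
    unique = sum(GENOME_SIZES_BP.values())
    fold = implied_redundancy(ANNOUNCED_TOTAL_BP, GENOME_SIZES_BP)
    print(f"one copy of each genome: {unique / 1e9:.1f} Gbp")
    print(f"announced total is about {fold:.1f}x that")
```

      So under these assumptions the 30 billion figure only works if the deposit includes a few-fold redundancy, such as separate reads and contigs on top of the finished assemblies.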
  • Anyone interested in the HGP should grab one of the many books written on the subject. I highly recommend The Common Thread by John Sulston & Georgina Ferry. Nobel laureate Sulston was there right at the very beginning of the public project, and could take a great deal of the credit for the project's existence if he was less than the disinterested scientist he is.

    The book is very readable, and from my own experiences rings of the truth.

  • It's already free (Score:5, Informative)

    by jezmund ( 102188 ) on Sunday May 01, 2005 @02:57AM (#12396756) Homepage
    Genomes are available at http://www.ensembl.org/ . I know I've said this before, but I feel it can't be overemphasized. Ensembl is so incredibly cool. I imagine Celera is releasing their data because no one wants to pay for it when Ensembl has it for free. Additionally, Ensembl has tools that provide so much more than just genome sequence-scanning. And they use open source projects like BioPerl and use Wiki for documentation! I think this is just a PR stunt for Celera.
  • 'Caldera Opens NDA Database!'

    OK, heart rate is lowering now...
