Medicine Biotech Privacy

New Encryption Scheme Could Protect Your Genome 78

Posted by samzenpus
from the keep-your-eyes-off-my-genes dept.
sciencehabit writes "As the cost of genetic sequencing plummets, experts believe our genomes will help doctors detect diseases and save lives. But not all of us are comfortable releasing our biological blueprints into the world. Now cryptologists are perfecting a new privacy tool that turns genetic information into a secure yet functional format. Called homomorphic encryption, the method could help keep genomes private even as genetic testing shifts to cheap online cloud services."
This discussion has been archived. No new comments can be posted.

  • New? (Score:4, Informative)

    by jbmartin6 (1232050) on Monday February 17, 2014 @05:44PM (#46270665)
    This isn't new [wikipedia.org], although the application with gene sequencing might be.
    • However, I suspect that every new application requires the method to be applied differently. Also, for every new application, other attack vectors might be possible so it is crucial to sort these out. Just thinking.

      • by buswolley (591500)
        Can I encrypt it in my own body? If not what is the point. I leave cells everywhere.
        • That requires a mitochondria upgrade at extra charge. Or else you might experience complications.
        • by Immerman (2627577)

          It doesn't protect you from that hot blond taking a strand of your hair to the local gene-scan station before going on a second date, but it does mean that the guy that hacked the NIH genetic database won't get the DNA of 400,000,000 people in one fell swoop. Though of course it probably also means that the NIH database will require thousands of times the storage capacity since de-duplication can't be applied to the massive genetic overlap between individuals.

          And the nosy blond could be mostly stymied by la

          • by tragedy (27079)

            Though of course it probably also means that the NIH database will require thousands of times the storage capacity since de-duplication can't be applied to the massive genetic overlap between individuals.

            The human genome is what? About 1.5 Gigabytes? That's a lot of data, but far from unmanageable. Store two copies for redundancy and you have 3 Gigabytes. Let's round down a bit and say you can get 600 people's DNA onto 2 TB worth of drives. Let's say you pay $120 per terabyte, then you're paying 20 cents per patient for two copies. Of course, this will be enterprise class storage for medical purposes, so let's say $4 per patient. Not exactly bank-breaking. Anyway, you haven't presented any good reason why y
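The back-of-envelope storage estimate above can be rechecked with a few lines of Python. This is only a sketch using the figures quoted in the comment (1.5 GB per genome, two copies, 2 TB of drives, $120 per terabyte, 600 patients); the variable names are made up for illustration. With those inputs the raw cost comes out nearer 40 cents than 20 cents per patient, which does not change the conclusion that storage is cheap.

```python
# Figures taken from the comment above; names are illustrative only.
GENOME_GB = 1.5        # approximate size of one genome
COPIES = 2             # two copies for redundancy
DRIVE_TB = 2           # total drive capacity considered
PRICE_PER_TB = 120     # dollars per terabyte
PATIENTS = 600         # rounded-down patient count used in the comment

# How many patients would fit before rounding down (1 TB taken as 1000 GB).
patients_exact = DRIVE_TB * 1000 / (GENOME_GB * COPIES)

# Raw drive cost per patient for both copies.
cost_per_patient = DRIVE_TB * PRICE_PER_TB / PATIENTS

print(f"patients that fit: {patients_exact:.0f}")
print(f"raw cost per patient: ${cost_per_patient:.2f}")
```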

            • by pepty (1976012)

              Researchers realized that the complex algorithms used during genetic tests could be closely approximated by the two basic mathematical operations. Lattice cryptology enabled homomorphic encryption, allowing computers to analyze encrypted data and return encrypted results without ever being able to decode the information.

              I can't see how it would be very useful for actual genetic research either, since researchers generally need the decoded information as well as personal and family medical history when interpreting the results. The 10^9 higher computational overhead would also be a huge problem in research since, unlike a medical test where you know a pattern and are just trying to find out whether a single sample matches it, you are instead trying to find patterns shared by a group of genomes associated with a similar med

            • by Immerman (2627577)

              I think the idea is mainly that you don't want to have to completely resequence your genome every time you want to test for something new - after all we could be discovering new medically-relevant genetic properties for centuries, and the company doing the sequencing doesn't necessarily know or care about every potentially interesting finding, so you keep it on file somewhere. If costs continue to follow the current trend the first factor will likely only be an issue for a decade or two, before the price d

              • by tragedy (27079)

                I think what they're really trying to sell in this article is saving everyone's data in a central repository where everyone's DNA could be mined for data without compromising their privacy. That's effectively impossible. The only way to do it would be to perform operations that examine the entire database to produce a single result. The required computing power/time would be astronomical under this model. Pretty much every other way of doing it allows you to narrow down a particular patient's DNA and extract

                • by Immerman (2627577)

                  Sounds like a good sales pitch, but how would homomorphic encryption enable such an anonymous data-mining paradise? As I understand it such encryption allows you to process the data without decrypting it, but the results are themselves encrypted with the same key. And if you have the key to access the results then you don't need the ability to process data without decrypting it.

                  I would assume that each DNA record has its own key (otherwise it kind of defeats the point), and that you can't mix the processi

                  • by tragedy (27079)

                    Sounds like a good sales pitch, but how would homomorphic encryption enable such an anonymous data-mining paradise?

                    Well partly by being effectively backdoored from the start. It seems unrealistic to believe there wouldn't be some sort of backdoor from the start to fix things when they break in the large, complex, impenetrable data set. After things are pretty stable, the developers will be reluctant to get rid of the back door because of the large number of times they would have had to rebuild entirely from scratch if they didn't have the back door, and it will hang around forever. Mostly, however, there's the simple fa

                    • by Immerman (2627577)

                      Why would an honest individual put in a back door in the encryption for "testing"? Just test with data you have the key to. Much simpler and doesn't inherently undermine the integrity of the system you're building. And how can things "break" within an immutable data file? When's the last time you saw a "broken" bitmap or text file that wasn't due to either a failed creation (probably not worth fixing), or corruption of the transmission or storage medium that can be solved with an error-correcting wrappe

                    • by tragedy (27079)

                      Why would an honest individual put in a back door in the encryption for "testing"? Just test with data you have the key to.

                      It doesn't take a dishonest individual. It's just fairly typical in such situations. It depends on who's actually in charge and if they run into problems.

                      Consider that the US nuclear launch codes were 00000000 for two decades. Consider that something like 30 billion dollars a day is spent in credit/debit card transactions using a system with effectively _no_ security. Consider the failing grade nearly all large organizations receive pretty much every time they are audited for security. Even when thei

                    • by Immerman (2627577)

                      Okay, yeah editing is an issue as well, but not one relevant to archives of immutable data.

                      Fair point about data dumps from focused studies, I'm sure some of them would indeed contain common elements that could open an attack vector, though I don't know how big a vector a few known bits in a 3MB file would actually make. Certainly nothing like having 99.8% be known. It probably wouldn't be racial studies that do it though, IIRC there's not actually any well-defined racial boundaries from a genetic perspecti

                    • by tragedy (27079)

                      Okay, yeah editing is an issue as well, but not one relevant to archives of immutable data.

                      True, but I'm not as confident as you that early versions of this will actually allow for immutable data. Avoiding all bugs that might require things to be re-encoded is a monumental task. Maybe they could pull it off. I would be truly, truly impressed.

                      It probably wouldn't be racial studies that do it though, IIRC there's not actually any well-defined racial boundaries from a genetic perspective - there's not even one single solitary gene shared by most black people that isn't also present in a lot of whites and asians (and vice-versa), we just travel and intermix too much. It only takes one person with a bad case of wanderlust a thousand years ago to introduce a gene into a large portion of an otherwise isolated population.

                      True. It depends a bit on the groups. Island populations, for example. Specific studies on people with a particular medical condition with a genetic link might be better example than people with particular ethnicities.

                      Yeah, I can't argue against the incompetence card. In fact that's why I think homomorphic encryption could be a wonderful thing for genetics - it means that the sequenced DNA need never be stored in plaintext anywhere outside the sequencing machine, not even in volatile memory while being analyzed. There's still the risk that someone gets their hands on both the key and data, but a single "never, ever keep these two things in the same place" security rule would go a long way towards protecting against that, and has at least a chance of being followed.

                      That is true. I think something like that

                    • by Immerman (2627577)

                      What is there to go fixably wrong? You sequence the DNA to a 1.5GB file - if there's any problem in that stage you're hosed already. Then you do a binary diff to your reference sequence - that's a pretty thoroughly mature technology. Then you encrypt it - again, any problems = you're hosed. And if we're working on the assumption that the lab has no access to the data once it leaves the sequencer as a 3MB encrypted file then they would be hard-pressed to fix anything in the data anyway, at most they coul

                    • by tragedy (27079)

                      What is there to go fixably wrong? You sequence the DNA to a 1.5GB file - if there's any problem in that stage you're hosed already. Then you do a binary diff to your reference sequence - that's a pretty thoroughly mature technology. Then you encrypt it - again, any problems = you're hosed. And if we're working on the assumption that the lab has no access to the data once it leaves the sequencer as a 3MB encrypted file then they would be hard-pressed to fix anything in the data anyway, at most they could reformat it into something more efficient to process, but that would seem a risky undertaking when you have no access the data to verify that you didn't just hose things completely.

                      Well that's pretty much the point. If the model of the system is so secure that you're hosed if anything at all goes wrong, most people are going to hedge their bets by putting in a back door so they can try to fix things. When you're going to have to tell your clients to redo millions of dollars of really expensive data entry if anything goes wrong, you're going to be under a fair amount of pressure to make sure that doesn't happen. One way to do that is to secretly break your security model. It happens al

                    • by Immerman (2627577)

                      I wish I could argue against your faith, but I've seen too many examples myself.

                      Think of this though - who is the customer for the DNA lab? Individual citizens on doctor's orders. And what exactly happens today if it turns out that there was a problem/something really unexpected with the last set of tests? Seems like mostly Doc sends you to get them done again. As long as that doesn't change with sequencing neither Doc nor the lab has much incentive to have the records around indefinitely, especially if

  • We can't even keep credit card information private, and that's not just a matter of someone else's privacy, it's a matter of actually losing money.

    What hope is there really of keeping your genome private if you are sending it across the internet?
    • by nurb432 (527695)

      Besides the 'internet security issue', it's not that hard for someone to get your DNA and test it themselves if they want it.

      • I was going to mention that, but I wasn't sure. Can you get a full genome sequenced from hair, or do you need a certain quantity of blood or something?
        • by Kjella (173770)

          I was going to mention that, but I wasn't sure. Can you get a full genome sequenced from hair, or do you need a certain quantity of blood or something?

          As far as I can tell you need full cells so hair that has been cut with a scissor no, but if you have a hair follicle pulled out by a hair brush that's enough. Any blood, saliva, semen or tissue sample will also do. a quick check suggests as little as 5 cells are needed so we're talking nanograms of material here.

          • "As far as I can tell you need full cells so hair that has been cut with a scissor no"

            Ah, the 1980s where Lex Luthor can clone Superman from a strand of his hair in Superman IV.
          • a quick check suggests as little as 5 cells are needed so we're talking nanograms of material here.

            Yup. Scientists discovered they could extract your DNA from your fingerprint ~2003. http://science.slashdot.org/st... [slashdot.org]

        • by Immerman (2627577)

          If it's important enough you can get a full DNA sequence from a single cell - DNA was designed to replicate, and it's not that hard to get it to do just that in the lab. If you've got hundreds/thousands/millions of cells then it makes it even easier since you can use "shotgun" sequencing techniques to accelerate the process dramatically. And that's still a pretty small sample - most animal cells are around 10-30um in diameter, so you're looking at 35,000-1,000,000 of the suckers in a 1mm cube sample.
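The cell-count range above can be reproduced with a rough estimate. The sketch below assumes each cell occupies a cube whose side equals its diameter, which is a simplification introduced here and not something stated in the comment; the function name is hypothetical.

```python
def cells_per_cubic_mm(diameter_um):
    """Rough count of cells in a 1 mm cube, treating each cell as a cube
    with side length equal to its diameter (a simplifying assumption)."""
    cells_per_side = 1000 / diameter_um   # 1 mm = 1000 micrometres
    return cells_per_side ** 3

for d in (10, 30):
    print(f"{d} um cells: about {cells_per_cubic_mm(d):,.0f} per cubic mm")
```

Under that assumption, 10 um cells give a million per cubic millimetre and 30 um cells give roughly 37,000, in line with the range quoted above.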

          Blood

      • by Immerman (2627577)

        If they're interested in *your* DNA specifically, no, technological measures won't stop it (though legally requiring licenses to possess gene sequencers and "informed consent" laws in regards to human DNA sequencing would go a long way towards holding back any GATTACA-esque abuses).

        On the other hand, how valuable would a database of thousands or millions of people's unencrypted DNA be?

    • by ubrgeek (679399)
      I'm still chuckling over the use of the words "private" and "cloud" in the same sentence...
      • "I'm still chuckling over the use of the words "private" and "cloud" in the same sentence..."

        Wow, that's a quote that should go on the wall in every corporate board room.....

    • Information wants to be free.
    • So true. But DNA security is more than an issue of privacy. In the near future, understanding the human genome will make possible developing bioweapons targeted at individuals (with collateral damage) as well as bioweapons that could probably kill all humans exposed to the pathogen (like Ebola). We have, up to now, been protected by the obscurity and complexity of the issue. With advanced computers, vast data collection, and improved scientific understanding, creating individual and global bioweapons will

  • What's wrong with AES256 for protecting my Gnome?
  • he said homo
  • Encryption can be broken, especially the kind that exposes useful information about the plaintext as this one does. A much simpler alternative is to keep your genetic information in your own control, processing it on your own computer with open source software. You know, just what we already do with other sensitive information like passwords.

    • It doesn't expose any information about the plaintext. It exposes an interface which lets you manipulate the plaintext. Not the same thing.
    • Re:Keep it (Score:4, Informative)

      by LargeMythicalReptile (531143) on Monday February 17, 2014 @07:20PM (#46271403)

      Hi. I'm a theoretical cryptographer.

      Encryption can be broken,

      Some implementations have been broken. Encryption itself is generally fine (as long as you go with well-studied, standardized methods). There is a point that encryption is always subject to real-world factors, but the most common libraries are pretty good. Whenever you read about a data breach in the news, it's not because encryption was broken--something else went wrong (and, frequently, exposed data that wasn't encrypted in the first place).

      especially the kind that exposes useful information about the plaintext as this one does.

      Homomorphic encryption does not expose useful information about the plaintext, although the article doesn't make that clear. You start with an encrypted input, perform an operation, and get an encrypted output. Only the person with the key--who is not the person performing the computation--can decrypt the result.

      There is a somewhat-related but distinct concept, called "functional encryption", in which one can distribute a key associated with a function f. That key allows a user to take an encryption of x and obtain f(x)--but nothing else about x other than f(x), where "nothing else" has a mathematical formalization. So you could (conceptually) encrypt your entire medical record and give your doctor a key for the function that calculates the probability that you'll have a heart attack in the next five years. Then they'll be able to calculate that probability, but nothing else about you.

      A much simpler alternative is to keep your genetic information in your own control, processing it on your own computer with open source software. You know, just what we already do with other sensitive information like passwords.

      This I agree with, in an ideal world. Will we be living in such a world, 5, 10, or 20 years down the line? I don't know. Right now, the trends are largely in outsourcing everything--more and more, your data and computation live on the cloud. For medical information, your doctor doesn't do all the tests himself--he outsources them to a lab. For genetic information, 23andMe doesn't sell software that lets you analyze your own genetic markers--they take your information and perform the analysis on it themselves. So these trends will need to change before the above takes place.

      It would be great to keep one's own data and get all the various analysis tools via FOSS. But someone needs to write and distribute those tools--as well as make it feasible to obtain one's own data in the first place (I don't know about you, but I don't have an MRI machine in my house). So until that world exists, homomorphic encryption is a potentially useful tool in this area.

      [It also has uses beyond securely outsourcing computation, but that's somewhat off-topic.]

    • Re:Keep it (Score:4, Interesting)

      by Immerman (2627577) on Monday February 17, 2014 @09:21PM (#46272377)

      Right, because I have the knowledge and equipment to sequence my own DNA and make sense of the results.

      Sure, encryption can be broken, and I don't know how far I'd trust IBM's 1st-generation homomorphic encryption, much less this "streamlined, high performance" version adapted by medical researchers, but it's a hell of a lot better than nothing.

      Also, while I'm not an encryption expert, it sounds like homomorphic encryption doesn't actually expose useful information (at least not intentionally, I'm sure it opens up some new attack vectors, everything does). Encrypt A to get B. Apply operations f(B) to get C, decrypt C to get f(A). C is still encrypted gibberish.

      So, assuming it's possible to do public/private key homomorphic encryption, my doctor could send a sample for sequencing along with a public key. DNA gets sequenced and encrypted (ideally both on the same non-networked hardware so that the plaintext data is never accessible to anyone), and the encrypted sequence is sent back to my doctor, archived in a public database, whatever. Doc can then send it to a third-party DNA analysis firm in Nigeria, who perform all manner of analysis on it and send the reams of gibberish test results back. He then calls me in, the only holder of the private key, and I can then decrypt the results on my secure, open-source computer and present them for his interpretation and advice.
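The encrypt, compute, decrypt flow described above can be sketched with textbook (unpadded) RSA, which happens to be homomorphic for multiplication only. This is a swapped-in toy, not the lattice-based scheme the article discusses: the primes are tiny, the scheme is insecure without padding, and the role names in the comments (patient, sequencer, lab) are illustrative assumptions.

```python
# Toy textbook RSA: insecure, for illustration of the data flow only.
p, q = 61, 53
n = p * q                      # public modulus
phi = (p - 1) * (q - 1)
e = 17                         # public exponent (shared with the sequencer)
d = pow(e, -1, phi)            # private exponent (modular inverse, Python 3.8+)

def encrypt(m):
    """Anyone holding the public key (e, n) can do this."""
    return pow(m, e, n)

def decrypt(c):
    """Only the private-key holder (the patient) can do this."""
    return pow(c, d, n)

# Sequencer side: encrypt two values with the patient's public key.
a, b = 7, 12
enc_a, enc_b = encrypt(a), encrypt(b)

# Lab side: multiply ciphertexts without ever seeing a, b, or the product.
enc_product = (enc_a * enc_b) % n

# Patient side: decrypt the encrypted result.
result = decrypt(enc_product)
print(result)   # 84, i.e. a * b
```

The lab only ever handles `enc_a`, `enc_b` and `enc_product`. A scheme usable for real genetic tests would need to support both addition and multiplication, which is what fully homomorphic schemes provide at a much higher computational cost.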

  • by LaminatorX (410794) <sabotageNO@SPAMpraecantator.com> on Monday February 17, 2014 @06:18PM (#46270953) Homepage

    I'm trying to say something intelligent involving homomorphic encryption with random seeds and salt that doesn't trigger the Beavis & Butthead reflex, but I just can't make it happen.

  • by Idou (572394) on Monday February 17, 2014 @06:19PM (#46270957) Journal
    If I were not constantly releasing millions of copies of my DNA in the form of dead skin cells everywhere I go. Either my cells need to also adopt this encryption standard, or I need a lifestyle where I am completely self sufficient (including my waste disposal), never having to leave my home.

    Even then, a gust of wind while I am in the backyard might be all that is required one day for someone's reader to catch my DNA and run a simulation to match with facial recognition.
    • Don't give the NSA and FBI ideas.

      A few years back, the Supreme Court ruled they couldn't use IR scanners without a warrant on buildings. Although it was "broadcast" out to common areas, you historically had the expectation of privacy. Yay originalism and intent of Founding Fathers.

      That attitude is dead as a doornail now (not that it wasn't always DOA in government reaching -- hence that case) but now it's even more of a struggle with Congress and the President acquiescing to all kinds of metadata stuff.

      I st

    • by Immerman (2627577)

      I think the point is less to protect your personal DNA sequence, and more to protect the anonymity of databases/sequencing labs/doctors offices/etc. that are otherwise carrying around massive blinking "hack me" signs.

      • by Idou (572394)
        Right, but why go through the trouble and risk of hacking someone else's database when it will soon be cheap enough to sequence directly yourself?

        Obtain a used air filter of a building, and you may have the DNA of anyone who has been in that building for the last couple of days . . . legally.
  • by sckienle (588934) on Monday February 17, 2014 @06:21PM (#46270971)
    I am not a cryptography expert, but I have been supporting genomic medicine for 10 years. For Homomorphic encryption to be of any use in research, or diagnostics, it is necessary to know that each genetic sequence is encrypted to the same results. That is XYZ for person 1 has to be the same genetic sequence as XYZ for person 2. Otherwise we are comparing apples to wood and the results are gibberish. So if XYZ is XYZ is XYZ, how is that any more secure, from a genetic profiling, etc. POV than the raw genetic sequence? It's like saying your SSN is safe, no one will know it is 123-45-6789, we "secured" it as abc-de-fghi but otherwise is just as unique in identifying you. Am I missing something here?
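On the question of whether the same sequence must always encrypt to the same ciphertext: with a randomized homomorphic scheme it does not have to. The sketch below uses a toy Paillier cryptosystem (additively homomorphic, and a stand-in for the lattice-based schemes the article mentions) with tiny, insecure primes. Encoding a genotype as a small integer such as 0, 1 or 2 is an assumption made here purely for illustration.

```python
import math
import random

# Toy Paillier cryptosystem with tiny primes: insecure, illustration only.
p, q = 293, 433
n = p * q
n2 = n * n
g = n + 1
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p-1, q-1)
mu = pow(lam, -1, n)                                # modular inverse (Python 3.8+)

def encrypt(m, r=None):
    """Encrypt integer m (0 <= m < n). r is fresh randomness per ciphertext."""
    if r is None:
        r = random.randrange(1, n)
        while math.gcd(r, n) != 1:
            r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    """Requires the private values lam and mu."""
    return ((pow(c, lam, n2) - 1) // n * mu) % n

def subtract(c1, c2):
    """Ciphertext of (m1 - m2) mod n, computed without the private key."""
    return (c1 * pow(c2, -1, n2)) % n2

# Two people with the same (hypothetical) genotype value, different randomness.
person1 = encrypt(2, r=17)
person2 = encrypt(2, r=23)

print(person1 == person2)                     # False: ciphertexts differ
print(decrypt(subtract(person1, person2)))    # 0: the underlying values match
```

Whoever runs `subtract` learns nothing about either value, and only the key holder finds out that the difference is zero. That is the sense in which the ciphertexts need not be comparable as raw bytes for an analysis to work.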
    • by Kjella (173770)

      Near as I can tell, it's simply a way to outsource number crunching. Like for example in a paternity suit, you can encrypt the DNA of the people in question, hand it over to a cloud provider who'll give you a paternity index score but can't recover the actual DNA sequences involved. Okay, not best example. Say you have a huge number of samples like a genetic archive. You want to find "The people with genes XYZ, what other genetic differences do they have from the general population?", so you hand a cloud pr

    • by Immerman (2627577)

      To rephrase AC: using homomorphic encryption you:
      Encrypt A to A*
      Perform analysis on A* to get B* (the gibberish encrypted results)
      Decrypt B* to get B.

      So basically you, as some lab doing the analysis, have *no* idea what the incoming DNA is, nor what the results of your analysis are. All you need know is how to perform the analysis if they *weren't* encrypted. You can then send the encrypted results back to the doctor who sent you the encrypted DNA, and *she* (or the patient in question) can decrypt them to

  • I studied bioinformatics, but I've never understood this illusion of a bunch of goofball scientists toiling away in lab coats somewhere. Modern personal computers are more than capable of doing whatever analysis an individual user might want done. You want expert analysis of your results? Ask a doctor, who is already legally required to keep everything confidential.
