
This Impenetrable Program Is Transforming How Courts Treat DNA Evidence (wired.com)

mirandakatz writes: Probabilistic genotyping is a type of DNA testing that's becoming increasingly popular in courtrooms: It uses complex mathematical formulas to examine the statistical likelihood that a certain genotype comes from one individual over another, and it can work with the subtlest traces of DNA. At Backchannel, Jessica Pishko looks at one company that's caught criminal justice advocates' attention: Cybergenetics, which sells a probabilistic genotyping program called TrueAllele -- and which refuses to reveal its source code. As Pishko notes, some legal experts argue that revealing TrueAllele's source code 'is necessary in order to properly evaluate the technology. In fact, they say, justice from an unknown algorithm is no justice at all.'

Comments Filter:
  • about the code!

    • by laurencetux ( 841046 ) on Wednesday November 29, 2017 @12:48PM (#55644459)

      You can't handle the truth [insert full quote here]

      but anyway, this should not be admissible in court until the source code (with the needed toolchain) has been vetted by the Oldest and Crankiest Qualified persons they can find.

      (does this have a bias as to which "race" it comes up with?? will it pop certain trait markers more often??)

      • Whether it does or not doesn't matter as long as it isn't reviewed, because it will certainly be challenged on those grounds, or at the very least you'll see the relevant pressure groups cry foul, whether the bias is real or not.

        This alone means that this MUST be reviewed before even thinking about considering it as admissible evidence.

      • You don't need the source code to determine whether it's any good. We don't have the source code of the universe, but we have been able to make steady scientific progress regardless.

        Having access to the source code would no doubt make it much easier to verify, but if they refuse, you can still run a series of studies aimed at testing whether or not the program actually does what it claims. If you have some known DNA samples, you can have the program analyze them in order to produce estimates and see how close they come.
  • by sinij ( 911942 ) on Wednesday November 29, 2017 @12:00PM (#55644127)
    I think it is very reasonable to ask for access to the source code, covered by an NDA, when such code is used to produce results for criminal prosecution -- unless they can show independent third-party validation of their tool.

    We have seen issues with red light cameras, we have seen issues with labs doing drug testing on hair, we have seen child abuse panics from psychology "experts". Both methods and experts have to be open for independent, impartial validation. Otherwise they are no better than a duck test.
    • Otherwise they are no better than a duck test.

      I say we go back to the duck test! I keep hearing protests of "this is a witch hunt," and yet we don't even test to see if they float like a duck! How are we supposed to know if we've found a witch if not for the duck test? ;)

    • by grasshoppa ( 657393 ) <skennedy@tpno - c o . o rg> on Wednesday November 29, 2017 @01:11PM (#55644657) Homepage

      I don't think even NDA access is appropriate. How versed are you in probabilistic medical programming? How many people would you say are?

      Hell, even the most experienced developer will need some time to acclimatize to any sufficiently complex codebase, and that's before you throw in the specialties on top of it. It's beyond unreasonable to give any "expert" limited access (both physical and temporal) to the codebase and then expect them to give expert testimony in court on it.

      The only way things like this get properly vetted is via "many eyes", and even that's no guarantee.

      Speaking of experts: let's pretend you're some poor schmuck (literally). Having a tool like this used against you in a case where you can't afford an expert witness (and it would be beyond pricey, I'd expect; I know I'd charge a shitload) only guarantees a compromised defense.

      No, I can't see any reason why the code shouldn't be publicly available if the tool will be used to help convict people.

    • by jbengt ( 874751 )

      I think it is very reasonable to ask access, covered by NDA, to a source code when such code is used to produce results for criminal prosecution.

      No. That is entirely unreasonable.
      If it's being used as evidence in a public court, it is only reasonable to provide public access to the source code, and specifically to provide access to the methods & computations that produced a particular result in a particular case.
      We are not supposed to have secret courts in the US (FISA notwithstanding).

  • "evidence" (Score:5, Insightful)

    by paai ( 162289 ) on Wednesday November 29, 2017 @12:04PM (#55644151) Homepage

    As Terry Pratchett wrote somewhere: "Evidence means 'that which is seen'". Nuff said.

    Paaia

  • by martyros ( 588782 ) on Wednesday November 29, 2017 @12:05PM (#55644157)

    A lot of expert witness testimony comes down to a judgement call -- "In your opinion, as someone who has been working in this field for 20 years, how confident are you that these signatures / bullet marks / fingerprints / DNA match?" That's the result of an algorithm that you can't examine either, and has at least as much opportunity for being corrupted by unconscious prejudice or outright bribery as a piece of software.

    • by Baron_Yam ( 643147 ) on Wednesday November 29, 2017 @12:11PM (#55644205)

      You have the right to face your accuser, which includes examining the evidence against you. This is secret evidence. It amounts to "because we say so", and should not be tolerated.

      A software bug you're not permitted to look for could send you to jail. At least with a human expert witness you can cross-examine them.

    • Expert judgement can be countered by other experts. Here we are being presented with something as a "Fact". There is no way to dispute it and there is no way to verify it which is what people are having a problem with.

      • by sinij ( 911942 ) on Wednesday November 29, 2017 @12:19PM (#55644279)

        Expert judgement can be countered by other experts. Here we are being presented with something as a "Fact". There is no way to dispute it and there is no way to verify it which is what people are having a problem with.

        Questioning an expert's qualifications is fair game in trials. If you can demonstrate that the expert is not impartial, you can largely mitigate their testimony.

        How do you question an algorithm like "if (1) verdict = Guilty;" other than by code review?

      • by martyros ( 588782 ) on Wednesday November 29, 2017 @12:25PM (#55644317)

        Well it shouldn't be accepted as fact. Ideally the courts would instruct the jury to treat the software's output as similar to a human being saying, "This is my expert opinion." You can submit your own software's "opinion" as evidence as much as you can get your own expert human to testify on your behalf.

        It is true that you can't cross-examine it; but ideally, that should make the software count as less credible. If you had an expert who, upon cross-examination, always responded, "I don't know, it just seems that way," then he wouldn't have much credibility. Ideally, software that can't justify its "opinion" should be treated the same way.

        I have said "ideally" here several times, recognizing that it may well be the case that this isn't how people actually think. But I think a more constructive response to this misplaced trust is to help inform courts and defense lawyers more clearly (who should in turn inform the juries).

        • by Dragonslicer ( 991472 ) on Wednesday November 29, 2017 @01:14PM (#55644683)

          Well it shouldn't be accepted as fact. Ideally the courts would instruct the jury to treat the software's output as similar to a human being saying, "This is my expert opinion." You can submit your own software's "opinion" as evidence as much as you can get your own expert human to testify on your behalf.

          One of the requirements for presenting expert testimony is that you have to provide all of the materials that the expert used in forming their opinion. If the results of some software were treated as an expert opinion, the "materials relied upon" would almost certainly include the source code. It may even make the programmers, as the source of those materials, subject to being deposed about how they developed the software.

        • US Constitution, sixth amendment: "In all criminal prosecutions, the accused shall enjoy the right...to be confronted with the witnesses against him; to have compulsory process for obtaining witnesses in his favor....". It seems to me that a device that announces something should have some humans, i.e. witnesses, testifying in its favor, but the courts may not agree.

      • Smart systems should be able to print a trace of their decision-making. If the code is not accessible, the particular chain of reasoning relevant to your case should at least be scrutinizable this way.
    • by Thruen ( 753567 ) on Wednesday November 29, 2017 @12:25PM (#55644321)
      You get to ask an expert witness why their opinion is what it is, and if they answer "I'm not telling," their credibility is shot and there's a good chance their testimony will be thrown out. This software is an expert witness that nobody has any reason to believe: it gives testimony damning a person, refuses to explain why, and yet keeps its credibility. Analyzing whatever algorithm the software uses would be like questioning the witness, which is your right as a defendant in the USA, and keeping it hidden is literally denying you that right.
    • What you're describing is exactly why experts frequently have their credibility challenged and why they need to provide the means by which to verify their credentials. The problem here is that they're providing no means by which to establish or confirm the credibility of the algorithm, and they know that doing so doesn't harm them as it would with an expert witness.

      Imagine if the prosecution put an "expert" on the stand who testified how the prosecution wanted, but when the defense attorney asked where the

      • You brought up a good point.
        So the counter would be to write a program that accepted the same physical evidence data and simply returned whatever answer the defense wants.

    • It is different in that you can challenge an expert witness with your own witness. How can you challenge an algorithm that no one really knows? Considering that the FBI has used flawed statistics in DNA matching for a decade [washingtonpost.com], this is not the first time that there are issues with how forensic science is done.
      • Hell, PBS Frontline did a special about the horrors of modern "forensics", titled 'The Real CSI'

        It's an eye-opener.

  • "justice from an unknown algorithm is no justice at all"

    A successful conviction may be legitimately tipped by accurate checked evidence, in this case DNA ...

    But justice is not a matter of technical facticity; injustice is withholding from a party something they deserve.

    The evidence may help identify discrepancies between the two, but it is a major conflation to substitute that with justice.
    • by Falos ( 2905315 )

      "Justice" is probably in regard to dispute of "the courts find this blackbox satisfactory"
      as in "the courts find this box sufficient means for achieving their goals"
      as in "the courts find this accomplishes their job"
      as in "the courts find this accomplishes justice"

  • by b0s0z0ku ( 752509 ) on Wednesday November 29, 2017 @12:11PM (#55644201)

    Jurors and judges need to know what the probabilities are. Remember, in a criminal trial, the standard for evidence is "beyond a reasonable doubt." Sending people to prison for life or even to death row based on flimsy evidence is unacceptable.

    This isn't to say that it hasn't happened before -- Cameron Todd Willingham was executed in Texas on the testimony of an "arson expert" with no formal training in the field.

    The code should be evaluated or the tool should be banned from court. The company doesn't like it? Too bad. They don't have to sell to the forensic lab/law enforcement market.

    • The code should be evaluated or the tool should be banned from court. The company doesn't like it? Too bad. They don't have to sell to the forensic lab/law enforcement market.

      Arguably, the program can be evaluated without the source code. Simply use known samples and examine the output. Do the results of the analysis match what was known about the samples?

      This testing would have to be performed by a neutral third party of course.
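      A sketch of what such neutral black-box validation could look like. Everything here is hypothetical: run_genotyper is a made-up stand-in for the closed-source tool (a real audit would invoke the vendor's program), and the toy digit-per-locus profiles exist only to make the sketch runnable.

```python
import random

def run_genotyper(evidence, reference):
    # Hypothetical stand-in for the closed-source tool: scores the
    # fraction of loci at which the two profiles agree.
    return sum(a == b for a, b in zip(evidence, reference)) / len(evidence)

def make_profile(rng, n_loci=20):
    # Toy DNA profile: one digit per locus.
    return [rng.randint(0, 9) for _ in range(n_loci)]

def blind_trial(n_trials=1000, seed=42):
    """Feed the tool known same-source and different-source pairs and count
    how often it ranks the true source above an unrelated stranger."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_trials):
        true_source = make_profile(rng)
        stranger = make_profile(rng)
        # Evidence sample: the true source's profile with ~20% of loci
        # garbled, simulating a degraded crime-scene sample.
        evidence = [rng.randint(0, 9) if rng.random() < 0.2 else x
                    for x in true_source]
        if run_genotyper(evidence, true_source) > run_genotyper(evidence, stranger):
            correct += 1
    return correct / n_trials

print(f"true source ranked first in {blind_trial():.1%} of trials")
```

      The point of the sketch is the protocol, not the toy scorer: a neutral lab prepares pairs whose ground truth it knows, and only then asks the tool to rank them.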

      • by b0s0z0ku ( 752509 ) on Wednesday November 29, 2017 @12:23PM (#55644307)
        Disagree. Bad code can cause normal behavior 99% of the time, abnormal 1% of the time. See also, THERAC-25.
        • I agree with Thelasko: We don't need to see the source code, we just need to see the results. Evaluate it as a black box -- feed it known samples and see if it produces the correct results. If so, it's reasonable to admit the machine's evaluation.

          What if there's some flaw that affects a small fraction of cases? It's possible, but it's also possible that code inspection wouldn't find the problem either. There's always another test, another set of eyes, another *something* that could be done. At some point

          • by sjames ( 1099 )

            That's how it's SUPPOSED to be, but in practice you won't find a cop or anyone else testifying about the internal state of the breathalyzer or why a sample might cause it to read 0.035. It'll just be 0.02 is the legal limit, he came up 0.035, case closed!

          • by jbengt ( 874751 )

            I agree with Thelasko: We don't need to see the source code, we just need to see the results. Evaluate it as a black box -- feed it known samples and see if it produces the correct results.

            Apparently part of the output is the probabilities that a particular sample is from the suspect or from someone else. How can you feed the program "known samples" in a way that evaluates whether it produced the correct probabilities for the particular samples used in the case? (One sample was reported as having "a one in 211 quintillion chance that it originated from someone else".)

            • How can you feed the program "known samples" that can evaluate that it produced correct results of the probabilities for the particular samples used in the case? (One sample was reported as having "a one in 211 quintillion chance that it originated from someone else".)

              I am not an expert in the field. However, I propose the following tests:

              • Two samples from the same person
              • Samples from siblings, who should share some DNA
              • Samples from completely unrelated people, verified by ancestry records
              • Samples from the same person, but with one sample damaged (contaminated, too small, etc.)

              These tests should be repeated many times to form a statistical profile, and then compared to the output of the software.
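              A sketch of how that protocol might be harnessed. All names here are hypothetical: run_genotyper stands in for the tool under audit, and the toy profile, sibling, and damage models exist only so the sketch runs end to end.

```python
import random
import statistics

rng = random.Random(0)
N_LOCI = 20

def run_genotyper(a, b):
    # Hypothetical stand-in for the tool under audit: similarity score
    # as the fraction of matching loci.
    return sum(x == y for x, y in zip(a, b)) / N_LOCI

def person():
    # Toy DNA profile: one digit per locus.
    return [rng.randint(0, 9) for _ in range(N_LOCI)]

def sibling_of(p):
    # Toy inheritance: a sibling shares each locus with probability 1/2.
    return [x if rng.random() < 0.5 else rng.randint(0, 9) for x in p]

def degraded(p, damage=0.3):
    # Simulate a contaminated or partial sample by garbling some loci.
    return [rng.randint(0, 9) if rng.random() < damage else x for x in p]

def mean_score(make_pair, trials=500):
    # Repeat each test category many times to build a statistical profile.
    return statistics.mean(run_genotyper(*make_pair()) for _ in range(trials))

cases = {
    "same person":   lambda: (p := person(), p),
    "siblings":      lambda: (p := person(), sibling_of(p)),
    "unrelated":     lambda: (person(), person()),
    "same, damaged": lambda: (p := person(), degraded(p)),
}

for name, make in cases.items():
    print(f"{name:13s} mean similarity = {mean_score(make):.2f}")
```

              The repeated trials per category are what turn individual runs into the "statistical profile" the parent describes, which can then be compared against the software's claimed probabilities.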

        • by Falos ( 2905315 )

          Leveson notes that a lesson to be drawn from the [THERAC-25] incident is to not assume that reused software is safe: "A naive assumption is often made that reusing software or using commercial off-the-shelf software will increase safety because the software will have been exercised extensively. Reusing software modules does not guarantee safety in the new system to which they are transferred..."

          I'm sure Thelasko meant well; we hold faith in science driven by observation. But we're not speculating about string theory, we're declaring that someone did, by this court's finding, commit X and deserves Y. That declaration is different from musing "we consistently observe Z, so protons would be Q" in a report, which doesn't actually assert Q as fact.

      • by Dog-Cow ( 21281 )

        The program is useless if you have to predetermine the correct answer for every possible input, and the evaluation is useless if you do anything less.

        • The program is useless if you have to predetermine the correct answer for every possible input, and the evaluation is useless if you do anything less.

          Bullcrap. A pocket calculator has a predetermined correct answer for every possible input. That does not make it "useless".

      • Arguably, the program can be evaluated without the source code.

        At considerable expense. But even then it still is a problem because you would have to do it for every single case. Otherwise you have no way to know if something is different or wrong with the analysis in a case where no verification was conducted.

        Simply use known samples and examine the output. Do the results of the analysis match what was known about the samples?

        You're talking about using controls and/or independent testing methods. Not really good enough, because if there is a discrepancy you run into Segal's Law [wikipedia.org] (a man with a watch knows the time; a man with two is never sure). You have no way to know which test to trust.

      • by sjames ( 1099 )

        Without the source, it would be hard to assure complete coverage in the sample data; the test would have to be exhaustive.

        That's fine if the company wants to go that way, but of course, exhaustive testing will cost plenty more and make it far less likely to ever be funded. Until that testing happens, the whole technique should have the same legal standing as the magic 8 ball.

      • The code should be evaluated or the tool should be banned from court. The company doesn't like it? Too bad. They don't have to sell to the forensic lab/law enforcement market.

        Arguably, the program can be evaluated without the source code. Simply use known samples and examine the output. Do the results of the analysis match what was known about the samples?

        This testing would have to be performed by a neutral third party of course.

        Oh, like Volkswagen's Dieselgate?

    • >Jurors and judges need to know what the probabilities are.

      And they need to know what the probabilities are pooled from. One of the early problems with DNA suspect testing was that early DNA databases were collected predominantly from FBI agents.

      Who happened to be mostly white.

      Which means their gene-frequency probabilities were different from those of blacks. Which meant that the probability of a false positive DNA match on a black suspect with a DNA sample from another black person was a couple of
    • Even if the "expert" had received training, it would not have made any difference, because no one had ever done a scientific evaluation of the way a house burns. All the opinions of arson experts were bullshit, full of confirmation bias.

  • by CustomSolvers2 ( 4118921 ) on Wednesday November 29, 2017 @12:12PM (#55644217) Homepage
    Having access to the source code is one thing; properly analysing it is a completely different story. When dealing with something as complex as (probabilistic!) DNA sequencing, it seems quite clear that the most sensible way to validate the program is actually to use it. Set up a proper benchmark with a relevant number of samples and confirm whether this (+ any other) program works exactly as expected. This would also be an excellent way to objectively assess its accuracy.
    • by sabri ( 584428 )

      Having access to the source code is one thing; properly analysing it is a completely different story. When dealing with something as complex as (probabilistic!) DNA sequencing, it seems quite clear that the most sensible way to validate the program is actually to use it. Set up a proper benchmark with a relevant number of samples and confirm whether this (+ any other) program works exactly as expected. This would also be an excellent way to objectively assess its accuracy.

      Exactly this. My kingdom for modpoints.

      You don't test software by looking at the code. You test the software by testing it. If it ain't broken, you're not testing hard enough.

      While I'm very pro-OSS, I'm anti forcing private companies to disclose their source code. It is their work, their intellectual property. It's up to the judge to admit the closed-source evidence and up to the jury to weigh it.

      • I'm anti forcing private companies to disclose their source code

        In some cases, seeing the source code might be required, but under the most likely conditions this is a pretty useless formality: very tough work that is unlikely to yield worthier conclusions than testing would.

      • You don't test software by looking at the code. You test the software by testing it. If it ain't broken, you're not testing hard enough.

        Doing a black-box analysis of software when the code should be available for review by a defendant is so wrong-headed I barely know where to start. There is NO place for secret code when it comes to convicting people of crimes. The defendant should be able to question any and all methods being used to accuse them of a crime.

        While I'm very pro-OSS, I'm anti forcing private companies to disclose their source code.

        Tell me that when you are facing a life sentence and you aren't allowed to examine the code being used to send you to jail. If we're talking about a word processor, who cares; but when

      • I'm anti forcing private companies to disclose their source code.

        They don't have to disclose their source code. They can choose instead to have it not be usable in court.

        Freedom of choice does not mean freedom from consequences.

        • by stdarg ( 456557 )

          Freedom of choice does not mean freedom from consequences.

          That is what it means actually. Well, more specifically it's about being free from consequences that you don't wish to be subjected to. If you don't have freedom from consequences that you don't want then it's meaningless because you're really talking about free will, not societal freedom. "If you do drugs, we'll throw you in jail as a consequence! We support your freedom of choice in doing drugs!" That isn't useful.

          • If my choices have no consequences, why bother? If my choices can have consequences I like, then they can have consequences I don't like, if only by comparison. This applies when discussing free will or societal freedom. Freedom from consequences I don't want is perforce ineffectuality.

      • You don't test software by looking at the code. You test the software by testing it. If it ain't broken, you're not testing hard enough.

        But you use the code to find interesting boundary cases that need additional scrutiny in testing!
        To properly test software *requires* access to source. Otherwise all you're doing is poking it with a stick to find vulnerabilities.

    • Set up a proper benchmark with a relevant number of samples and confirm whether this (+ any other) program works exactly as expected.

      So the article claims a false positive rate of 1 in 211 quintillion for a particular trial. To test that with a 95% confidence interval we would need at least 600 quintillion samples. Now, we're a bit short on people on this planet. I don't think Earth could support that many, so we'd need to colonize other planets. To make things simple, let's assume the average planet
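      For what it's worth, the 600-quintillion figure follows from the statistical "rule of three": if you run n trials and observe zero failures, the 95% upper confidence bound on the failure rate is roughly 3/n, so bounding a rate p requires about 3/p trials. A quick check of the arithmetic:

```python
# Rule of three: zero failures in n trials bounds the true failure rate
# below ~3/n at 95% confidence, so bounding a rate p needs n ~ 3/p trials.
claimed_rate = 1 / 211e18      # "one in 211 quintillion"
n_needed = 3 / claimed_rate    # trials needed to support the claim

print(f"{n_needed:.2e} known-sample tests needed")  # ~6.33e+20, i.e. over 600 quintillion
```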

      • So the article claims a false positive rate of 1 in 211 quintillion for a particular trial.

        I didn't read the article, but that or any other issue doesn't change anything. If you aren't able to define accurate enough conditions to validate the corresponding piece of software, you would fail to do so anyway. Testing is much more likely to be quicker and more efficient than the alternative approach of analysing the code. Or do you think that by having access to the code you could guess what the output would be under such extreme conditions? If so, what would have been the point of having a piece of software?

        • Actually, GP is correct if we're resorting to empirical testing. We would want about six hundred quintillion samples to test against to verify that. To say that the chance is one in 211 quintillion rather than one in 211 quadrillion, which is three orders of magnitude difference, we'd have to have enough testing to show that the error rate was less than one in 211 quadrillion, which means that we'd have to have enough samples so that the failures were significantly less than one in 211 quadrillion. That

          • Actually, GP is correct if we're resorting to empirical testing.

            Not even in that scenario. Even if you carried out those 211 quintillion tests, it wouldn't represent a reliable validation of the claim "1 in 211 quintillion", because just one empirical confirmation isn't statistically significant (and this is, from the point of view of that claim, what performing the whole 211-quintillion-sample run once would mean). If you want to go down such a ridiculous, unnecessarily over-working path and you want to do it properly, you would have to rely on a much better methodology

          • That one we might manage to verify by testing samples from a mere half billion people against each of the other half billion. We leave the problem of getting that much blood out of each test subject as an exercise for the reader.

            Good idea. If we assume these are independent trials then it's much more feasible :) We can even do more than two people. An experiment could be you got the perps DNA and a mix of 5 other samples. Now can you detect whether or not the perp is in the mix. Also I'm not worried ab
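              The counting is easy to check: two half-billion groups tested against each other give 2.5e17 cross-comparisons, on the order needed to probe a one-in-211-quadrillion figure, though still well short of the quintillion-scale claim. A quick sketch:

```python
from math import comb

half = 500_000_000
cross_pairs = half * half             # each of one group against the other
all_pairs = comb(1_000_000_000, 2)    # every distinct pair among a billion people

print(cross_pairs)   # 250000000000000000 (2.5e17, ~250 quadrillion)
print(all_pairs)     # 499999999500000000 (~5e17)
```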

        • Sorry I couldn't help myself. I figured you didn't read the article, and the ridiculous claims TrueAllele made. Human error for DNA testing has been measured to be around 1 in 200, so these tiny probabilities are just dangerous theatrics. Still it's an interesting challenge to estimate extreme probability values. I was half hoping you'd shut me up with some nice technical way around the problem...

          As for empirical testing, it makes sense as part of a larger system of evaluation. Looks like they have

          • I was half hoping you'd shut me up with some nice technical way around the problem...

            Impressive 180-degree attitude change! Well, as I answered to another commenter just now, I am personally a fan of approaches along the lines of multiple attempts + averaging the results for proper empirical validation. For example, as a way to confirm/dismiss/improve that much more realistic 1-in-200 estimate, I would go with 10 sets of tests, each running up to either 200 attempts or the second error. So, if in the first set you get the second error at the 150th attempt, you stop there; if in the second set you reach 200 without a second

  • by Anonymous Coward

    I've done some genomic work, during the Human Genome Project. I had to step away from the work due to my concerns about the lack of quality. The software for analyzing the data, which assembled longer genomic fragments for testing and verification, was so very, very poor that all the scientists learned to ignore the analysis and order longer sequences manually, eyeballing them with their personal experience. It was hideously expensive to do this constantly, especially with the amount of sequences to sample and te

  • Code for "facts" used in the courtroom hidden? Oh, you mean like how voting machine software and hardware design is often not available to the public for examination. All of it, anything on which democracy is contingent, needs to be published. No ifs, ands, or buts. Probably also applies to the code used in killer bots. The populace will need to know how a kill decision is made.
    • The voting tabulators my state uses can get away with not being available for public examination, because we check them against hand counts.

      If killer bots are used in warfare, the public doesn't need the details. If they're used in police work, it darn well does.

  • Breathalyzers are effectively closed source under trade secret protections and we've convicted lots of people with those.
  • by FeelGood314 ( 2516288 ) on Wednesday November 29, 2017 @01:01PM (#55644571)
    DNA probabilistic methods like this can do three things, but can only be used to do one of them at a time. They can eliminate an accused; they can eliminate all but one person from a predetermined sample of people to find the guilty person; or they can give the police a potential list of suspects. They CANNOT be used to do both of the last two. If I have a small partial DNA sample, there will be multiple people in the world that it will match. If the police then just round up the first person they find who matches and say "oh, the probability of a match this close is one in 300 million" -- well, no: if there were 300 million permutations and you looked in a population of 300 million people, I would expect you to find a match (well, at least with probability about 1 - 1/e).
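    The "1 - 1/e" aside is the standard result for database trawling: if each of N people independently matches a partial profile with probability 1/N, the chance that scanning all N turns up at least one match approaches 1 - 1/e. A sketch, assuming independent per-person match probabilities:

```python
import math

N = 300_000_000  # people scanned; assume each matches by chance with prob 1/N

# Probability that at least one person in the database matches by chance:
p_at_least_one = 1 - (1 - 1 / N) ** N

print(round(p_at_least_one, 4))   # ~0.6321
print(round(1 - 1 / math.e, 4))   # limiting value 1 - 1/e, also ~0.6321
```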
    • If the false positive rate is 5%, it is still a useful investigative and prosecutorial tool, even though in any moderately sized city you are going to get a LOT of false positives on DNA alone.

      For example, if you can place the suspect in the vicinity of the crime and establish a motive, then along with DNA evidence that says >=95%, that's a solid conviction.

      However, if you are trawling 23andme.com or Ancestry.com databases for matches, and then just grabbing and prosecuting the first match, you'
      • by Cederic ( 9623 )

        I disagree. I'm fine with using DNA to eliminate a suspect ("This DNA can't possibly be that person") and with furnishing the police with a list of interesting people to further investigate.

        I don't believe though that DNA should ever be used to convict someone. There may be very few false positives but there will be some, and that's too many for it to be reliable in court.

        The whole forensic process is flawed, the DNA analysis is rarely uncontaminated, the subset of DNA markers is too small, the risks of fal

        • >I don't believe though that DNA should ever be used to convict someone.

          I think we are saying the same thing more or less, in different ways. DNA evidence should be used as a diagnostic funnel. But 95% certainty is usually sufficient in court to secure a conviction by a jury or a judge - whether or not you or I agree with those odds.
        • Absolute proof is not required for conviction; the standard is proof beyond a reasonable doubt. DNA can be used as part of the evidence for that. I personally would never vote to convict based on it alone, but then I'm the sort of guy who gets to be a peremptory strike from the jury box.

    • If I have a small partial DNA sample there will be multiple people in the world that it will match.

      No way. Does that mean there are multiple evil twins in the world I've never met?
