
## Visualizing False Positives In Broad Screening

AlejoHausner writes "To find one terrorist in 3000 people, using a screen that works 90% of the time, you'll end up detaining 300 people, one of whom might be your target. A BBC article asks for an effective way to communicate this clearly. 'Screening for HIV with 99.9% accuracy? Switch it around. Think also about screening the millions of non-HIV people and being wrong about one person in every 1,000.' The problem is important in any area where a less-than-perfect screen is used to detect a rare event in a population. As a recent NYTimes story notes, widespread screening for cancers (except for maybe colon cancer) does more harm than good. How can this counter-intuitive fact be communicated effectively to people unschooled in statistics?"
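
The arithmetic in the summary can be checked with a few lines of Python. This is a minimal sketch, assuming that "works 90% of the time" means both a 90% detection rate and a 10% false-alarm rate, which the summary does not actually state; the function name `screen_population` is made up for illustration.

```python
def screen_population(population, targets, sensitivity, false_positive_rate):
    """Return (expected true positives, expected false positives, P(target | flagged))."""
    innocents = population - targets
    true_pos = targets * sensitivity
    false_pos = innocents * false_positive_rate
    flagged = true_pos + false_pos
    return true_pos, false_pos, true_pos / flagged

# Assumed reading of "works 90% of the time": 90% sensitivity, 10% false-positive rate.
true_pos, false_pos, p_target_given_flag = screen_population(
    population=3000, targets=1, sensitivity=0.9, false_positive_rate=0.1
)
print(f"expected number flagged: {true_pos + false_pos:.1f}")
print(f"chance a flagged person is the target: {p_target_given_flag:.4f}")
```

Under these assumptions roughly 300 people are flagged, and any one of them has about a 0.3% chance of being the target.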
This discussion has been archived. No new comments can be posted.

• #### It's not that hard. You can smell the bleach. (Score:2)

You mean.. sometimes these broads really are blond?

• #### Simple (Score:5, Insightful)

by Anonymous Coward on Wednesday July 22, 2009 @08:35AM (#28780755)

How can this counter-intuitive fact be communicated effectively to people unschooled in statistics?

Hmm, teach them statistics?

• #### I can offer up a nice book on that (Score:5, Interesting)

on Wednesday July 22, 2009 @08:47AM (#28780907) Homepage Journal

I hate math, always did. I was good at it but just could not stand it. As such I skipped out on about anything math related beyond algebra (college level). Didn't impede my programming ability at all.

Still there are times where I like to learn how stuff works and honestly this series of books, Manga Guide to ......, has given me a quick leg up on a few subjects I would never have gained from traditional text books.

• #### Re: (Score:3, Insightful)

Yep, pretty basic fact in probability theory. A test for some condition must fail on fewer people (by an order of magnitude) than the number of people with that condition. Otherwise, you can pretty safely assume a positive is a false positive.
• #### Re: (Score:2, Funny)

The problem is motivating them to learn statistics.

I recommend accusing them of terrorism and sending them to a black site, where they'll be kept in a blank white cell with nothing but a statistics textbook. Don't release them until they can demonstrate that they are overwhelmingly likely to have been a false positive.

That'll learn 'em.
• #### Re: (Score:2)

Hmm, teach them statistics?

Easier yet, just say something like "Eighty percent of the time it works EVERY time." And then sex your audience up.

• #### Re: (Score:3, Insightful)

Much simpler... use a Venn diagram.

Let Circle A be the traveling public.
Let Circle B, intersecting circle A, be terrorists.
Let Circle C, within Circle A, but intersecting Circle B, be the set of those who the test identifies as terrorists.

Any person in Circle C but not in Circle B is a false positive.
Any person in Circle B but not in Circle C is a false negative.

Vary the location and size of Circle C to demonstrate tests of varying accuracy.

This works for terrorists, for cancer, for any test, really
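
The three circles map directly onto set operations. The sketch below uses invented toy data (travelers numbered 0 to 29, two terrorists, five people flagged) purely to show where false positives and false negatives fall; none of the numbers come from the thread.

```python
# Hypothetical toy data: travelers are numbered 0..29.
travelers = set(range(30))        # Circle A: the traveling public
terrorists = {3, 17}              # Circle B (kept inside A here for simplicity)
flagged = {3, 5, 8, 21, 26}       # Circle C: those the test identifies as terrorists

true_positives = flagged & terrorists    # in both C and B
false_positives = flagged - terrorists   # in C but not in B
false_negatives = terrorists - flagged   # in B but not in C

print("true positives: ", sorted(true_positives))
print("false positives:", sorted(false_positives))
print("false negatives:", sorted(false_negatives))
```

Changing the contents of `flagged` plays the role of moving and resizing Circle C.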
• #### Re: (Score:3, Insightful)

Let Circle C, within Circle A, but intersecting Circle B, be the set of those who the test identifies as terrorists.

Don't forget: it's possible that B and C don't overlap at all.

• #### Re: (Score:2)

Good point. I guess that's one of the ways we need to vary the size and location of C.
• #### Re: (Score:3, Insightful)

Hmm, teach them statistics?

Okay, so back to the main article's question - how do you teach them statistics?

I work with a lot of biologists and other people who don't have a clear understanding of probability theory, statistics, etc. and one thing that I've found works very well is to make very clear analogies to simple probabilistic systems that they can understand.

For example, going back to the 90% effective test, imagine that you have a wheel with an arrow on it which is the test. On this wheel, t

• #### Re: (Score:3, Interesting)

Well, that example sorta fails because there are no actual terrorists in it.

A more sane example might be to get a group of a few hundred people, in, say, an auditorium, give them an envelope with a sheet of instructions, and a 20-sided die.

The instructions tell them if they're a terrorist or not, and tells them to roll the die and, depending on what it lands on, go to a specific labeled area. I.e., everyone goes to an area, differently, depending on what they rolled. You need about 20 areas.

Terrorists, of

• #### Re: (Score:2)

Would knowing statistics help? The assumption here is that people weigh risks and probabilities to maximize their expected return. This is only true indirectly, if at all. What motivates people to action (such as paying for and submitting to airport screenings, or getting medical checkups) is fear. It is not about running numbers, it is about assuaging an uncomfortable emotion. That is why public service announcements show an egg representing your brain sizzling on a frying pan rather than running a s
• #### Re:Simple - the power of celebrity (Score:3, Insightful)

Too hard, you can't even teach most of 'em basic arithmetic - let alone something abstract.

The simplest way to get a message across to the "masses" is simply to have a celebrity deliver it. No explanations, no demonstrations. Simply a script that says: "you know me, I'm that nice, trustworthy person from <name of popular programme> so you know when I speak, I'm telling the truth ..."

People tend to trust individuals they know, they "know" the characters on TV - even though they are actors and prob

• #### Rare events. (Score:2)

The problem is important in any area where a less-than-perfect screen is used to detect a rare event in a population.

Such as "Who's a terrorist?"?

• #### Re: (Score:2)

Such as "Who's a terrorist?"?

That's the first question a terrorist would ask.

• #### Second opinion (Score:3, Informative)

on Wednesday July 22, 2009 @08:39AM (#28780791)

While it's true that there will be false positives, as well as false negatives, you don't convict someone, or have a lung removed, without further testing. When I was diagnosed for cancer, I was tested and re-tested to verify that there was, indeed, cancer. The same goes with screening for terrorists, or anything else. Did the article mention the rate for false negatives as well? After all, if you have a five pound tumor hanging off your face, and your doctor tells you there's nothing wrong, I'd definitely want a second opinion!

• #### Re:Second opinion (Score:5, Insightful)

on Wednesday July 22, 2009 @09:01AM (#28781113) Journal

There's a second opinion in the US when you're put on the no fly list? Or in the UK, when you're detained without charge for weeks (the Government wanted three months)?

The point is that idiots as described in the article think that a "90% scanner" means 90% probability they are guilty, and use it to urge action on such people without further checks. And even in a court of law, the point being made is still important: imagine the prosecution telling the jury that the fingerprint/DNA test is 99.99% accurate, therefore he must be guilty? In other words, these further checks are useless if they also fall on the same flawed statistics.

You're okay with your medical analogy, because most doctors have an understanding of basic statistics - unlike the police, politicians, and random members of a jury.

• #### Speech Recognition (Score:4, Insightful)

on Wednesday July 22, 2009 @08:41AM (#28780827)

That's easy, just tell them that the screenings work about as well as speech recognition. It's 95% accurate and everyone knows how much it sucks.

• #### Re:Speech Recognition (Score:4, Funny)

on Wednesday July 22, 2009 @09:50AM (#28781829)

That's easy, just tell them that the screenings work about as well as speech recognition. It's 95% accurate and everyone knows how much it sucks.

What R you toking about, Is peach recognitions the best since sly St.Bread?

• #### Not possible, at least for now (Score:2)

How can this counter-intuitive fact be communicated effectively to people unschooled in statistics?

You'll know when your people are ready for statistics. . . don't even bother trying until state-run lotteries go broke for lack of players.

• #### Re:Not possible, at least for now (Score:4, Insightful)

on Wednesday July 22, 2009 @09:54AM (#28781873) Homepage Journal

You'll know when your people are ready for statistics. . . don't even bother trying until state-run lotteries go broke for lack of players.

Er, not really. The usual cost-benefit, expected-payoff analysis doesn't really work when you're talking about extreme examples like winning the lottery, at least not with huge payoffs measured in tens or hundreds of millions of dollars. You can know, perfectly well, that the ROI on a lottery ticket is less than the cost of the ticket, and still consider it a perfectly rational investment.

If I buy \$150 worth of groceries and throw in a \$1 lottery ticket on top of it, the effective cost to me is zero. I'm never going to notice that dollar being gone. Not having that dollar is going to make no difference to my life. But in the (exceedingly unlikely, yes) event that I win a \$100 million jackpot, the payoff is damn near infinite. Having that kind of money can't really be compared to, say, getting a raise, or seeing your stocks go up in the market. It's just on a whole different scale.

So in short: infinity - (0 * 10^-9) = infinity. Don't assume that everyone who buys a lottery ticket is ignorant. Actually, I suspect most people who buy lottery tickets are making this kind of calculation, even if they're not doing the numbers quite as explicitly.

Here's an example in the opposite direction, which I think will make things a little more clear. Suppose I were to set up a "reverse lottery," which works as follows. You have, let's say, a net worth of \$100,000. If you sign up for my lottery, I pay you a dollar. Then you pick six numbers between 1 and 10, I draw six balls out of urns, and if the numbers match ... I take everything you own. Your house, your car, your computer, the clothes off your back. You're turned out on the street.

In probabilistic terms, it would make perfect sense for you to play. 1 - (100000 * 10^-6) = 0.9, which means that the game has a positive expected payoff. In fact, it would make sense for you to play a lot, up to whatever limit is allowed, let's say once a day. But would you do it? I kind of doubt you would, because every day, you'd be looking at that one-in-a-million chance of having your life shattered. Most people would consider that a bad risk, no matter what the raw numbers say. And people who play the lottery consider it a pretty good risk for the same reason.
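
The expected-value arithmetic for the "reverse lottery" can be written out as below. This is only a sketch of the numbers in the comment; the 30-year horizon in the second half is an added assumption, included to show how the daily one-in-a-million risk accumulates.

```python
from fractions import Fraction

payment = 1                      # dollars received for signing up
net_worth = 100_000              # dollars lost if all six numbers match
p_lose = Fraction(1, 10) ** 6    # six picks, each 1-in-10, as described in the comment

expected_value = payment - net_worth * p_lose
print(float(expected_value))     # expected gain per play, in dollars

def ruin_probability(days):
    """Chance of losing everything at least once when playing once a day."""
    return 1 - (1 - float(p_lose)) ** days

# Assumed horizon (not from the comment): playing daily for 30 years.
print(f"{ruin_probability(365 * 30):.2%}")
```

The per-play expectation is positive, yet the chance of being wiped out at some point over decades of daily play is around one percent, which is the kind of risk the comment argues most people would refuse.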

• #### Education education education (Score:2)

Quick fixes for overweight people, or diet and exercise? Which do you think works?

What makes you think there's a quick fix for ignorance?

People who are not educated in statistics have only one solution which will work: Get educated in statistics.
That doesn't mean taking a maths degree. Any number of books or basic courses could help.

The problem is that most ignorant people don't want education.
And the really ignorant ones don't even believe that there's such a thing as being smart.

Just look at the number of

• #### Statistics (Score:2)

It's obvious that the person telling us about statistics doesn't understand statistics.

On the subject of terrorism, why not simply arrest everybody, just to be sure...?

• #### Not the first time this has been done. (Score:2, Interesting)

Though it may be the first time that people are trying to draw general population attention to it. I believe the first place I saw this sort of concept revealed was by Cory Doctorow. Though the below article isn't necessarily where I saw it, it recounts the same message.

http://www.guardian.co.uk/technology/2008/may/20/rare.events [guardian.co.uk]
• #### A box (Score:5, Informative)

on Wednesday July 22, 2009 @08:51AM (#28780963)

Back during the TQM fad they'd make this point by giving everyone a clear plastic box with 10,000 little balls in it. There was a cribbage board like affair in it, with 1,000 holes, such that by inverting and shaking the box, then turning it upright, 1,000 of the balls would settle into the holes more or less at random, but still be visible through the clear box. The balls were color coded -- 10 red balls, 40 black ones, 50 blue ones, and the rest white. The odds of getting no red and no black are lower than 1%, contrary to most people's expectations.
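
The "lower than 1%" claim can be checked with a hypergeometric calculation. The sketch below assumes the 1,000 holes are filled uniformly at random from the 10,000 balls, which the "more or less at random" description suggests but does not guarantee.

```python
from math import comb

total_balls = 10_000
holes = 1_000
red, black = 10, 40
marked = red + black   # balls that are either red or black

# Hypergeometric probability that none of the 1,000 sampled balls is red or black:
# choose all 1,000 from the 9,950 unmarked balls, out of all ways to choose 1,000.
p_no_red_no_black = comb(total_balls - marked, holes) / comb(total_balls, holes)
print(f"{p_no_red_no_black:.3%}")
```

Under that assumption the probability comes out at roughly half a percent, consistent with the figure in the comment.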

This was used to drive home a point about the difficulty of "testing in quality" (quality tests suffer false negatives and if there are, say, 1000 such individual measurements on a piece of machinery it's nearly impossible to ship a machine without at least one thing wrong unless the tolerances are well controlled at the point of manufacture). The same idea works any time you want to illustrate the effects of low-incidence events on a large population.

I've always wondered how much injustice is perpetrated by drug screening on large populations, since false positives do occur and statistically must occur twice in a row at least some of the time, which is the threshold considered conclusive proof of abuse by most employers and the courts.

• #### Re: (Score:2)

I've always wondered how much injustice is perpetrated by drug screening on large populations

Phew! For a minute there I thought you were talking about drug screening of professional athletes for performance enhancing drugs. In that case there are so many juicers, it's the false negatives you have to worry about :)

• #### Re:A box (Score:4, Informative)

on Wednesday July 22, 2009 @09:33AM (#28781543)

I've always wondered how much injustice is perpetrated by drug screening on large populations, since false positives do occur and statistically must occur twice in a row at least some of the time, which is the threshold considered conclusive proof of abuse by most employers and the courts.

This very problem came up in England a while ago, but for SIDS deaths. If I recall correctly, some statistician testified at a murder trial as to the infinitesimal chance that a mother would have two infants die from SIDS separately. In fact, granted the size of the population, it is not unlikely for two SIDS deaths to happen to some mother somewhere in the country. Perhaps someone else can remember the details. Here's an article but it's not free: http://bjps.oxfordjournals.org/cgi/content/abstract/axm015v1 [oxfordjournals.org]

• #### Re: (Score:3, Interesting)

I read the court documents for the trial a while ago. There were two issues:

1. They quoted the statistics assuming the deaths were independent (i.e. they squared the probability of one SIDS death). The error of this was pointed out.
2. No one mentioned the prosecutor's fallacy.

In the end the jury were won over by an argument along the lines of: "Ignore the statistics. You *know* it's really unlikely that these were two SIDS deaths.".

http://en.wikipedia.org/wiki/Sally_Clark [wikipedia.org]

• #### Sally Clark (Score:4, Insightful)

on Wednesday July 22, 2009 @11:08AM (#28783017) Homepage Journal
That may have been the Sally Clark case, although there were others. http://www.independent.co.uk/news/false-statistic-may-have-led-to-solicitors-murder-conviction-1135231.html [independent.co.uk]

I think that's called the "prosecutor's fallacy." If there's a 1/10,000 chance of a child dying of cot death, and a woman has two children die of cot death, the prosecutor tells the jury that the chances could only be 1/10,000 * 1/10,000 = 1/100 million that both deaths were a cot death, so she must have murdered them.

This only works if the deaths are statistically independent, which they're not. The parents could have a genetic defect which causes 2 successive infants to die.

If each parent had 1 fatal recessive genetic defect, then 1/4 of their children would die, so the odds are 1/16 that two successive children would die. But actually a lot of fatal birth defects are more complicated than that simple mendelian pattern.

It's even more complicated because some mothers have been captured on video trying to smother their children.

• #### Re: (Score:3, Interesting)

The same idea works any time you want to illustrate the effects of low-incidence events on a large population.

XKCD can be used to illustrate this too [xkcd.com]

Mouse-over: "You can do this one in every 30 times and still have 97% positive feedback."

• #### the covering of asses (Score:2)

Not sure about other places so I'll speak for the US here. If you could have detected something (terrorist, cancer, etc), but didn't, you'll get taken to court for it. If you run tests you're covered no matter what the results. If you didn't, you're negligent and will spend the rest of your life selling paperclips out of a cardboard box. As an off-topic aside, these ridiculous lawsuits are why healthcare in the US is as bad as it is and no administration is going to admit that or fix it because putting law
• #### Understanding efficiency (Score:2)

Better yet, how can efficiency be explained to wannabe nerds? If a test is very fast, non-intrusive, and cheap with a 90% accuracy, it is a great test. Those 10% may be sent to a further test that is longer, more intrusive, and more expensive with a 99.999999999% accuracy. This applies throughout testing of all kinds. There is no reason screening for terrorists should be a magical area of testing separate from the rules that govern all other areas of testing.

• #### Re: (Score:3, Informative)

Please do not simply press the "9" key until you get bored when it is more readable and more accurate to use words like "very accurate".

For example, it would be difficult to empirically measure the stated accuracy of your test, since it's inaccurate 1 time in 100 billion.

This message has been brought to you by the Society for the Elimination of Superfluous Quantification.

• #### Someone has already done this (Score:2)

Here you go: http://www.bmj.com/cgi/content/short/327/7417/716?fmr It is titled: "Understanding sensitivity and specificity with the right side of the brain "
Exactly written for the purpose. PDF should be available freely.

• #### Article perpetuates the problem (Score:5, Insightful)

on Wednesday July 22, 2009 @08:55AM (#28781023) Homepage
The article itself started out by oversimplifying the test. It would be an astounding coincidence if the test had both a 10% false-positive and a 10% false-negative rate. In fact, any normal test has a very different false-positive and false-negative rate. People who describe the test should mention both, not this meaningless "90% accurate" number.

The BBC article, while claiming to want to reduce confusion, actually perpetuates the problem by using the meaningless "90%" number instead of the specific positive and negative failure rates. If every article describing tests would quote both failure rates, that would go a long way to getting people to understanding the situation.
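
A small sketch shows why a single "accuracy" figure hides so much. The two tests below are hypothetical: one has 10% false positives and 10% false negatives, the other never flags anyone. Applied to the 3,000-person example from the summary, the useless test scores the higher accuracy.

```python
def summarize(population, positives, false_negative_rate, false_positive_rate):
    """Return (overall accuracy, expected missed positives, expected false alarms)."""
    negatives = population - positives
    missed = positives * false_negative_rate
    false_alarms = negatives * false_positive_rate
    accuracy = 1 - (missed + false_alarms) / population
    return accuracy, missed, false_alarms

# Hypothetical tests applied to 3,000 people, 1 of whom is a true positive.
balanced = summarize(3000, 1, false_negative_rate=0.10, false_positive_rate=0.10)
flag_nobody = summarize(3000, 1, false_negative_rate=1.00, false_positive_rate=0.00)

for name, (accuracy, missed, false_alarms) in [("10%/10% test", balanced),
                                                ("flags nobody", flag_nobody)]:
    print(f"{name}: accuracy={accuracy:.4f}, missed={missed:.1f}, "
          f"false alarms={false_alarms:.1f}")
```

Quoting the two failure rates separately makes the difference between these tests obvious; the accuracy number alone does not.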
• #### Screenings do more harm than good? (Score:2)

Yeah, and seatbelts cost more lives than they save.

It may cost more money to institute screening programs and chase false positives, but cancer survivability numbers have been increasing steadily for the last forty years BECAUSE of early detection, not in spite of it. In fact, I remember a statistic from a few years ago that showed more people were surviving all cancers in straight numbers, not just per capita. That's some increase!

I'll take the possible side-effects of false positives over the known reperc

• #### Re: (Score:2)

It depends on the screening test. Some are very good - like colonoscopy. Others can cause more harm than good on average - PSA is a great example.

Early detection is absolutely the way to go, how to do it for any given condition is the hard part. Many cancers have no screening tests because it is hard to develop good, quality tests (good sensitivity and specificity). Just testing because it can be done is usually a very bad idea.

• #### Re: (Score:2)

Putting on a seatbelt doesn't make you think there's a good chance of you dying for months on end and have to undergo lots of unpleasant examinations and tests.

A positive cancer test has major implications and ultimately, the average well being of a patient is as important a factor as the lives saved.

To use a theoretical example. Imagine a doctor discovers that if they tell a mother giving birth that the baby has died in the womb, the reaction from the woman giving birth somehow lowers the chance of a
• #### Re: (Score:2)

I don't think that's a salient example, since knowing that a child is stillborn in the womb can be determined with certainty. Cancer cannot, usually, and I think 100 false positives - where in most cases the doctors and technicians will be somewhat cagey in their language with the patient - in order to get 1 actual positive is worth it.

To put it another way... Yeah, and if my aunt had balls, she'd be my uncle.

• #### Granted (Score:5, Insightful)

on Wednesday July 22, 2009 @09:09AM (#28781187)

The given version of "terrorist" is arbitrary and thus subject to change over time - from people who hijack planes with guns and explosives, to apparently nowadays, Iceland [nytimes.com], however I think that if you're starting with a number of 1 in 3000 you are so far from reality anyway that what you really want to do is harass innocent people.

Let's look at ALL the hijackings from 1970 to 2000 [bts.gov], a total of 924 hijackings. I couldn't find more recent figures quickly, but let's assume that hijackings have continued at a rate of around 30 per year (the average from 1970-2000). That would add another 30 * 9 = 270 hijackings, for a total of 1194. OK, I will be generous: 1200 hijackings.

Now let's assume (and this is a BIG assumption - I am again going to be very generous) that TEN people, (the terrorists), board the plane for EACH hijacking event. So now we have 12,000 terrorists.

Now let's just look at the passenger data for the LAST YEAR ALONE for the top 5 airlines [iata.org]. They carried last year 420 million people. LAST YEAR. Now assuming that since 1970 till today there have been a total of 12000 "terrorists" (a VERY generous number), when you divide 420 million by that, you would be looking at 1:35,000 people being a "potential terrorist". However do remember that I am only including passenger data for ONE SINGLE YEAR. Assuming again a 90% accuracy, you are still wrongly intimidating well over 3500 people.

If I was to go through year by year and gouge up the billions of people that have been transported by air, the actual chances of the person being screened actually being a terrorist drops to almost zero.

I will not argue against the value of security as a deterrent. However I think that airport security employees should be well aware that they are, more likely than not, harassing innocent people. Therefore all the excessive bullying, posturing, abuse, privacy and rights violations are completely unnecessary in this context. Airline terrorism is NOT a real threat, be it ever so dramatic on the few times when it does happen. Use technology to screen for the obvious, and lock the god damned cockpit door with a solid lock, for the not so obvious.

• #### Re:Granted (Score:4, Interesting)

on Wednesday July 22, 2009 @09:13AM (#28781243)

addendum - I shouldn't have hit submit yet sorry

Where it says "Assuming again a 90% accuracy, you are still wrongly intimidating well over 3500 people." I should add "per group of 35,000". 10% of 420 million passengers per year is 42 million people per year being harangued for no reason at all.
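
The back-of-the-envelope numbers in this comment and its parent can be reproduced as follows. Every input is the commenter's own figure or guess (ten hijackers per event, one year of passenger data, a 10% false-positive rate); the variable names are just for illustration.

```python
hijackings_1970_2000 = 924
assumed_per_year = 30                        # commenter's rounded average
extra_hijackings = assumed_per_year * 9      # 2001 through 2009
total_hijackings = hijackings_1970_2000 + extra_hijackings   # rounded up to 1200 in the comment

terrorists = 1200 * 10                       # ten hijackers per event, a generous guess
passengers_one_year = 420_000_000            # top five airlines, per the comment

passengers_per_terrorist = passengers_one_year // terrorists
wrongly_flagged_per_year = passengers_one_year // 10   # assumes a 10% false-positive rate

print(total_hijackings)
print(passengers_per_terrorist)
print(wrongly_flagged_per_year)
```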

• #### screening tests in medicine (Score:3, Insightful)

on Wednesday July 22, 2009 @09:11AM (#28781209)

"The problem is important in any area where a less-than-perfect screen is used to detect a rare event in a population"

Unfortunately, there is no such thing as a perfect screening test for anything in medicine. Some are better than others, but none are perfect. This is a very difficult concept for most people, unfortunately, and for many insurance companies.

It is not such an issue for the better screening tests such as colonoscopy but it is very difficult for things like PSA where there is a large body of evidence it can do more harm than good on average if used routinely even within the recommended ages. For a patient, you're lucky if you can have a meaningful discussion in 5-10 minutes which is an awful large chunk of an office visit that usually has >4 talking points.

It is a problem for doctors and insurance companies because some well intended person with the insurance company will decide to measure the quality of its doctors (which I support in theory) by measuring, for instance, the percentage of age and gender appropriate patients under the care of a given physician that have their PSA checked annually. The problem is, there is absolutely no consensus in medicine that it should be checked regularly as a screening test. I'm not sure I want mine tested when the time comes around unless my family history changes between now and then. So to measure a physician by this marker or other screening tests is fraught with problems, since many patients might opt out for very good reasons. Also, I'm not going to recommend any test because an insurance company wants me to, only if it is right for any given patient.

Bottom line is there are no perfect tests and testing is not always the right thing to do. Most people do not understand that because it is a hard concept to grasp.

• #### Infographics to the rescue (Score:3, Informative)

on Wednesday July 22, 2009 @09:19AM (#28781343) Homepage Journal

Draw a picture. People's visual intelligence is much higher than their literary/verbal intelligence. Descriptions in words are difficult to understand when the meaning of the words being used is not clear, uses domain specific jargon (such as 90% accuracy in relation to statistics) and especially when it requires that the recipient of knowledge perform a mental calculation or solve a mental equation.

An effective picture would be one of a thousand people (stick figures or silhouettes will do) with 10 positioned in front. A caption over the 900 in the big group would say "Tested Negative (These people are NOT Terrorists)", the caption under the 10 in front would say "Tested Positive (These people may or may not be Terrorists - We don't know)".

Then ask people how they would feel if they were in the group of 10 and were going to be shipped off to a military holding cell to await further investigations.

• #### Re: (Score:2)

BTW I didn't do any math... I'm bad at math, so someone who knows the actual numbers should provide them to someone who can make an infographic ;-p

• #### Visualizing a false positive is easy... (Score:2)

He's dark-skinned and his name is Mohammad.

(Wait, are we talking about airport screening or HIV testing?)

• #### Broad Screening... (Score:3, Funny)

on Wednesday July 22, 2009 @09:23AM (#28781397)

I think they prefer the term "mammogram".

• #### Nice article. -ish (Score:5, Informative)

on Wednesday July 22, 2009 @09:26AM (#28781431) Homepage

I think they totally forget that there is ALSO a 10% possibility that you _don't_ detect the terrorist...

Watch this TED : http://www.ted.com/talks/peter_donnelly_shows_how_stats_fool_juries.html [ted.com]

• #### Knowing Their Ass from Their Al Qaeda (Score:2)

How can this counter-intuitive fact be communicated effectively to people unschooled in statistics?

Just anal-probe every American at football games and airports, then tell a random 9.99% of them they've got AIDS from the procedure (though they don't).

We'll develop an intuitive sense of empathy for falsely accused Muslims, other turban-wearers and lots of other people "with nothing to hide".

• #### subject (Score:2)

"How can this counter-intuitive fact be communicated effectively to people unschooled in statistics?"

With a baseball bat?

• #### One possible solution (Score:3, Interesting)

on Wednesday July 22, 2009 @09:36AM (#28781585)
Switch around what the percentage means: instead of 90% meaning there is 90% chance to successfully ascertain whatever you're screening for, make 90% stand for the analogy of LD50 (Lethal dose for 50%+ of the population). So the screening method would be SE50 (screening effective 50%) if the number of positive cases correctly detected are 50%+ of all positive cases detected.
• #### Think you understand these things? Try this... (Score:2)

A family with two children is chosen at random from a large population.

If I tell you only that they have at least one daughter, what is the probability that both children are girls?

Most people can get that one (it's 1/3), but fail miserably on this question:

If I tell you only that they have at least one child named Mary, what is the probability that both children are girls?

Assume the obvious: the boy/girl ratio is 50-50 and only girls are named Mary.

Most people insist that this is the same question with the

• #### Re: (Score:3, Informative)

A family with two children is chosen at random from a large population.

If I tell you only that they have at least one daughter, what is the probability that both children are girls?

Most people can get that one (it's 1/3), but fail miserably on this question:

You are incorrect. Your statistic would be true if we were randomly picking family with two children until we came across one with (at least) one girl. There's a 1/3 odds there we'd pick one with two girls, and 2/3 that we'd pick one with just on

• #### How it fits in with an overall decision strategy? (Score:2)

Well, I wouldn't expect most people to intuitively grasp Bayesian statistics without some formal introduction to the subject. Nevertheless, it must be remembered that as part of a decision algorithm for detecting terrorists/cancer/whatever, a not-so-accurate test can be useful as a first step. Specifically, if said test is minimally bothersome, cheap and permits us to apply a more costly and accurate strategy to a limited number of individuals at a further stage. For example, a metal detector stops about 3

• #### "Unschooled" in security (Score:2)

It's good to inform people who don't understand statistics. On the flipside, here are two points for people unfamiliar with security:

1. A broad screening for "terrorists" is not made with the expectation that every person flagged is a terrorist. Rather, it identifies behaviors that make a person worth giving a second look. If properly conducted, the flagged person is not treated or considered a threat during the second or even the third look. The 300 people you mentioned would almost certainly be treat

on Wednesday July 22, 2009 @10:34AM (#28782525) Homepage
A nurse recently told me a test was never wrong, it was 99% accurate. I asked how many people she had used the test on this week, she said about 50 a day. Without any knowledge of the population % of positives, and making the gross assumption that it had a 1% false positive rate and a 1% false negative rate, that would lead one to believe that she was seeing incorrect results about 2-3 times a week. This had simply never occurred to her. Never mind the population statistic, nor the possible difference between false positive and false negative, but she understood 99% accurate to be - "never wrong". It never even occurred to her that if she was testing hundreds of people a week that some results would be just plain wrong. I didn't even bother trying to explain the effect of population statistics.
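
The "2-3 times a week" estimate is easy to reproduce. The sketch below assumes a five-day working week and a flat 1% chance that any single result is wrong, regardless of whether the patient is actually positive; both are simplifications.

```python
tests_per_day = 50
days_per_week = 5      # assumption: a five-day working week
error_rate = 0.01      # "99% accurate" read as 1% of results being wrong

tests_per_week = tests_per_day * days_per_week
expected_wrong_results = tests_per_week * error_rate
p_at_least_one_wrong = 1 - (1 - error_rate) ** tests_per_week

print(f"tests per week: {tests_per_week}")
print(f"expected wrong results per week: {expected_wrong_results:.1f}")
print(f"chance of at least one wrong result in a week: {p_at_least_one_wrong:.1%}")
```

On these assumptions, a week with no wrong result at all would be the unusual case.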
• #### Also a problem for car efficiency, other ratings (Score:4, Insightful)

on Wednesday July 22, 2009 @11:30AM (#28783305)
Math has a way of warping almost anything. Take the miles per gallon rating we use in the US to tell us how efficient our cars are. Miles per gallon is actually a very misleading measurement. [nytimes.com] What we should probably use is gallons per mile, or gallons per 100 miles.

Take an example where a Range Rover gets 14 MPG, a Toyota Rav4 gets 24 mpg, and a Prius gets 46 mpg. It isn't intuitive based on the miles per gallon, but moving from the Range Rover to the Rav4 saves more fuel than moving from the Rav4 to the Prius. That is because people don't drive a fixed number of gallons, but drive (more or less) a fixed number of miles. When you look at the gallons used per 100 miles it is clear. The Range Rover uses 7.14 gallons per 100 miles, while the Rav4 uses 4.17 and the Prius 2.17. So it is clear that changing from a Range Rover to a Rav4 will save almost 3 gallons per 100 miles, while changing from a Rav4 to a Prius only saves 2 gallons per 100 miles.
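
Converting the three MPG figures from the comment into gallons per 100 miles makes the comparison explicit. This is just the division described above, written out:

```python
mpg = {"Range Rover": 14, "Rav4": 24, "Prius": 46}   # figures from the comment

# Fuel burned over a fixed distance is the reciprocal of miles per gallon.
gallons_per_100_miles = {car: 100 / value for car, value in mpg.items()}

saved_rover_to_rav4 = gallons_per_100_miles["Range Rover"] - gallons_per_100_miles["Rav4"]
saved_rav4_to_prius = gallons_per_100_miles["Rav4"] - gallons_per_100_miles["Prius"]

for car, gallons in gallons_per_100_miles.items():
    print(f"{car}: {gallons:.2f} gallons per 100 miles")
print(f"Range Rover -> Rav4 saves {saved_rover_to_rav4:.2f} gallons per 100 miles")
print(f"Rav4 -> Prius saves {saved_rav4_to_prius:.2f} gallons per 100 miles")
```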
• #### Change the terminology (Score:3, Interesting)

on Wednesday July 22, 2009 @12:04PM (#28783839)

Instead of talking about false positives and negatives and dependent distributions (which fly right over the head of the average joe), boil it down to the "amplification power" of the test. A random person "presumed innocent until proven guilty" has a chance of 1/3000 to be a terrorist. If you apply your 90% test, people failing it will be terrorists ~1/333 of the time. So the test has an "amplification power" of ~9x. Now everything becomes intuitive. You are looking for a 1-in-3000 needle in a haystack with an amplification power of ~9x, you now need to look for a ~1-in-333 needle in a haystack. The term "90% accuracy" doesn't appear anywhere to confuse things, and it is something everyone can easily grasp. And yes, I know, this ignores the terrorists false negatives; for that you say the test has a "miss rate" of 1/9 so about 1 in nine terrorists will slip through. These three numbers - (1) how rare what you are looking for is, (2) what's the "amplification power" of the test, and (3) what is the "miss rate", give you enough info to intuitively convey all you need to get a good feel for how effective the test really is.
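
The three numbers can be derived from Bayes' rule. The sketch below assumes the "90% test" catches 90% of terrorists and wrongly flags 10% of everyone else; that reading is an assumption, and under it the miss rate comes out at 1 in 10 rather than the 1 in 9 quoted above.

```python
prior = 1 / 3000              # how rare the thing you are looking for is
sensitivity = 0.9             # assumed: fraction of real terrorists the test flags
false_positive_rate = 0.1     # assumed: fraction of innocent people the test flags

# Bayes' rule: P(terrorist | flagged) = P(flagged | terrorist) * P(terrorist) / P(flagged)
p_flagged = sensitivity * prior + false_positive_rate * (1 - prior)
posterior = sensitivity * prior / p_flagged

amplification_power = posterior / prior
miss_rate = 1 - sensitivity

print(f"flagged people are terrorists about 1 time in {1 / posterior:.0f}")
print(f"amplification power: about {amplification_power:.1f}x")
print(f"miss rate: {miss_rate:.0%}")
```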

• #### You Need the Full Confusion Matrix... (Score:4, Insightful)

on Wednesday July 22, 2009 @12:10PM (#28783925) Journal

... and a utility function too!

The article is confusing because it doesn't indicate the false negative rate. You basically need to know the entire confusion matrix before inferring anything. This way, you can not only calculate the accuracy and the false positive rate, but you can also calculate the false negative rate, the precision and the recall. Precision and recall are much more useful metrics than accuracy when it comes to tests like these.

Also, you need to know how much it really costs you to have false negatives and false positives. If you accuse someone erroneously of being a terrorist, and the only inconvenience is a few extra minutes of body search (and the humiliation) at the airport, it *might* still be worth the trouble. If on the other hand you end up sending the poor dude to jail, and he sues you for wrongful conviction, then not so much. You therefore need to have a utility function that assesses the cost of getting it right and wrong both ways (positive and negative). That's basically what is discussed in the other article (the cost of cancer screening tests), albeit in an informal way.
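
A confusion matrix plus a cost function fits in a few lines. This is a minimal sketch: the counts scale the thread's 1-in-3000 example up to 30,000 people so they are whole numbers, and the two cost figures are invented placeholders, not estimates of anything real.

```python
from dataclasses import dataclass

@dataclass
class ConfusionMatrix:
    tp: int   # real positives correctly flagged
    fp: int   # innocent people wrongly flagged
    fn: int   # real positives missed
    tn: int   # innocent people correctly cleared

    def accuracy(self):
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)

    def precision(self):
        return self.tp / (self.tp + self.fp)

    def recall(self):
        return self.tp / (self.tp + self.fn)

    def false_positive_rate(self):
        return self.fp / (self.fp + self.tn)

    def false_negative_rate(self):
        return self.fn / (self.fn + self.tp)

    def expected_cost(self, cost_per_fp, cost_per_fn):
        """A very simple utility function: total cost of both kinds of error."""
        return self.fp * cost_per_fp + self.fn * cost_per_fn

# 30,000 people, 10 real positives, 90% recall, 10% false-positive rate.
matrix = ConfusionMatrix(tp=9, fp=2999, fn=1, tn=26991)
print(f"accuracy:  {matrix.accuracy():.3f}")
print(f"precision: {matrix.precision():.4f}")
print(f"recall:    {matrix.recall():.3f}")

# Placeholder costs, chosen only to show how the trade-off is expressed.
print(matrix.expected_cost(cost_per_fp=100, cost_per_fn=1_000_000))
```

The accuracy is 90%, yet the precision is about 0.3%: almost everyone flagged is innocent. Whether that is acceptable depends entirely on the costs plugged into `expected_cost`.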
