Follow Slashdot blog updates by subscribing to our blog RSS feed

The Fallacy of Hard Tests 404

Posted by kdawson on Sunday June 17, 2007 @02:47AM from the do-the-math dept.

Al Feldzamen writes in with a blog post on the fallacious math behind many specialist examinations. "'The test was very hard,' the medical specialist said. 'Only 35 percent passed.' 'How did they grade it?' I asked. 'Multiple choice,' he said. 'They count the number right.' As a former mathematician, I immediately knew the test results were meaningless. It was typical of the very hard test, like bar exams or medical license exams, where very often the well-qualified and knowledgeable fail the exam. But that's because the exam itself is a fraud."

This discussion has been archived. No new comments can be posted.

The Fallacy of Hard Tests

Load All Comments

Search 404 Comments Log In/Create an Account

Comments Filter:

Worthless (Score:4, Insightful)

by kmac06 ( 608921 ) writes: on Sunday June 17, 2007 @02:50AM (#19538623)

What a worthless post. He gave one situation where guessing is more important than knowledge, but didn't at all address the specifics of the tests he was talking about. A typical vapid blog that for some reason gets posted to /.

Share
twitter facebook
- Re:Worthless (Score:5, Insightful)
  
  by Tatarize ( 682683 ) writes: on Sunday June 17, 2007 @02:56AM (#19538655) Homepage
  
  No. Guessing is simply the 25% bonus if you're one in four. The chance of passing the test is nearly null. You need to be 100 times smarter than that idiot who can only answer one question. Also, 2X as smart == 2X right answers? What the hell? My IQ is 140, find me somebody with an IQ of 70 and give us a test on anything. Sure as hell I'll get more than just twice as many right.
  
  1 for right answer.
  -1/4 for wrong answer.
  0 for no answer.
  
  Done.
  
  Parent Share
  twitter facebook
  - Re:Worthless (Score:5, Funny)
    
    by WFFS ( 694717 ) writes: on Sunday June 17, 2007 @03:01AM (#19538677)
    
    Ok... the test will be on... girls. Huh? What do you mean that isn't fair?
    
    Parent Share
    twitter facebook
    - Re:Worthless (Score:5, Funny)
      
      by loganrapp ( 975327 ) writes: <(loganrapp) (at) (gmail.com)> on Sunday June 17, 2007 @05:14AM (#19539279)
      
      A test on girls isn't fair because no matter what answer you give, it'll be wrong.
      
      Parent Share
      twitter facebook
      - Re:Worthless (Score:5, Funny)
        
        by pionzypher ( 886253 ) writes: on Sunday June 17, 2007 @06:25AM (#19539585)
        
        No. I think what he was trying to say was that no matter what, you'd never score with a girl. ;)
        
        Parent Share
        twitter facebook
        
        Re:Worthless (Score:5, Funny)
        
        by EsbenMoseHansen ( 731150 ) writes: on Sunday June 17, 2007 @07:28AM (#19539803) Homepage
        
        No. I think what he was trying to say was that no matter what, you'd never score with a girl. ;)
        
        Well, as a (happily) married man and considering the 6 point odd billion global population, I'd say it isn't quite impossible to score with a girl. You just have to learn that there is no correct answer, especially on multiple choice. ;)
        
        Parent Share
        twitter facebook
        
        Re: (Score:3, Funny)
        
        by MidoriKid ( 473433 ) writes:
        
        Do not try to answer the question... that's impossible. Instead only try to realize the truth... There is no answer.
        
        Re: (Score:3, Insightful)
        
        by Planesdragon ( 210349 ) writes:
        
        If a man is walking in a forest, and he's talking to himself, and there are no women around, is he still wrong?
        If he has to ask the question, then yes. If he knows that the rightness or wrongness of his answer is the same regardless of his gender, then no.
        
        Women will date, dream of, and marry men. They do none of those to boys.
        
        Re: (Score:3, Insightful)
        
        by default luser ( 529332 ) writes:
        
        The curve exists as an admission by the tester / instructor that they cannot create a perfect test, and that they cannot fully understand their students prior to testing.
        
        If you fail people for being less than perfect, they won't LEARN anything. This is how you teach people HOW to learn.
        
        Re:Worthless (Score:4, Funny)
        
        by mobby_6kl ( 668092 ) writes: on Sunday June 17, 2007 @07:55AM (#19539883)
        
        >That just means that the maximum score is 0 :p
        
        So it's rather like darts? You start with a constant number of points, say 300, and then with each answer you give points are subtracted from your total score. The game ends when you inevitably reach 0.
        
        Parent Share
        twitter facebook
        
        Re:Worthless (Score:5, Funny)
        
        by The One and Only ( 691315 ) writes: <[ten.hclewlihp] [ta] [lihp]> on Sunday June 17, 2007 @08:26AM (#19540065) Homepage
        
        You just described every relationship I've ever been in.
        
        Parent Share
        twitter facebook
  - Re:Worthless (Score:5, Insightful)
    
    by Mr2001 ( 90979 ) writes: on Sunday June 17, 2007 @03:02AM (#19538685) Homepage Journal
    
    1 for right answer.
    -1/4 for wrong answer.
    0 for no answer.
    ITYM -1/3 for each wrong answer. That way, the expected value of guessing is zero: on average, out of four guesses, you'll gain a point for one of them and lose it for the other three.
    
    Parent Share
    twitter facebook
    - Re: (Score:2, Insightful)
      
      by phunctor ( 964194 ) writes:
      
      For a medical specialist wouldn't:
      
      +1 for right (patient lives)
      0 for no answer (she knows she doesn't know and maybe consults with a colleague),
      -1e38 for wrong (patient dies)
      
      be more appropriate weightings?
      
      Many medical specialists could use a tuneup on the difference between confidence and arrogance...
      
      --
      phunctor
      - Re:Worthless (Score:5, Insightful)
        
        by ultranova ( 717540 ) writes: on Sunday June 17, 2007 @06:33AM (#19539619)
        
        For a medical specialist wouldn't:
        
        +1 for right (patient lives)
        0 for no answer (she knows she doesn't know and maybe consults with a colleague),
        -1e38 for wrong (patient dies)
        
        be more appropriate weightings?
        
        No. Everyone makes mistakes sometimes; a doctor who concentrates all his efforts into avoiding them will end up sending all his patients to see one expert or another. Not only does this overload the experts (who are supposed to see only a tiny subset of the patients, after all), but it also means it takes longer to get diagnosed. And in the long run, it means that only risk-takers will become doctors in the first place, shich is not good for anyone.
        
        The worst case is if the experts will also start doing this: trying to offload the patient - and therefore the risk - to someone else as soon as possible. That will lead to the people with actual serious illnesses dying, since no one will actually diagnose them in their hurry to send them to someone else before they have a chance to die on them.
        
        So no, your weightings are not appropriate. You can't assign virtually infinite negative weight to failure and expect anyone to try - at least anyone you want performing medicine.
        
        Parent Share
        twitter facebook
        
        Re: (Score:3, Interesting)
        
        by vertinox ( 846076 ) writes:
        
        The worst case is if the experts will also start doing this: trying to offload the patient - and therefore the risk - to someone else as soon as possible. That will lead to the people with actual serious illnesses dying, since no one will actually diagnose them in their hurry to send them to someone else before they have a chance to die on them.
        
        Have you ever seen that episode of Scrubs where they take that wealthy hospital donor to every department to try to figure out what is wrong with them, but no one kn
        
        Re:Worthless (Score:5, Insightful)
        
        by aurispector ( 530273 ) writes: on Sunday June 17, 2007 @08:57AM (#19540219)
        
        People EXPECT doctors to do something, even when nothing is wrong. I've caught myself doing it and I *am* a doctor. It's human nature.
        
        When I took my board exams I studied old exams for weeks. The information in the exams wasn't really stuff directly from the curriculum; we covered the material but the focus was slightly different. In any case large portions of the information required to be regurgitated for the exam could be classified as "background" - stuff you need to be aware of but doesn't directly affect you in your daily work.
        
        The exam WAS multiple choice and I credit test-taking skills as much as my education for passing on the first try. Logic and the process of elimination can increase your odds to about 50/50 in most cases.
        
        Parent Share
        twitter facebook
        
        Re:Worthless (Score:5, Insightful)
        
        by level_headed_midwest ( 888889 ) writes: on Sunday June 17, 2007 @09:12AM (#19540301)
        
        People always expect doctors to do something, even if the doctor is very vocal about there being no good treatment available. I've seen lots of people walk into doctors' offices and DEMAND a certain medication or treatment that is not advisable. A very common one used to be mothers demanding antibiotics to give to their kid who is sick with a viral flu. The doctor said in no uncertain terms that antibiotics will do absolutely nothing and that prescribing antibiotics will only cost money and perhaps have side effects. But the mothers had to have some medicine to feed to the kid just to satiate their mothering genes. Most of the docs I know told them to give the kid Tylenol if they had a fever or "prescribed" X ounces of fluids per hour- something to keep the mother mothering the kid.
        
        People will also want the doctor to do "something" even if nothing is wrong because they don't want to feel dumb for going when nothing was wrong. They want to justify that something was actually wrong so they don't feel foolish. Add to that the fact that most people have to pay some as a co-pay for a doctor's office visit and "want to get their money's worth."
        
        So sometimes picking "no action" can be very hard to do.
        
        Parent Share
        twitter facebook
        
        Re:Worthless (Score:5, Interesting)
        
        by ryanov ( 193048 ) writes: on Sunday June 17, 2007 @09:47AM (#19540501)
        
        I have a lot of experience with this lately, having come down with an odd virus that had no treatment but was/is excruciatingly painful. There may be no treatment available, but I wager the vast majority of these folks who go to a doctor but have nothing wrong with them DO have some symptom or another... for me, getting the symptom treated is almost equally as important as having the cause treated, as I probably wouldn't have gotten out of my chair without it. One doctor recently seemed much more concerned with the cause and the symptom was nearly an afterthought -- as a result, I was in a lot of pain for 24 hours with no way to fix it. He saw the antibiotic as more important (though it ultimately turned out not to be bacterial), but I saw something for pain to be something that should have happened immediately.
        
        Another thing -- most people want to feel like the doctor at least LOOKED for something. One doctor I went to recently made me wait 40 mins to see him and then looked at me for like 30 seconds and prescribed something. Yes, that makes sense if you know what it is straight off and know what to do about it, but you might just wanna look for other things that I /didn't/ mention, in case I have more than one thing or in case there are different diagnoses that have similar symptoms except for a couple.
        
        Parent Share
        twitter facebook
        
        Re:Worthless (Score:4, Insightful)
        
        by Macgrrl ( 762836 ) writes: on Sunday June 17, 2007 @09:01PM (#19545567)
        
        Here in Austrlia where we have paid sick leave for permanent employees, but typically companies require that you present a doctor's certificate to prove you were sick. So even when you know that you only have a head cold and should be home in bed staying warm and keeping your fluids up, you have to track down and wait in the doctor's office for them to write on a bit of paper that you really are too sick to go to work and that you should be home in bed...
        
        On the flip side, my husband was mis-diagnosed by a number of doctors for over 15 years - he had severe sleep apnea to the point where he was having fits and seizures, memory loss and paranoia. I look like I am finally getting a diagnosis after 20 years of intrusive tests for why I have near constant nausea, indigestion and vomiting.
        
        If the doctors didn't have to sausage factory process all the people who *know* what's wrong and what they have to do, they would probably have more time to spend with people who actually need help.
        
        Parent Share
        twitter facebook
        
        Re: (Score:3, Insightful)
        
        by RyuuzakiTetsuya ( 195424 ) writes:
        
        I know for many common illnesses, even if we don't know the cause, we do know that if you just sit on your ass for a few days and take care of yourself, you're going to get better.
        
        I don't expect my doctor to actually *do* anything curatively speaking, i just expect him to be on my side when I have to tell my job I'm out for a few days getting over a cold.
        
        Re:Worthless (Score:4, Insightful)
        
        by An Onerous Coward ( 222037 ) writes: on Sunday June 17, 2007 @11:36AM (#19541245) Homepage
        
        I love Scrubs, too. But let's not go redesigning our medical qualifications system based on that one episode we saw that one time. :)
        
        I can only suppose that there are times when doing nothing beats doing something. But you seem to be saying that, because such situations do occur, then it would be healthy to severely punish medical errors to the point where most doctors' first instinct is to do nothing, run another test, etc. Even though there may be times when that state of affairs would help certain patients, on the balance I think it would make medical care worse.
        
        Parent Share
        twitter facebook
        
        Re:Worthless (Score:4, Informative)
        
        by Puff of Logic ( 895805 ) writes: on Sunday June 17, 2007 @01:13PM (#19541923)
        
        But you seem to be saying that, because such situations do occur, then it would be healthy to severely punish medical errors to the point where most doctors' first instinct is to do nothing, run another test, etc. Even though there may be times when that state of affairs would help certain patients, on the balance I think it would make medical care worse.
        Indeed it would. My understanding is that the cost of defensive medicine (defensive in terms of liability) is not just measured in dollars; invasive, harmful, or otherwise painful tests are often done in a full-court-press just to say that every possibility was checked, regardless of whether such tests are indicated. That we, as a society, demand a level of perfection from our doctors that is simply unreasonable to expect from any human merely exacerbates matters. A doctor cannot openly say "guys, I screwed this one up, so learn from my mistakes" because the family will be howling for compensation and the lawyers will be trying to hush it all up. A failure to act (doing nothing, as the GPP suggests) is just as damning as doing the wrong thing, so what other choice does a physician have than to fire the medical artillery, even if he thinks only a BB gun is indicated?
        
        I should immediately point out that IANAD but I hope to play one in front of an admissions committee soon, so I may be talking out of my rear. However, the above seems to be the sentiment of most doctors I've spoken to. I just got done with the MCAT recently, so this topic is a bit close to my heart! An interesting site with a good take on the situation is here [pandabearmd.com].
        
        Parent Share
        twitter facebook
        
        Re:Worthless (Score:4, Funny)
        
        by try_anything ( 880404 ) writes: on Sunday June 17, 2007 @02:40PM (#19542657)
        
        I guess it depends on where you work, but my friend's experience was that things changed immediately when he got his first job. Everyone is keenly aware of the potential of a malpractice lawsuit, but the doctors talk pretty freely with each other behind the patients' backs, laughing at the nut cases and making fun of the pill tourists. One guy kept a known addict who came in with "back pain" in an exam room for six hours, coming in between his other patients, bringing exotic-looking implements into the examining room and holding them against the patient's body, furrowing his brow, making serious noises, and then disappearing for half an hour. At the end of the day he told her to take three Advil a day and "come back as often as you feel is necessary."
        
        I don't know how freely the doctors admit mistakes, but my friend tells me about his colleagues' mistakes every once in a while, so they aren't exactly secrets.
        
        Parent Share
        twitter facebook
  - Re:Worthless (Score:5, Insightful)
    
    by Derekloffin ( 741455 ) writes: on Sunday June 17, 2007 @03:09AM (#19538711)
    
    Yeah, this is a pretty bloody poor analysis. If I know 2X as much (even assuming we could quantify it that easily), that doesn't automatically mean I get 2X the score on a test, and it certainly doesn't mean my guesses are equally as bad as the guy with 1/2 my knowledge. It depends heavily on what my knowledge is and what is covered by the test. The potential is even there for the guy with 1/2 my knowledge to beat me just simply by getting lucky on what the test covers.
    Just for an example, say we were doing a geography test on the states of the united states and their associated capitals. I know 1/2 of them, and another guy knows 1/4 of them. Now, each question is a 4 part multi-choice simple question: State X, which is it's capital? A, B, C, or D. The thing is, even for those I don't know, 1/2 the potential answers (on average) I can eliminate as I know them, while the other guy, on average, can only eliminate 1/4 of them. So, I would get 50% on knowing the answers, and about 1/2 of the remaining on guesses. The other guy would get 1/4 on knowing them, and only 1/3 of the rest on guesses. And that's just the basic mathematic flaw in his reasoning.
    
    Parent Share
    twitter facebook
    - Re:Worthless (Score:5, Informative)
      
      by nephyo ( 983452 ) writes: on Sunday June 17, 2007 @03:52AM (#19538949)
      
      His argument is that the harder the test the less relevant knowledge of the actual answers to the questions posed on the test are to determining your relative score. As a result, on a very hard test, two test takers with vastly different levels of knowledge of the correct answers to the test questions do not on average end up with scores that reflect that difference.
      The "educated guess" does not contradict that argument. Again, the harder the test then the smaller the difference between the number of potentially correct answers you can eliminate versus the number that he can eliminate will be. With a sufficiently hard test, "educated guessing" makes no difference whatsoever.
      So basically with a multiple choice, count only the correct answers test, increasing the difficulty is not an effective means of increasing the likelihood of the test to accurately filter out candidates with lesser knowledge of the subject matter covered by the test. Increasing the difficulty only increases the degree to which randomness has an impact on the results.
      This is true, well known, and not very controversial. However, you would of course need to examine the specific tests in question to determine whether they are effective. They may have other features to help mitigate this effect. Also, his analysis is purely mathematical. It doesn't take into account the likelihood of a challenging test to create social pressure that influences people to self-filter. It could be argued that most of these tests are not testing the takers knowledge of the material so much as they are testing the takers ability to study and react to the pressure that the tests provide.
      
      Parent Share
      twitter facebook
      - Re: (Score:3, Insightful)
        
        by Znork ( 31774 ) writes:
        
        "It doesn't take into account the likelihood of a challenging test to create social pressure that influences people to self-filter."
        
        Mmm. I'm not sure that would be a desireable feature; that'd bias the test situation in favour of arrogant idiots. For some professions confidence may be more desireable than knowledge (marketing?), but for a doctor I think one would prefer someone being reluctantly right than someone being confidently wrong.
      - Re:Worthless (Score:5, Interesting)
        
        by Derekloffin ( 741455 ) writes: on Sunday June 17, 2007 @05:27AM (#19539335)
        
        The "educated guess" does not contradict that argument. Again, the harder the test then the smaller the difference between the number of potentially correct answers you can eliminate versus the number that he can eliminate will be. With a sufficiently hard test, "educated guessing" makes no difference whatsoever.
        Actually, the problem here is his example is a total worst case scenario and doesn't tell us what the 'Pass' level is. The tests mentioned are not relative knowledge tests, they are pass/fail tests, in other words, I don't care how much Joe knows compared to Bill, all I care about is does Joe demonstrate the necessary level of knowledge to pass. In that case, assuming the test maker has the slightest clue, in the example the pass mark would likely be at 75%+ (you need about 1/2 right legit, and 1/2 of the remaining right on guesses or better) meaning that it's difficulty is fine as it has correctly blocked both people as they didn't show the necessary level of knowledge.
        He might have a point IF he qualified this to scaled result tests (ie the top X people will pass regardless of their scores, only relative position counts), but he didn't. But, even in that case he'd have to analyze the distribution of all testees, not just 2. Once again, his math does work and doesn't support the argument.
        
        Parent Share
        twitter facebook
        
        Re:Worthless (Score:5, Interesting)
        
        by Jaidan ( 1077513 ) writes: on Sunday June 17, 2007 @07:32AM (#19539815) Homepage
        
        No, just no. The point of the tests are to determine who is over a particular threshold of knowledge and who isn't. The method being called a fraud fails to accurately do that. Since randomness has a proven substantial impact on those tests that threshold becomes blurred. To make matters worse, the harder the test the MORE randomness affects score. As a result the test results are meaningless at any scale. His examples where simplified to illustrate the essential math behind them, he does not need more than 2 people to compare since the math is equally applicable no matter how many are tested. He also does not need to set a scale because the math is equally applicable to any bar you might set.
        
        The point of the article was to illustrate that these hard tests are meant to establish a minimum required level of knowledge, however due to the nature of counting only correct answers, randomness incurs a great penalty to the accuracy of the attempted measurement of knowledgege. He is suggesting, and rightly so, that a test that instead occurs an effective 0 net effect of guessing would much more accurately measure the knowledge of the participants by reducing the effects of guessing to nearly 0
        .
        What this really comes down to is accuracy and precision. We assume that a test score can be equated to a measurement of knowledge, and for your benefit (it's completely irrelevant) we'll assume that a passing test is 60%.
        
        We give 1 person 5 different tests. We allow for random guessing with no penalty, and the test is very hard. He takes them all and scores wildly different, but averages 65% across all of the tests. If I was to know for a fact that the person in question does indeed deserve to score a 65% then we can say the test was very accurate, but low in precision. On any given test the subject may have passed or failed depending on his luck with guessing.
        
        We now give the same person 5 new tests. We this time remove randomness for the most part by penalizing wrong answers by an amount that results in an effective gain of 0 for random guessing. This time he takes the tests all his scores are within a few points of each other and infact he averages 65% again. In this case the test is highly accurate and is also high precision. On any given test the subject most likely would pass
        
        The article's math indeed illustrates this point very clearly. The unspoken point is that in tests such as these, designed to set standards to be met, it is a fraud to use a test with low accuracy at measuring actual knowledge. The precision gained by penalizing guessing allows the test to be much more fair in it's administration.
        http://en.wikipedia.org/wiki/Accuracy [wikipedia.org]
        
        Parent Share
        twitter facebook
        
        Re: (Score:3, Informative)
        
        by Firethorn ( 177587 ) writes:
        
        We give 1 person 5 different tests. We allow for random guessing with no penalty, and the test is very hard. He takes them all and scores wildly different, but averages 65% across all of the tests.
        
        Statistics show that this would be very unlikely for 5 tests with questions pulled from a common pool.
        
        The odds of WAGing a multiple choice test is 25% per question. When distributed over a hundred questions, it's very unlikely that random guessing will score above 30% or below 20%, and that's for guessing the ent
        
        Re: (Score:3, Interesting)
        
        by try_anything ( 880404 ) writes:
        
        It seems that these "extremely hard tests" are exhausting all-day or multi-day affairs with several hundred questions. With that many random events in the sample, the variance will be pretty low.
        
        A normal person may score "wildly differently" on a 300-question exam from one attempt to the next, but the variance will be based more on differences in preparation, physical and mental comfort, stress, and how much sleep he got the night before.
        The article's math indeed illustrates this point very clearly.
        The art
        
        Re: (Score:3, Insightful)
        
        by Derekloffin ( 741455 ) writes:
        
        Since randomness has a proven substantial impact on those tests that threshold becomes blurred.
        True, but the article's math does nothing to support that case as the difficulty of the test does NOTHING to hurt or help this. The test format in that case is the problem, and his example again doesn't help because this test wouldn't be used for those guys who get 55% on the test, it would be looking for those in the 75%+ range (pass could even be set at 90% maybe even 100%, he never sets it) and this on a tri
      - Comment removed (Score:5, Interesting)
        
        by account_deleted ( 4530225 ) writes: on Sunday June 17, 2007 @07:26AM (#19539793)
        
        Comment removed based on user account deletion
        Read the rest of this comment...
        
        Parent Share
        twitter facebook
        
        Measurement and item response theory (Score:4, Interesting)
        
        by Savantissimo ( 893682 ) writes: on Sunday June 17, 2007 @02:21PM (#19542491) Journal
        
        I agree with everything you said except this part:
        
        "A multiple choice question might only have one right answer and its point value is the exact same as that of something much easier (especially, when on the harder on, the wrong choice might even be 'righter' than the correct choice on the easy question) -- but thats why there is an entire field of psychometrics out there to ensure that these sorts of exams are doing what they say they are."
        
        Seems to me like that is more an example of psychometricians being forced to accept a less than valid form of test scoring. The proper way to do things has to incorporate Rasch's principle that the likelihood that a given test-taker will give the correct answer (on a question that is valid for the quantity it being used to measure) depends on the product of the easiness of the question and the ability of the test-taker. For that matter, lumped scores (pass-fail, ranking, or absolute) on professional proficiency exams - which by their nature must test disparate quantities with various non-linear contributions to professional qualification - cannot properly be interpreted as measurements of anything without a well-thought out unified criterion that describes the contributions and dependencies of the various quantities measured by the questions to the overall measurement of professional competence.
        
        Parent Share
        twitter facebook
    - The problem there (Score:5, Interesting)
      
      by Moraelin ( 679338 ) writes: on Sunday June 17, 2007 @06:41AM (#19539653) Journal
      
      The problem there is that averages are one thing, but in practice there still is a non-zero chance that he'll actually score higher than you do.
      
      Let's say it's 20 questions, 4 possible answers each. He'll know 5 of those, has to guess 15. There's even a 1 in billion chance that he'll get all 20 right. (4^15 = 2^30 = approx 1 billion.) If you gave that test in China, by now you'd have at least one guy who pulled exactly that stunt.
      
      There's also the issue of how well those questions fit your and his domain of knowledge. Let's say you can't possibly test _all_ the questions, because that's usually the case. You can do it for state capitals, but you can't possibly cover a whole domain like medicine or law.
      
      There are 50 states, you know 25, the other guy knows, say 12 (rounded down), so it's not impossible that the 20 questions are all from the 25 you don't know, but include all 12 that guy knows. In fact, assuming a very very very large domain (much larger than 50, anyway), there's about 1 in a million chance that all 20 questions will be from the 50% you don't know.
      
      Now when testing states that doesn't have a higher moral, because (at least theoretically) all states are equally important. In other domains, like medicine, law, even CS, that's not the case: stuff ranges from vital basics to pure trivia that noone gives a damn about. (Or not for the scope of the problem at hand: e.g., if I'm hiring a Java programmer, asking questions about COBOL would be just trivia.)
      
      And a lot of "hard tests" are "hard" just by including inordinate amounts of stuff that's unimportant trivia. E.g., if I'm giving a test for a unix admin job, I can make it arbitrarily "hard" by including such trivia as "in which directory is Mozilla installed under SuSE Linux?" It's stuff that won't actually affect your ability to admin a unix box in any form or shape. The fact that SuSE does install some programs in different directories is just trivia.
      
      (And if that sounds like an convoluted imaginary example, let's say that some "hard" certification exams ask just that: where is program X installed in distribution Y? And at least one version of Sun's Java certification asked such idiotically stupid trivia as in which package is class X, or whether class Y is final. Who cares about that trivia? It's less than half a second to get any IDE to fill in the package for you. E.g., in Eclipse it just takes a CTRL+SPACE.)
      
      And in view of that previous point, including trivia in an exam just to make it "hard" is outright counter-productive. There is a non-null chance that you'll pass someone who memorized all the trivia, but doesn't know the basics.
      
      Not all knowledge is created equal, and that's one point that many "hard" exams and certifications miss. If a lawyer doesn't know the intricacies of Melchett vs The Vatican, who cares? In the unlikely situation that they need it, they can google it. If they don't understand Habeas Corpus, on the other hand, they're just unfit to be a lawyer at all. Cramming trivia into an exam can get you just that kind of screwed up situation: you passed someone who happened to know that Melchett vs The Vatican is actually a gag question, and that case name appears in Stephen Fry's "The Letter", yet flunked someone with a solid grasp of the the basics and who knows how to extrapolate from there and where to get more information when he needs it.
      
      Rewarding random guesswork is worse. Probably the most important thing one should know is what he _doesn't_ know, so he can research it instead of taking a dumb uninformed guess. Most RL problems aren't neatly organized into 4 possible answers, so it can be a monumental waste of time to just take wild guesses and see if it works. I've seen entirely too many people wasting time trying wrong guess after wrong guess, instead of just doing some research. E.g., I've actually witnessed a guy trying every single bloody combination between *, & and nothing in front of every single variable in a C function, because he never understood how poin
      Read the rest of this comment...
      
      Parent Share
      twitter facebook
      - Re:The problem there (Score:4, Insightful)
        
        by that this is not und ( 1026860 ) writes: on Sunday June 17, 2007 @08:11AM (#19539959)
        
        Just to pull out a snippet and maybe contribute a bit to topic drift:
        
        if I'm hiring a Java programmer, asking questions about COBOL would be just trivia.)
        
        If you ask that sort of question to a prospective programmer, you'll find out more about the person's technical depth, which may be of value. The guy who 'learned Java' because he read it somewhere or an 'advisor' told him it was a way to 'get ahead' is gonna be mister lightweight who is looking for a 'career,' not somebody who is a practitioner who takes a broad approach.
        
        Further, it will help sort the candidates out. The ones who contrive 'fake' knowledge of COBOL can be rooted out and eliminated. Those who are willing to say 'I am not sure I know, but that's an interesting queston' get points, those who automatically start thinking about where to find the answer get even more points.
        
        And, of course, the question will help to sift out anybody with actual COBOL knowledge, because anybody with skill in COBOL who is applying for a Java position is obviously an unstable nut.
        
        Parent Share
        twitter facebook
  - Re: (Score:2, Insightful)
    
    by Score Whore ( 32328 ) writes:
    
    He also assumes that you either know the right answer or know nothing. Here's a pretty hard test for him where a person with some knowledge but without the actual answer will do better than a person with no knowledge:
    
    1. What number am I thinking of?
    
    a) cheese
    b) galaxy
    c) 3
    d) 1
    
    A person who knows (literally) nothing has a 1 in 4 chance of getting it right. A person who knows what a number is has a 1 in 2 chance. You stick one hundred questions on a test and someone who is versed in the material will score bette
    - Re:Worthless (Score:4, Funny)
      
      by Anonymous Coward writes: on Sunday June 17, 2007 @04:08AM (#19539009)
      
      a) cheese
      
      I have the same combination on my luggage!
      
      Parent Share
      twitter facebook
  - - Re: (Score:3, Insightful)
      
      by TheRaven64 ( 641858 ) writes:
      
      They should also know that IQ relates to a particular subset of reasoning skill, not to knowledge, and definitely not to knowledge in all fields. If you gave me and someone with half my IQ a test on, say, baseball then they would almost certainly do better than me; all of my answers would be guesses and so any knowledge that they had would give them an edge, no matter how stupid they were. This, of course, raises a problem that is present in a lot of exams - even a few on my degree course - that they test
  - - - Cartel Tests (Score:3, Interesting)
        
        by mdsolar ( 1045926 ) writes:
        
        Actually it has to be a % passing. If the supply of licensed doctors and attorneys were not limited, the costs for their services would reduce, so these exams have to be a part of the the system to control the supply. A test may be written to ensure a spread (so it tests knowledge) and also to ensure that the passing score is largely unattainable. So, I think the analysis is incorrect. The tests are not too hard to be useful as tests, it is just that their is a conflict of interest as regards their use.
- Re: (Score:2, Insightful)
  
  by IP_Troll ( 1097511 ) writes:
  
  Agreed this post is worthless.
  
  Has the author of this blog got any scientific results to back up his claims? The NY State Bar has a statistical analysis of who passed its bar exam. http://www.nybarexam.org/NCBEREP.htm [nybarexam.org]
  like bar exams or medical license exams, where very often the well-qualified and knowledgeable fail the exam.
  IMHO there are only two reasons why the well-qualified and knowledgeable fail such exams.* They didn't study or they studied the wrong materials. We have all had that one exam we did REALLY poorly on and we would like to blame someone other than ourselves for our bad grade. This post
- Sometimes guessing is a good thing (Score:3, Interesting)
  
  by Artifice_Eternity ( 306661 ) writes:
  
  In many professional specialties, including law and medicine, there are times when a quick, decisive educated guess may produce better results than an exhaustively researched, definitively confirmed answer.
  
  So tests that force students to do a lot of guessing may still be good tools for evaluating their professional qualifications.
  
  A doctor or lawyer who can guess right may be superior to one who plods to the right answer only after many expensive lab tests or hours of legal research. That's not to say that
- The real problem with multiple choice tests (Score:2)
  
  by ctwxman ( 589366 ) writes:
  
  After a thirty plus year break, I took 53 additional distance learning college credits to complete a certification. All of our quizzes and tests were multiple choice. The biggest failure of the tests were the answer choices! Because writing skills are so poor, the majority of the tests had at least a few questions and answers which didn't say what the professor thought they said. As a student, I often had to choose between what I knew he meant and what he actually said.
  A close second was counting neg
When I was a boy... (Score:4, Insightful)

by WFFS ( 694717 ) writes: on Sunday June 17, 2007 @02:59AM (#19538671)

Stories like this could never get on Slashdot. Seriously, this is like a maths problem I'd give to my Year 9 kids. This is definitely not news, and certainly doesn't matter.

Share
twitter facebook
Yuck (Score:2)

by venicebeach ( 702856 ) writes:

It's hard to believe this guy is really a mathematician. I read this with interest as I teach college classes and have to give tests. However, there's not much content in the article.

His point about only counting the correct answers is rather silly. In a test where each question is either right or wrong, counting the wrong answers into the score does not add any information (you can tell how many are wrong if you know how many are right). The only thing it does is change the scaling of the resulti
- There may be unanswered questions (Score:4, Interesting)
  
  by dybdahl ( 80720 ) writes: <info@dy[ ]hl.dk ['bda' in gap]> on Sunday June 17, 2007 @03:10AM (#19538717) Homepage Journal
  
  If you have 100 questions, and 20 right ones and 20 wrong ones, it leaves 60 unanswered questions.
  
  That's why the articles talks about only counting right ones. In order to avoid guessing, there should be a difference between picking a wrong answer and not picking an answer at all.
  
  Parent Share
  twitter facebook
  - Re: (Score:2, Funny)
    
    by Bigos ( 857389 ) writes:
    
    Somebody has done it before. I applied for a job as an English language teacher, and a lady interviewing me said that it is company's policy to test every applicant no matter what certificates and diplomas they have. So i was given the test quickly done the 2/3 of it and then discovered that in the most difficult rest all the answers were wrong. I noticed that some of the answers were SLIGHTLY INCORRECT, so after correcting them i marked them accordingly I have passed pack the test paper. Later the lady tol
    - Re: (Score:2)
      
      by OverlordQ ( 264228 ) writes:
      
      Jesus christ, hopefully you didn't get the job, it was harder then fuck to understand what the hell you just said.
      - Re:There may be unanswered questions (Score:4, Insightful)
        
        by UnxMully ( 805504 ) writes: on Sunday June 17, 2007 @05:20AM (#19539299)
        
        Jesus christ, hopefully you didn't get the job, it was harder then fuck to understand what the hell you just said.
        
        Fate, it seems, is not without a sense of irony.
        
        Parent Share
        twitter facebook
  - Re: (Score:2)
    
    by DoctorFrog ( 556179 ) writes:
    
    In my military training school I was doing very well and decided to test a rumor I had heard, so I deliberately answered every question incorrectly. Lo and behold, I was awarded a score of 100 for my efforts! :)
- Re: Yuck (Score:3, Insightful)
  
  by reason ( 39714 ) writes:
  
  You're missing the point. Counting only correct answers on a multi-choice test doesn't measure what you know, or whether you have the necessary minimum knowledge.
  
  With 4 choices for each question on a 100 question test, the average student (student A) who knows 50% of the answers will get at least 62 correct if they guess entirely at random when they don't know the answer (50 plus 50/4 correct guesses). The average student who knows only 25% of the material (student B) will get at least 44 correct using th
- Re: (Score:2, Insightful)
  
  by KDR_11k ( 778916 ) writes:
  
  Subtracting points for wrong answers is supposed to encourage students to skip a question if they don't know what to say rather than give a wrong answer. If someone gets 48% right from his knowledge he can't spray and pray for the remaining 2%.
- Re: (Score:2)
  
  by suv4x4 ( 956391 ) writes:
  
  His point about only counting the correct answers is rather silly. In a test where each question is either right or wrong, counting the wrong answers into the score does not add any information (you can tell how many are wrong if you know how many are right).
  
  You're wrong. There are three ways you can handle a question: answer correctly, answer wrongly, not answer.
  
  The fact that tests count only correct answers means the subtle difference between not answering, and answering incorrectly is lost.
  
  Take for an ex
  - Re: (Score:2)
    
    by Bastard of Subhumani ( 827601 ) writes:
    
    There are three ways you can handle a question: answer correctly, answer wrongly, not answer.
    Aren't there are some tests where you can't skip an answer - I thought the computer based GMAT was like that?
    
    But most tests are like you say - and Jack can take advantage of his ability to know that he doesn't know (which proves he knows something!).
The fallacy of penalizing guessing (Score:2)

by iamacat ( 583406 ) writes:

Suppose the test is really hard and contains many answers which are wrong, but can be thought as correct by a person who is moderately knowledgeable about the question. Now if you penalize guessing, I may answer 20 questions correctly and 80 with "reasonable" answer which are not correct, my score is 0 assuming 4 questions per choice. On the other hand, someone who answers 10 questions correctly and puts random guesses for the other 90, will likely get a score close to 10.

Basically, multiple choice tests wh
Statistical exam using Multiple choice (Score:2)

by dybdahl ( 80720 ) writes:

I haven't had many exams with multiple choice, but my university statistics course was one of them.

Each question had 5 options, and only one was correct. A correct answer gave 5 points, an incorrect answer gave -1 point.

Now, as the smart reader can guess, 4 x -1 + 5 = 1, so guessing still pays off... especially if one or more of the questions are very unlikely to be correct.

Did the teacher design this test incorrectly, since guessing was rewarded? Well, actually, the only test of real-life application of st
- Re: (Score:2)
  
  by crossmr ( 957846 ) writes:
  
  And those who didn't understand it but guessed anyway were just as rewarded...
Education in taking the test (Score:5, Insightful)

by MagicDude ( 727944 ) writes: on Sunday June 17, 2007 @03:10AM (#19538719)

As a medical student, I know how much our education is divided into what we do in real life, and what is the proper answer for exams. Quite often, during our education exercises, we're given senarios like "A patient presents with symptoms X, Y and Z. What do you do next?". At that point, that's when the resident says "You would diagnose condition A from those symptoms, but for the exam, you'd say you'd get an MRI to rule out B". So many questions are basically having intuition for where the question is guiding you too, rather than practical medicine. Often, it's extremely difficult to discern what the question wants. There will be some question along the lines of "A patient presents with general fatigue over the past 3 months, which one blood test do you want to order?" and you'll narrow down the answer choices to either thyroid stimulating hormone, or a complete blood count, both studies are equally important in the evaluation of fatigue, but the question wants you to know which one is more important. In real life, you would always get both because both conditions fairly common, and you want to evaluate both at once to save the patient time and effort. However, the question will nail you if you don't know some obscure study which states that there like is a 1% difference in the incidence of hypothyroidism vs anemia in fatigue. Moreso, if you were on the hospital floor and you were to say "I'm getting only a CBC, because it's more likely," the resident will chide you for not considering hypothyroidism as well and getting the Thyroid stimulating hormone as well, making you look bad. So yeah, learning for the test doesn't really ever end.

Share
twitter facebook
- Re: (Score:2)
  
  by Alioth ( 221270 ) writes:
  
  The best ones are the FAA tests for aviation - you often get a question, and then three right answers to pick from. It's just one answer is a little more right than the others!
  
  My "favorite" multi-choice exams at school (or 'multi-guess' as we called them) were the ones where getting the first question in a series of several wrong, would doom you to getting all of the questions in the series wrong because the answers were all dependent on calculations from the answer of the first question! Of course, being m
Not Worthless (Score:3, Insightful)

by deskin ( 1113821 ) writes: on Sunday June 17, 2007 @03:22AM (#19538789) Homepage

Though some of his logic was overblown (see the comments made directly on his blog), I think his larger point has some merit. In fields which require lots of studying before beginning as a professional, such as medicine and law, you always hear that you have to be absolutely brilliant to 'get in'. The fact of the matter is that this is not the case: you should be darn smart, but you needn't be the best student in the world to be successful as a doctor. Many of the students who go to law or medical school (I'd guess most) are completely qualified for positions in their respective fields, but by the same token, are not necessarily any more qualified than their peers: they've all studied the same material, had the same experience in the lab, and know the whole picture within a reasonable approximation of each other.

Yet to maintain the level of exclusivity that these careers have, there must be some way to select a subset of the candidates to proceed, and at this point, there are few distinguishing features among them. Some will be far and away brilliant, and will easily get a career regardless; but the majority can't be differentiated from one another. So, how should it be decided who is a doctor and who isn't? By making a test that's so hard it amounts to a randomising function, and then selecting a subset of top scorers to pass. Passing doesn't mean one is inherently more qualified; it just means one guessed better on that day. This also explains why people can pass on their second or third try: they are no better than their competitors the next time around, but eventually one will guess luckily, and get in. It'd be interesting to do some statistical analysis on how many tries it takes people to 'pass' a particular exam, and see if the results fit probabilistic models: If the results of such analysis fit too well, the test is too hard, whereas if they deviate greatly from probabilistic expectations, then the test is more likely to be an actual test of one's knowledge.

To be sure, there will be some individuals who can pass based entirely on their knowledge, just as there will be some individuals who simply aren't cut out for life as a lawyer that will fail the exam. But ultimately, it allows the higher-ups to select candidates for job positions based on the single indisputable criterion of the candidate having passed an exam, thus avoiding any messy issues when someone complains about them choosing a particular candidate in lieu of one better qualified.

Time for a terrible analogy, since it's 0300 here: Really hard exams are the bouncers at the door to the club of medical careers.

Share
twitter facebook
I had a teacher... (Score:5, Funny)

by coldmist ( 154493 ) writes: on Sunday June 17, 2007 @03:25AM (#19538801) Homepage

in college that gave very hard tests. Intel Assembly class. For a midterm, we had to decipher Object-Oriented Assembly, and decipher self-modifying code. After 3 weeks of introduction to Assembly.

I got an A, with an average of 58% in the class.

For the 2-hour final, he got up at the 1-hour point, and yelled: "The test is over. All pencils down." We just sat there dumbfounded for about 10 seconds, and then he said, "Just kidding. I always wanted to do that."

Ya, a real great pal there!

Worst teacher I had in college. He didn't last long

Share
twitter facebook
- Evil profs rock. (Score:3, Interesting)
  
  by Cordath ( 581672 ) writes:
  
  Funnily enough, one of the most hardassed profs I ever had also taught the introductory assembler class. (except for us it was PDP-11 and 68K) His tests were legendary for their difficulty, and the average was somewhere in the 20-30% range. However, it was curved after the fact and was a perfectly valid exam since there was absolutely no opportunity to guess. He gave us self-modifying assembler code too, without telling us such a thing was possible in advance! He also had a unique way of assigning read
- Re:I had a teacher... (Score:4, Insightful)
  
  by dcollins ( 135727 ) writes: on Sunday June 17, 2007 @10:22AM (#19540707) Homepage
  
  That guy's a fucking asshole. As a college teacher of math & CS (including assembly -- admittedly at a community college), guys like this just completely burn me up. Some people should completely not be teachers, they suck so fucking bad.
  
  I practically meditate before a final exam on how to make the environment as comfortable as possible, clearly explain in advance what the procedures will be like, and keep everything in the same rhythm as all my prior tests. Just freaking out students in a final exam because you're a sadist is utterly unacceptable. Jesus.
  
  Parent Share
  twitter facebook
My experience (Score:5, Interesting)

by Tim_UWA ( 1015591 ) writes: on Sunday June 17, 2007 @03:26AM (#19538811)

I once had a test that had a check box for how confident you were your answer was correct, that affected your score the following way:

If you ticked "confident" and you were wrong, -2
If you ticked "confident and you were right, +2
If you ticked "unsure" and you were wrong, -0
If you ticked "unsure" and you were right, +1

I guess the point is that it's advantageous to guess, but only if you choose the lesser-scoring option.

Share
twitter facebook
- Re: (Score:2, Funny)
  
  by Anonymous Coward writes:
  
  At last! A scoring method that will naturally penalise me for my lack of self-confidence!
- Re: (Score:2)
  
  by antifoidulus ( 807088 ) writes:
  
  You aren't talking about the Academic Game "Propaganda"(under the new rules anyway, I'm such an old timer I can remember when your score was based on consensus and the answer, but anyway) are you?
- Re: (Score:2)
  
  by NerveGas ( 168686 ) writes:
  
  It wasn't quite that bad, but I had a teacher who would give you 1 point for a right answer, no points for no answer, and take away two points for a wrong answer.
- Re: (Score:3, Interesting)
  
  by The One and Only ( 691315 ) writes:
  
  That reminds me of my EE final--at the end, we had the option to guess our final score. There was a mathematical formula applied to the absolute value of the difference of the estimated final score and the actual final score. If you were close enough, you'd gain points. If you weren't close enough, you'd lose points. Of course, you could always elect not to do it.
Well, that explains it. (Score:5, Funny)

by Rumagent ( 86695 ) writes: on Sunday June 17, 2007 @03:43AM (#19538889)

TFA makes sense. Observe:

News for nerds?: yes[ ] no[x]
Stuff that matters?: yes[ ] no[x]

Clearly the editorial process is fraudulent - as this is a multiple choice, it is obvious that guessing tends to count much more than knowledge.

From this we can conclude one of two things:

1) Zonk is bad at guessing
2) The author is speaking out of his ass

Tempting as it is, I am going to stick with 2... But I could, of course, be guessing.

Share
twitter facebook
Mutliple choice is bad to test knowledge anyway (Score:3, Insightful)

by aepervius ( 535155 ) writes: on Sunday June 17, 2007 @03:51AM (#19538939)

I love the exams we had : a question was posed or a problem stated which required the knowledge we had learnt to solve it. Eventually there is more than one question asked to offer a lead. But no answer given. Those are real test. Applied Knowledge. Usually for multi choice with a very basic knowledge of the subject you can sort out formany response the one being the most probable. This is how I breathed through my english Multiple-Choice at the university, and hell, look at how bad (or how good ;)) my english is. Face it multiple choice might be an easy way out for professor to correct exams, but they are the poorest choice to test the knowledge and habilitiy to reason of the student.

Share
twitter facebook
What? (Score:2)

by Bastard of Subhumani ( 827601 ) writes:

Suppose the test is so hard that I, with lesser knowledge, can only answer one question based on actual knowledge. I answer that question, and guess at the other 99. You, who know twice as much as I, can answer two questions based on knowledge. So you guess at 98 answers.

As you can readily imagine, the odds of you getting a higher grade than I are very slight. In fact, over 45 percent of the time, in repeated trials, I would outscore you, even though my knowledge is half that of yours.
I'm confused (or he is
- Re: (Score:2)
  
  by antifoidulus ( 807088 ) writes:
  
  In the long run yes, but that is pretty meaningless if the test is offered only once a year....
I'll tell you about hard tests... (Score:2)

by NerveGas ( 168686 ) writes:

I had a physics professor for two entire physics series. This man was... a machine. He was VERY intelligent, and was a VERY good teacher. He was, however, quite anal. He would not expect you to know things he hadn't taught, but he expected you to know what he had taught with *perfect* mastery.

He provided copies of all former tests, along with answers and how to solve them, to the local copy store for students to buy (amazingly, this prof DIDN'T try to take you on them, the only
- Re: (Score:2)
  
  by OverlordQ ( 264228 ) writes:
  
  Unless you're dealing with one of the odd branches of Calculus, having taken Cal I and II, and looking through Cal III, I havent seen anything that should take you an hour to solve, nor anywhere close.
I find Mr. Feldzamen's post hard to believe. (Score:5, Interesting)

by mbstone ( 457308 ) writes: on Sunday June 17, 2007 @04:57AM (#19539223)

Mr. Feldzamen claims to have passed the Virginia bar exam, but I can't find any evidence he was ever admitted to the Virginia bar, or to any state bar (he's not in Martindale-Hubbell). He cites the Virginia bar exam -- which I also passed (IAAL, licensed to practice in CA and VA) -- as one of his examples of a "complete fraud." In fact, when I took the Virginia bar exam it had over a dozen one-hour essay components, testing each and every possible subject. By contrast, the California bar exam, had essay tests covering six randomly chosen subjects out of a possible 15 or so, and it had other non-multiple-choice components. The multiple-choice section of every state's bar exam, the Multistate Bar Exam, is no walk in the park. So I don't understand how he includes bar exams in his claim that the tests are invalid. If anything, the low pass rate of bar exams, typically 50% or less among a candidate pool of mostly recent law school grads, suggests that they are very hard indeeed.

Share
twitter facebook
- Re:I find Mr. Feldzamen's post hard to believe. (Score:5, Insightful)
  
  by nagora ( 177841 ) writes: on Sunday June 17, 2007 @06:19AM (#19539557)
  
  If anything, the low pass rate of bar exams, typically 50% or less among a candidate pool of mostly recent law school grads, suggests that they are very hard indeeed.
  It doesn't actually suggest anything other than 50% of people that apply pass. I can design an exam which is very easy; I then say that only 50% will pass. It could be that the "cut" is anyone who scored 9+ out of ten will pass and everyone else fails. Or I could flip a coin. The pass rate is no guide to how hard an exam is nor how good a test of the candidates' abilities. It might be both hard and rigorous, but you can't infer that just from the pass rate.
  TWW
  
  Parent Share
  twitter facebook
- Re:I find Mr. Feldzamen's post hard to believe. (Score:5, Informative)
  
  by Old Wolf ( 56093 ) writes: on Sunday June 17, 2007 @08:42PM (#19545451)
  
  Did you actually read the article? His whole point was that the multi-choice test is invalid because it is too hard.
  
  Parent Share
  twitter facebook
Disturbing (Score:5, Insightful)

by bryan1945 ( 301828 ) writes: on Sunday June 17, 2007 @05:02AM (#19539239) Journal

I find the fact that medical and lawyer exams are based on multiple choice rather disturbing. As an engineer almost all of my test were long answer. Sure, some multi questions, but mostly show all your work or explain the whole process. And I just design systems and networks! Now someone can just luckily guess enough multiple choice questions and start slicing me up?

Like I said, disturbing.

Share
twitter facebook
- Re: (Score:3, Insightful)
  
  by nomadic ( 141991 ) writes:
  
  I find the fact that medical and lawyer exams are based on multiple choice rather disturbing. As an engineer almost all of my test were long answer.
  
  It's done the exact same way for engineers as doctors and lawyers; what they're talking about here is the professional licensing exam, not the exams given in school. The exams in law school (and I believe medical school) tend not to be multiple choice.
  
  Law school exams, for example, tend to revolve around very long, very hard, very convoluted essays. They
- Re: (Score:3, Informative)
  
  by Brother Seamus ( 937658 ) writes:
  
  Almost all of the Professional Engineering certification exams [ncees.org] in the United States are multiple choice, with no penalty for guessing incorrectly.
Choices (Score:3, Funny)

by bryan1945 ( 301828 ) writes: on Sunday June 17, 2007 @05:09AM (#19539263) Journal

A person has heartburn, do you:

A) Perform a colonoscopy
B) Perform open heart surgery
C) Tickle him
D) Fart
E) Refer him to Cowboy Neil

I'm going to Mexico for my next check up. At least you'll get tequila first....

Share
twitter facebook
He is totally and completely wrong. (Score:5, Interesting)

by kklein ( 900361 ) writes: on Sunday June 17, 2007 @08:12AM (#19539965)

Ugh. I just wrote a pretty polite reply at his page after skimming his idiotic article. Now that I've read it, I'm actually angry.
This guy knows NOTHING about testing. Nothing. He isn't even to the level of Classical Testing Theory (CTT), which is really not much more than means and Pearson correlations, and is nowhere near how high-stakes (and even medium- and low-stakes, increasingly) multiple choice (MC) tests work now, and how they have worked for many many years.
IAAP (I am a psychometrician). A big part of what I do for a living is design a particular MC test, pilot the items, and interpret the results. But I don't just count up the correct items and give you the percentage. Why? Because that would be insane. You can guess on those.
Oh, but he says this:
But suppose the grading attempts to adjust for guessing. There is no way of knowing what is in the mind of the test-taker, so the customary is to subtract, from the number correct, some fraction of the number wrong.
--Which is just fine until I tell you I have NEVER heard of dealing with guessing that way on a professional-level test.
As a general rule, we don't do any easy mathematics. At all.
Here is part of the output for a test I'm working on right now:
Seq Item Type Location SE FitResid DF ChiSq DF Prob 35 I0035 Poly 0.685 0.089 2.239 525.69 15.636 8 0.05 36 I0036 Poly -1.946 0.165 -0.587 525.69 6.754 8 0.56 37 I0037 Poly 0.02 0.093 2.603 525.69 12.704 8 0.12

This is generated by RUMM2020, a tool for Rasch analysis. The Rasch model was developed in the 60s as an ideal model of item response. These are the stats on 3 items of this test. The two most important columns are Location and Probability.
The location is the item difficulty. Given the sample's performance on this item, and given their ability, how hard is this item? Item 35 is quite difficult; item 36, quite easy.
The probability is the p value for the chi square. Basically, if it's 0.05 or below, that item is operating significantly (statistically significantly, that is) outside of the model. It displays poor "fit." we generally toss these items before going on to the next step (ideally, these are weeded out during pilot testing, before the test goes live--in this case, it is an experimental test of a construct I'm not even sure exists anymore, but I digress). If an item has poor fit with the model, it is too much of a loose cannon, and its results cannot be trusted. This is what the benighted blogger (is there any other kind?) was whining about. That item is hard not because it is good, but because it is evidently stupid. The responses are all over the place, which means people were probably just guessing. Out it goes before it ruins any examinees' lives.
The next step is to get person locations. In the case of people, these numbers indicate the person's ability. This is calculated by looking at their performance on the items, given their difficulty (Which is calculated based on people's performance on them! Incestuous! But given a large enough sample, it all works out to a fine enough grain to be useful). Here is the output for the people:
ID Total Max Miss Locn SE Residual DegFree DataPts 1 67 125 125 0.254 0.21 -0.272 123.60 125 2 77 125 125 0.700 0.21 -0.178 123.60 125 3 86 125 125 1.120 0.22 -1.030 123.60 125

So, the first person didn't do so hot; the last did pretty well (these usually top out at 3ish). As you can see in "DataPts," there were 125 items on this test. I started with 160. Do you hear that, Mr. Unexpected "Truths?" We have your back! We're not just handing you a naked score based on our crap items. WE PULL THE CRAP ITEMS.
That location score will usually be rescaled to something prettier, since no one would really like to see something like
Read the rest of this comment...

Share
twitter facebook
- Re:He is totally and completely wrong. (Score:4, Interesting)
  
  by Ruie ( 30480 ) writes: on Sunday June 17, 2007 @08:43AM (#19540151) Homepage
  
  I have not heard about Rasch model, thank you for the explanation.
  I just want to point out that I never saw anyone use it (or anything else similarly complex) in universities, even on math departments. Typically grading the test is just the matter of summing up right answers (perhaps with partial credit) and then chopping the distribution into three parts As, Bs and Cs. A good reason for that is people perceive the grading being fair when they can predict how much answering a particular question will benefit them.
  
  Parent Share
  twitter facebook
- - Re:He is totally and completely wrong. (Score:4, Informative)
    
    by kklein ( 900361 ) writes: on Sunday June 17, 2007 @10:30AM (#19540747)
    
    Well, a poorly-written item will always be out-fitting. If the answer doesn't match the question, then everyone will have to guess. If everyone has to guess, the information curve (a great graph I'd love to show you, but can't here, and need to go to bed) will be about flat. There should be a big hump that shows that it gives us a lot of information about people a certain number of standard deviations above or below the mean. Questions like you describe won't have that.
    Also, I wrote about this in another comment, but a lot of the items you get on high-stakes tests don't really go toward your score. They are actually pilot items that the company is trying out a few thousand times to see if they work correctly before they start contributing to anyone's score.
    I don't know any cheap item-writers, though. Everyone I know in this field has at least a master's degree, and most have PhDs. We don't come cheap.
    As for being a good or bad item writer, there are just a handful of rules to follow to avoid the big blunders. After that, it's all about taking them for a spin and seeing how they handle. As I've said a few times now, there's no telling what will happen when you release these things into the wild. I've written items that I doted on and cared for and nurtured and cuddled and put my all into, fully expecting that they would grow up to be model items, ones that the other items would look up to and aspire to becoming, only to be totally and utterly betrayed by them in real-world piloting, my time and devotion wasted, finally having to drag them out back and shoot them in the back of the head. On the other hand, there are sometimes items you add to a section last-minute, just trying to get the number up for piloting or whatever, and find that you have written some ridiculously wonderful item purely by accident.
    It gets easier with practice, though. To be fair, I'm not a very good item-writer. But that is why I, especially, need the stats.
    
    Parent Share
    twitter facebook
The worst multiple choice, ever (Score:3, Interesting)

by freeweed ( 309734 ) writes: on Sunday June 17, 2007 @08:28AM (#19540079)

Had an algorithms prof (of all things) give us a test where every question had the following possible answers:

Yes, No, Sometimes, Maybe, Unknown

Then, he had questions like 1. Some scientists believe than P=NP?

To which, of course, you could argue ANY answer is correct. ..

That being said, this blog post comes across as the usual whining we've all done or had to put up with through the years. No testing methodology is perfect, and everyone tests different on different kinds of tests. Fact is, though, they're pretty damn good. It's a common belief that millions of people who are otherwise idiots are graduating with great grades, while millions of geniuses can't test well - but that's horseshit. The majority of people manage to test at their level of understanding. The fact that people actually notice the odd idiot who guesses well is the exception that proves the rule.

Share
twitter facebook
Medical Specialist Exams have an Oral Component (Score:4, Informative)

by neoshmengi ( 466784 ) writes: on Sunday June 17, 2007 @09:30AM (#19540409) Journal

The hardest part of most medical specialist exams are the orals. Nobody ever complains about the written component. You get a to sit in a room with one or more examiners for a few hours of intense grilling. There is no way to hide any lack of knowledge and your deficiencies are exposed for all to see.

Also the US has a strange system of certifying specialists. After completing residency (usually based on putting in your hours) you can practice medicine under the application 'board-eligible.' Once you've passed your exams, then you can be called 'board certified.'

In Canada, you can't practice at all unless you pass your board (Royal College) exams. The exams are reputedly harder in Canada as well (from those I know who have written both).

Share
twitter facebook
I want to know what my students know (Score:4, Interesting)

by DynaSoar ( 714234 ) writes: on Sunday June 17, 2007 @09:31AM (#19540415) Journal

I don't care what they don't know.

I give multiple choice exams with between 100 and 200 questions, and 4 possible answers.
Wach correct answer is worth 2 points; they need to answer 50 correctly to get 100.
They don't HAVE to answer any question, or any number of questions. If they can answer 30 questions, they can get a D. Any question answered incorrectly is -1 point. This serves two purposes.
It prevents guessing, and it forces the student to consider whether they actually know the answer, or just think they do.

I typically give 4 of these per semester. After the first one I usually get several complaints because they're not used to testing in this way. After the second I usually get one or two stating they can't break the habit of answering every question. After the final, I get many compliments and high marks on my evaluations, and the students tell me they are much more confident in what they've learned than from any other class. I've had occasion to run across previous students from years past, and they claim they still remember more from my class than from others.

I've had administrators forbid me to do it this way. I did it anyway. When they saw the results, they relented, and many suggested the process to others.

Share
twitter facebook
- Re:warning moronic blog post linked (Score:5, Insightful)
  
  by suv4x4 ( 956391 ) writes: on Sunday June 17, 2007 @03:11AM (#19538721)
  
  if anything testing has become FAR FAR too easy, people pass CS courses and come out the otherside only to have a vague notion of how a computer works.
  
  I won't claim his post is correct or not, but he claims the technology behind such tests is wrong and lets less educated people pass through with guessing, whle more educated people try to pass without guessing and fail.
  
  People see the tests produce poor selection, and make the tests harder and harder in attempt to remedy this (but they won't since it's the technology of a test that's wrong).
  
  Then you come here and support his opinion 1:1 by claiming tests are too easy (i.e. should be harder) and idiots pass through.
  
  Ironic, isn't it.
  
  Parent Share
  twitter facebook
- Re: (Score:2)
  
  by KDR_11k ( 778916 ) writes:
  
  I think what he should have said is that multiple choice tests are a stupid idea (it's okay if one or two questions are a block of multiple choice lines but not the whole test). Let the student explain things with his own words.
- all multi choice questions suck , bad design (Score:2)
  
  by cheekyboy ( 598084 ) writes:
  
  Look, its all bad 19th century design.
  
  If the question said, pick the 'odd ones out' each worth n% its better.
  
  There is no wrong or right unless the answer says so. But did the person designing the questions have a degree in writing/psychology/reading aswell?
  
  Its easy to know who is a rope learner, vs a true genius, even Hawking flunked a lot at school.
- Re: (Score:3, Informative)
  
  by Looshi ( 1038712 ) writes:
  
  I just skimmed TFA, but it seemed to me like he was advocating a guessing penalty.
- Re: (Score:2, Insightful)
  
  by tepples ( 727027 ) writes:
  
  This is really a question of statistics not of mathematics.
  Statistics is a branch of mathematics.
  - Re: (Score:2)
    
    by starwed ( 735423 ) writes:
    
    But statistics also means the raw data; I think the OP was saying that this is better answered through experiment than theory.
- Re:Check the statistics not the mathematics! (Score:5, Funny)
  
  by antifoidulus ( 807088 ) writes: on Sunday June 17, 2007 @04:09AM (#19539017) Homepage Journal
  
  Having done experiments on MBA students
  
  See, I KNEW they were good for something. Let me guess, the reason you opted for MBAs over mice is that there is far less protests when you do cruel medical experiments on the MBA students than with mice, correct?
  
  Parent Share
  twitter facebook
- Re: (Score:2)
  
  by laffer1 ( 701823 ) writes:
  
  Yes, its very common both in k-12 and higher education. Even in college, at least half of the questions asked on exams are multiple choice. I've seen them in History classes, English classes, Computer Science, Religion courses, Math and Economics. Traditional sciences as well.
  
  English courses are the strangest. I took a literature course and our final was a multiple choice test on several books we had to read. I felt like I was in 4th grade all over again. However, my personal experience has been that En
- Re: (Score:2)
  
  by Kadin2048 ( 468275 ) * writes:
  
  Yes, they're big in the U.S. Particularly at large universities, I'd wager that they are the dominant form of testing.
  
  I went to a very small college for undergrad, so my experience is different, but I've had friends who went to huge state unis, and describe many classes where there was virtually no interaction with the professor besides multiple-choice tests. They are heavily used because they can be easily graded via automated systems. (Fill-in-the-bubble, aka "Scantron" forms, usually.) All quizzes, tests
- - Re: (Score:2)
    
    by fbjon ( 692006 ) writes:
    
    I've never even seen a multiple choice test in university here (Finland). They got left behind when I graduated high school. I think the only way they should be used is for very large amounts of test-takers (like national high-school graduation exams), or poll choices, it just doesn't match up to a Real test. That said, in a large test, having one or two questions to be simple multiple-choice should be ok.
- Re: (Score:2)
  
  by Gromius ( 677157 ) writes:
  
  Yes he completely stuffed the maths on this one. Really really badly, its kinda weird. He makes a good point but his mathematical example is laughable and incorrect and shows some serious misunderstanding of the effect he is trying to describe although maybe it was just a passing mental block on it.
  
  Anyway what he wanted to do was weight the subtraction factor so that if you have no knowledge and just guess everything you should get on average a score of zero. For true or false this subtraction factor is 1.
  
  T
- Re: (Score:2)
  
  by DerekLyons ( 302214 ) writes:
  
  Not only that - I stopped reading when I realized the logic of the example depended on the ludicrous proposition that "knowing twice as much" was somehow a quantifiable and testable quality.
  
  Especially as one of the key elements of a properly written test is the wording of the questions - which is generally specifically written to confuse the "guesser". Very few people will actually (as the article presupposes) simply randomly choose an answer, most will attempt to read the question as a guide for t
- That's not the only mistake (Score:2)
  
  by DingerX ( 847589 ) writes:
  
  All the math in that paragraph is off. Not only should it read 26.5 in the first case, the other case should be 1+(99/2) - ((99/2)/2) (also known as 1+99/4)= 25.75,so that the "penalize fractionally for wrong answers" should give a result where the test results are even more obscured by noise. (see snippet below)
  
  So it's not just a "Typo" that distracts, it supports a completely faulty conclusion.
  
  One is left wondering what kind of mathematics background the author had. Also, noting the dittography earlier
- Re: (Score:2)
  
  by rts008 ( 812749 ) writes:
  
  I'm a moron, you insensitive clod!
  
  And, and...ohhhh!...Shiny!

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

Worthless (Score:4, Insightful)

Re:Worthless (Score:5, Insightful)

Re:Worthless (Score:5, Funny)

Re:Worthless (Score:5, Funny)

Re:Worthless (Score:5, Funny)

Re:Worthless (Score:5, Funny)

Re: (Score:3, Funny)

Re: (Score:3, Insightful)

Re: (Score:3, Insightful)

Re:Worthless (Score:4, Funny)

Re:Worthless (Score:5, Funny)

Re:Worthless (Score:5, Insightful)

Re: (Score:2, Insightful)

Re:Worthless (Score:5, Insightful)

Re: (Score:3, Interesting)

Re:Worthless (Score:5, Insightful)

Re:Worthless (Score:5, Insightful)

Re:Worthless (Score:5, Interesting)

Re:Worthless (Score:4, Insightful)

Re: (Score:3, Insightful)

Re:Worthless (Score:4, Insightful)

Re:Worthless (Score:4, Informative)

Re:Worthless (Score:4, Funny)

Re:Worthless (Score:5, Insightful)

Re:Worthless (Score:5, Informative)

Re: (Score:3, Insightful)

Re:Worthless (Score:5, Interesting)

Re:Worthless (Score:5, Interesting)

Re: (Score:3, Informative)

Re: (Score:3, Interesting)

Re: (Score:3, Insightful)

Comment removed (Score:5, Interesting)

Measurement and item response theory (Score:4, Interesting)

The problem there (Score:5, Interesting)

Re:The problem there (Score:4, Insightful)

Re: (Score:2, Insightful)

Re:Worthless (Score:4, Funny)

Re: (Score:3, Insightful)

Cartel Tests (Score:3, Interesting)

Re: (Score:2, Insightful)

Sometimes guessing is a good thing (Score:3, Interesting)

The real problem with multiple choice tests (Score:2)

When I was a boy... (Score:4, Insightful)

Yuck (Score:2)

There may be unanswered questions (Score:4, Interesting)

Re: (Score:2, Funny)

Re: (Score:2)

Re:There may be unanswered questions (Score:4, Insightful)

Re: (Score:2)

Re: Yuck (Score:3, Insightful)

Re: (Score:2, Insightful)

Re: (Score:2)

Re: (Score:2)

The fallacy of penalizing guessing (Score:2)

Statistical exam using Multiple choice (Score:2)

Re: (Score:2)

Education in taking the test (Score:5, Insightful)

Re: (Score:2)

Not Worthless (Score:3, Insightful)

I had a teacher... (Score:5, Funny)

Evil profs rock. (Score:3, Interesting)

Re:I had a teacher... (Score:4, Insightful)

My experience (Score:5, Interesting)

Re: (Score:2, Funny)

Re: (Score:2)

Re: (Score:2)

Re: (Score:3, Interesting)

Well, that explains it. (Score:5, Funny)

Mutliple choice is bad to test knowledge anyway (Score:3, Insightful)

What? (Score:2)

Re: (Score:2)

I'll tell you about hard tests... (Score:2)

Re: (Score:2)

I find Mr. Feldzamen's post hard to believe. (Score:5, Interesting)

Re:I find Mr. Feldzamen's post hard to believe. (Score:5, Insightful)

Re:I find Mr. Feldzamen's post hard to believe. (Score:5, Informative)

Disturbing (Score:5, Insightful)

Re: (Score:3, Insightful)

Re: (Score:3, Informative)

Choices (Score:3, Funny)