
Microsoft's New AI Tool Outperforms Doctors 4-to-1 in Diagnostic Accuracy (wired.com)

Microsoft's new AI diagnostic system achieved 80% accuracy in diagnosing patients compared to 20% for human doctors, while reducing costs by 20%, according to company research published Monday. The MAI Diagnostic Orchestrator queries multiple leading AI models including OpenAI's GPT, Google's Gemini, Anthropic's Claude, Meta's Llama, and xAI's Grok in what the company describes as a "chain-of-debate style" approach.

The system was tested against 304 case studies from the New England Journal of Medicine using Microsoft's Sequential Diagnosis Benchmark, which breaks down each case into step-by-step diagnostic processes that mirror how human physicians work. Microsoft CEO of AI Mustafa Suleyman called the development "a genuine step toward medical superintelligence."
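The "chain-of-debate" orchestration described above can be sketched roughly as follows. This is a hypothetical reconstruction from the summary alone: the panel stubs, the prompt format, and the majority-vote aggregation are assumptions for illustration, not details of Microsoft's actual MAI Diagnostic Orchestrator.

```python
from typing import Callable, Dict

Model = Callable[[str], str]  # takes a prompt, returns a proposed diagnosis

def chain_of_debate(case: str, panel: Dict[str, Model], rounds: int = 2) -> str:
    """Each model proposes a diagnosis, then revises after seeing the others'."""
    proposals = {name: model(case) for name, model in panel.items()}
    for _ in range(rounds - 1):
        transcript = "\n".join(f"{n}: {d}" for n, d in proposals.items())
        prompt = (f"{case}\n\nOther panelists proposed:\n{transcript}\n"
                  "Revise your diagnosis if warranted.")
        proposals = {name: model(prompt) for name, model in panel.items()}
    # Assumed aggregation rule: simple majority vote over the final proposals
    final = list(proposals.values())
    return max(set(final), key=final.count)

# Stub "models" standing in for real LLM API calls
panel = {
    "model_a": lambda p: "influenza",
    "model_b": lambda p: "influenza",
    "model_c": lambda p: "lupus",
}
print(chain_of_debate("Patient presents with fever and myalgia for 3 days", panel))
# prints "influenza" (2 of 3 panelists agree)
```

In a real system the stubs would be replaced by API calls to the individual models, and the aggregation step would itself likely be model-driven rather than a simple vote.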


Comments:
  • Great we heard the same shit five years ago
    • Not a very constructive FP with a vacuous Subject, too. Were you just seized by the uncontrollable urge to FP something?

      I have three linked takes.

      The first take is that diagnosis is quite difficult. I think that is partly a matter of excessive specialization to deal with the overload of medical knowledge, but one of the negative repercussions is that many doctors avoid diagnoses. Also related to the flawed economic model, but it's relatively safe (and too profitable) to treat the symptoms without worrying t

    • by drnb ( 2434720 ) on Monday June 30, 2025 @01:44PM (#65486456)

      Great we heard the same shit five years ago

      Software assisting doctors, radiologists, etc has been going on for decades. For example bringing that suspicious "blob" in the medical imagery to the attention of the radiologist or doctor.

      There have also been "AI" expert systems for medical diagnosis for decades. Again, assisting, not replacing doctors.

      There have also been "AI" expert systems for medications, in particular drug interactions. Again, assisting, not replacing doctors and nurses.

      These new "AI" systems will likely continue as the previous "AI" systems, assisting, not replacing.

      • Taking shortcuts. I don't know what the current state is but the last time I looked which admittedly was when it was called machine learning instead of AI the system was fed a bunch of data consisting of things humans had already diagnosed and initially it looked like it was doing an amazing job until somebody pointed out that it had figured out that the slides that had the diseased parts also happened to have some framing that the slides with the healthy parts didn't have and that the AI was just using tha
        • by drnb ( 2434720 )
          "AI" is kind of a big bucket marketing phrase that tons of things get tossed into. It's been that way for decades. Possibly from the start. ML is just the latest tool, it saves the developer some time with respect to having to develop custom algorithms to address an "AI" topic problem. Computer Vision for example, doing "AI" with human developed algorithms for decades. ML is an awesome tool to add to the mix, but the cost? Not knowing how the decision was made?

          I don't know if the following story is real
      • There have also been "AI" expert systems for medical diagnosis for decades. Again, assisting, not replacing doctors.

        This. Been hearing this for a long time.

        Textbooks didn't replace doctors; expecting someone to self-diagnose from a textbook sounds unreliable. Self-diagnosing from an expert system, too. An AI isn't much different, just a search engine with a medical index. I see it as the textbook: it might point you in a direction, but it doesn't have experience.

        • by drnb ( 2434720 )

          There have also been "AI" expert systems for medical diagnosis for decades. Again, assisting, not replacing doctors.

          This. Been hearing this for a long time.

          Textbooks didn't replace doctors; expecting someone to self-diagnose from a textbook sounds unreliable. Self-diagnosing from an expert system, too. An AI isn't much different, just a search engine with a medical index. I see it as the textbook: it might point you in a direction, but it doesn't have experience.

          In the area of software development, it's pretty much a replacement for looking things up in a textbook too. What algorithm is better for this sort of data, getting sample code for some well-known and studied algorithm. Sometimes it can recognize a problem that can be solved by gluing together several such well-known and documented algorithms. Which reading industry literature can probably inform you of as well.

      • Software assisting doctors, radiologists, etc has been going on for decades.

        I see a similar tension between humans and machines for calling baseball balls and strikes. The machines are pretty accurate and far more consistent. There may be a few umpires that are better than the machines, but the luck of the draw determines which umpire you get for a given game.

        The umpire union is influential, and obviously the umpires don't want to lose their jobs. The compromise right now is to have the machines call every pitch but only inform the umpires secretly. The umpires can use or ignore th

    • Take all studies like this with a grain of salt. A doctor doesn't diagnose a patient by reading a case study. They do it by talking to the patient, examining them, deciding what tests to order, etc. This is a contrived comparison that has little connection to how doctors actually work.

      • Note that no actual patients were diagnosed, so it's impossible to say that the AI is better than actual doctors in diagnosing real live human beings.

        Case studies are cases that are deliberately selected to be not like what doctors see every day (because why would doctors want to read about what they see every day?). But actual doctors have to diagnose what they see every day. If the AI is trained on case studies where the patient has an incredibly rare disease that hits one person in twenty million, the AI

        • If the AI is trained on case studies where the patient has an incredibly rare disease that hits one person in twenty million, the AI will be biased to find outré and unusual diseases, and miss "this patient has the flu."

          AI> Maybe it's lupus?...

        • NEJM exceptional case examples from memory: 1) patient presents with spectacularly high cholesterol, is eventually discovered to compulsively eat over 6 dozen eggs per day; 2) cluster of terrible cases where brain imaging revealed large scale destruction of brain, from domoic acid contamination of shellfish; 3) epidemiological tracing of tuberculosis transmission from an infected person on an airline flight, including seat maps showing locations of index case and infected persons.
      • > Take all studies like this with a grain of salt. A doctor doesn't diagnose a patient by reading a case study. They do it by talking to the patient, examining them, deciding what tests to order, etc. This is a contrived comparison that has little connection to how doctors actually work.

        LLMs can do DDXs (differential diagnoses) based on getting the primary complaint and asking follow-up questions. They will then suggest what tests to order, etc. While they can't do a physical exam, they are better than doctors at all other aspects.

    • Great we heard the same shit five years ago

      I won't be impressed until the AI has better recommendations than 4 out of 5 dentists.

    • Great we heard the same shit five years ago

      Only five? I worked with people over 20 years ago who did this. No AI was sued out of existence or lost its "license." No doctors were replaceable.

  • What I want to know is: what was the difference between the LLMs and the doctors? Especially when they claim the LLMs did things step by step as those doctors did.

    Were the doctors under the usual work pressure, fatigued, etcetera?

  • by SouthSeb ( 8814349 ) on Monday June 30, 2025 @01:34PM (#65486424)

    In spite of Suleyman's exaggeration, this is actually a good use case of "AI" exactly because it's not about intelligence.

    Diagnosis is a very algorithmic activity and requires memorizing an immense repertoire of medical literature. Doctors could make use of this tool to quickly narrow down possibilities and accelerate the process, saving precious time and resources for patients.

  • by backslashdot ( 95548 ) on Monday June 30, 2025 @01:34PM (#65486428)

    Quality is more important than quantity. Who was missing the important diagnoses? As in, if the AI missed diagnosing people with cancer versus humans missing all the flu diagnoses, which would you rather have?

    Note, I haven't read the article .. just going by the headline. Just pointing out that just because the "error rate" of humans is higher doesn't mean humans are less useful than AI.

    • Quality is more important than quantity.

      Not in modern bean-counter business management. All the product/service has to do is pass the bar of "good enough" and how much of it you can accomplish is the primary metric after that. For many businesses now the good enough bar is just what the competitor offers. As long as they are not significantly better in the same price class there's no reason to do better.

      And since healthcare is a for-profit venture most places, that's what's used. How many people can we get in/out the door in a day?

    • by Tailhook ( 98486 ) on Monday June 30, 2025 @01:54PM (#65486488)

      Quality

      This presumes we have quality. Do you believe that, without doubt? I don't. I have a lifetime of anecdotal evidence of failures by doctors, personally and among family, friends and others. Without (hopefully) inviting a deluge of corroboration, I can assure you the people reading this now can bury us in such stories.

      Beyond that, we are in desperate need of lower cost solutions for medicine. You're free to attribute the extreme costs we see however you wish, but finger pointing won't fix it: the powers and interests involved aren't listening. What is needed is a disruption, and this looks like a real possibility. I, at least, don't immediately dismiss it with AMA FUD.

      • I wasn't advocating dismissing it at all. Just that we have to make sure all the metrics are correct and comparisons within full context before ditching something wholesale. I'm saying make sure we get it right, that's all.

        • by Tailhook ( 98486 )

          I'm saying make sure we get it right

          I am saying I have no patience for the drearily predictable "quality" and "safety" FUD. There are severe problems in healthcare. Bad enough to risk neglecting our worship of medical authority. Bad enough to risk suffering possible unknown failures as an alternative to our chronic known failures.

          • by MrNaz ( 730548 )

            While I accept your point, I feel it necessary to add that the problems you are referring to, excessive cost and poor quality, are American problems. The rest of the civilised world has low cost or free healthcare and doctors that aren't ground into apathy by the capitalist machine.

            So the rest of the world would like to be cautious because we like what we have. Unlike Americans we do have something to lose.

  • by marcle ( 1575627 ) on Monday June 30, 2025 @01:35PM (#65486432)

    For one thing, doctors only 20% accurate? I know they make lots of mistakes, but that figure seems suspiciously low, and the article (and links) seem light on the specifics.

    After all, this is Microsoft tooting their own horn, and of course we believe every word. /s

    • More than likely, the problem is the lack of randomness in the cases selected for the study. Maybe they picked cases that were difficult for doctors, instead of cases that are *typical* for doctors.

    • They used Dr. Nick Riviera as the comparison.

      "now, the symptoms you describe point to 'bonus eruptus', it's a terrible disorder where the skeleton tries to leap out the mouth and escape the body."

    • For one thing, doctors only 20% accurate? I know they make lots of mistakes, but that figure seems suspiciously low,

      It's low because they were tested on puzzle cases that are deliberately selected to be hard.

      It's like saying most people are ok at commonplace arithmetic in everyday life. So how come their accuracy rate is only 20% in solving puzzles in The Scientific American Book of Mathematical Puzzles?

      • It's like saying most people are ok at commonplace arithmetic in everyday life. So how come their accuracy rate is only 20% in solving puzzles in The Scientific American Book of Mathematical Puzzles?

        The general population has been shown to be effectively innumerate.

        If you're a programmer, you're likely part of that population. I'd hate to leave anyone out.

        WTF is The Scientific American Book of Mathematical Puzzles?

        Yes, I looked it up, but that was my initial reaction. Why would an Internet douchebag require strangers not only to know some obscure thing but also to test and score well in it? What next? Do I need to know My Little Pony trivia, too? I'd hate to be ostracized by such a troll . . . (*gasp*)

  • Here, this one goes in your butt, and this one in your mouth..... no wait, THIS one goes in your butt and THIS one in your mouth. Don't forget to drink your Brawndo.

  • I only skimmed the article, but am I the only person who thinks that, if we had a situation or field of diagnosis where doctors were only getting it right 20% of the time, we would throw some research/education/analysis at it? Because 20% correct (or 80% incorrect) seems kinda concerning and I would think would lead to a lot of brouhaha or lawsuits?
    Maybe it's just me.

    • What do you call a doctor who graduated last in their class?

      Doctor.

    • by dgatwood ( 11270 )

      I only skimmed the article, but am I the only person who thinks that, if we had a situation or field of diagnosis where doctors were only getting it right 20% of the time, we would throw some research/education/analysis at it? Because 20% correct (or 80% incorrect) seems kinda concerning and I would think would lead to a lot of brouhaha or lawsuits? Maybe it's just me.

      I'm assuming this is based on edge cases, e.g. medical images where cancer was just barely starting to appear, situations where lupus is mistaken for rheumatoid arthritis, etc., in which case the human rate of correct diagnosis could indeed be very low, precisely because they were chosen from cases where humans had made mistakes before.

      If that is the case, then the question becomes whether the model is over-trained on these edge cases and would generate false positives, would miss obvious diagnoses, etc.

      • situations where lupus is mistaken for rheumatoid arthritis,

        It's never lupus.* Until it is.

        * I wanted to post a video of House saying it's not lupus, but for some reason YT is now requiring me to sign in to confirm I'm not a bot. Sorry for not posting the reference.
    • I think the devil is in the details.

      Usually, the process of diagnosing a disease or problem consists of a series of doctor visits, with follow-up tests, ruling out one possible condition at a time until the correct diagnosis is found.

      Maybe the 20% number refers to getting it exactly right on the first try. Maybe it's more about the selection of cases that were not random. But I agree, the percentages are suspect.

  • This tool could really ease the load on the healthcare system, especially public health programs and emergency rooms.
  • The system was tested against 304 case studies from the New England Journal of Medicine

    Just to check, the training of the relevant "AI" system was audited to make sure it did not see these cases during training?

    Then again, if the doctors had a subscription to the NEJM, then including it in the training data would be fair.

  • ...doctors don't make diagnoses based solely on text, they examine the patient.
    After years of practice, doctors develop useful diagnostic instincts.
    Also, the "304 case studies from the New England Journal of Medicine" are probably incomplete, written quickly to satisfy legal requirements, and may omit key insights.

    There is a LOT more to medicine than just text

  • I'm not really interested in paying Wired to read it.

  • Sure, that will do it.

  • I bet there is a near 100% chance the LLMs had all of these medical journals and the exact cases in their training data... Give a doctor the same "advantage" (re-diagnosing old cases) and I bet he would perform better than 80%.

    This is just a contrived AI "test" from the company that is desperate to sell you copilot. Like p-hacking.

  • by ihadafivedigituid ( 8391795 ) on Monday June 30, 2025 @02:34PM (#65486656)
    My experience, going back to the 70s, is that I have to be super assertive to doctors because I've had too many life-threatening fuckups and other nonsense.

    Obvious shit like: my arm is broken and displaced, ER doc wanted to send me off with Tylenol and no X-ray. Or: yo-yo fever up to 104 degrees for three days, severe body pain etc etc ... doc says eh, you have some bug that's going around. I insist, doc takes chest x-ray showing large pneumonia spot deep in a lung. I could go on and on--I am only alive in spite of doctors, I swear.
  • Read the actual article: what they did was tune a system to get the best results possible on ~80 case studies, based on rules they devised for success.

    That is not, even a little, the same as having it evaluate patients.

    These articles are just exhausting. The tech is cool, but no, they don't have an 80% accuracy in diagnosing patients compared to a doctor's 20%.

  • according to company research published Monday.

    says MS funded study on how fantastic it performs

  • Given how long it takes our health system in the UK to diagnose anything, I can see how many people would be sceptical about going to an AI doctor, but many would give in out of sheer desperation.

    If you're not familiar with how anything outside of your GP works (which is pretty much everything), any specialist care requires a referral. Because of massive backlogs everywhere, your wait to be seen for the first time by a specialist is usually anywhere between 3 and 12 months. Then, after the first visit they

  • Is that saying AI is good or that Doctors are BAD?
  • And yet I still distrust the AI diagnosis versus the South Asian meat bag with the degree from University of Rwanda.

  • Testing against the "New England Journal of Medicine" is a very poor test, since the LLMs were likely trained on that data. It would be much more interesting to test on real diagnosed patients: have the LLMs and a doctor each make a diagnosis, then check which is correct.

    Testing an LLM against training data is not a good test for the real world.

  • It seems like this could lead to a big advantage for telemedicine and potentially be much cheaper for the customer. "getting to the diagnosis and getting to that diagnosis very cost effectively", I like that.

    "replicates the way human physicians diagnose disease—by analyzing symptoms, ordering tests, and performing further analysis until a diagnosis is reached"

    Some things obviously can't be properly seen via webcam and may not be a candidate for this but for many ailments it could work well, at least a

  • My father had a saying: when you hear hoofbeats, it's not zebras. You look for horses first.

    There is not one type of accuracy, but two: the chance of false positives and the chance of false negatives. Most of the time you care more about false positives (hey, this test says you have a deadly disease when you don't) than about false negatives (sorry, we failed to catch the fact that you have the disease).

    Example: Deadly disease is rare - only happens 4% of the time. Out of 1000 people, 40 people actually have
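For what it's worth, the base-rate arithmetic the comment starts can be completed under assumed test characteristics; the 95% sensitivity and 5% false-positive rate below are illustrative, not taken from the comment or the article.

```python
# Base-rate arithmetic for a rare disease. The prevalence matches the
# comment above; sensitivity and false-positive rate are assumed.
population = 1000
prevalence = 0.04           # 4% actually have the disease -> 40 people
sensitivity = 0.95          # assumed: test catches 95% of true cases
false_positive_rate = 0.05  # assumed: test flags 5% of healthy people

sick = population * prevalence                               # 40
true_positives = sick * sensitivity                          # 38
false_positives = (population - sick) * false_positive_rate  # 48
ppv = true_positives / (true_positives + false_positives)
print(f"Share of positive tests that are real: {ppv:.0%}")   # 44%
```

Even with a fairly good test, most of the positives come from the much larger healthy pool, so barely half the positive results are real, which is the point the comment is driving at.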

  • I have never had the same diagnosis from two doctors. I had a room of doctors completely disagree on a treatment: some said bone death was imminent without medicine, while others thought it would resolve itself. Medicine is definitely not an exact science and needs a serious change in attitude to start healing people instead of focusing on how to charge more money.
