Researchers Say AI Transcription Tool Used In Hospitals Invents Things (apnews.com) 33

Posted by BeauHD on Tuesday October 29, 2024 @09:00AM from the grave-consequences dept.

Longtime Slashdot reader AmiMoJo shares a report from the Associated Press: Tech behemoth OpenAI has touted its artificial intelligence-powered transcription tool Whisper as having near "human level robustness and accuracy." But Whisper has a major flaw: It is prone to making up chunks of text or even entire sentences, according to interviews with more than a dozen software engineers, developers and academic researchers. Those experts said some of the invented text -- known in the industry as hallucinations -- can include racial commentary, violent rhetoric and even imagined medical treatments. Experts said that such fabrications are problematic because Whisper is being used in a slew of industries worldwide to translate and transcribe interviews, generate text in popular consumer technologies and create subtitles for videos.

The full extent of the problem is difficult to discern, but researchers and engineers said they frequently have come across Whisper's hallucinations in their work. A University of Michigan researcher conducting a study of public meetings, for example, said he found hallucinations in eight out of every 10 audio transcriptions he inspected, before he started trying to improve the model. A machine learning engineer said he initially discovered hallucinations in about half of the over 100 hours of Whisper transcriptions he analyzed. A third developer said he found hallucinations in nearly every one of the 26,000 transcripts he created with Whisper. The problems persist even in well-recorded, short audio samples. A recent study by computer scientists uncovered 187 hallucinations in more than 13,000 clear audio snippets they examined. That trend would lead to tens of thousands of faulty transcriptions over millions of recordings, researchers said. Further reading: AI Tool Cuts Unexpected Deaths In Hospital By 26%, Canadian Study Finds

Researchers Say AI Transcription Tool Used In Hospitals Invents Things

This discussion has been archived. No new comments can be posted.

Load All Comments

Search 33 Comments Log In/Create an Account

Comments Filter:

Testing Methodology? (Score:5, Insightful)

by Drethon ( 1445051 ) writes: on Tuesday October 29, 2024 @09:15AM (#64902531)

So what testing methods did OpenAI use to ensure this product would meet the appropriate mean time between faults for a medical environment?

- Validation Methodolgy. (Score:3, Insightful)
  
  by geekmux ( 1040042 ) writes:
  
  So what testing methods did OpenAI use to ensure this product would meet the appropriate mean time between faults for a medical environment?
  What medical environment, accepted this pathetic bullshit after finding the first three reports full of imaginary medical “problems”?
  Fault the controlled environment that should have never accepted a PT Barnum grade attempt at selling enhancing snake oil.
Not news. (Score:5, Funny)

by msauve ( 701917 ) writes: on Tuesday October 29, 2024 @09:16AM (#64902539)

>AI Transcription Tool Used In Hospitals Invents Things

They've been using that AI in the billing department for years.

- setting a low bar (Score:2)
  
  by Thud457 ( 234763 ) writes:
  
  near "human level robustness and accuracy."
  That's damning with faint praise. Have you met some people?
  - Re: (Score:2)
    
    by bugs2squash ( 1132591 ) writes:
    
    I think they just badly punctuated "near-humans"
What, again? (Score:1)

by Anonymous Coward writes:

Clearly they're also posting dupes [slashdot.org].
- Re: (Score:2)
  
  by thrasher thetic ( 4566717 ) writes:
  
  Clearly they're also posting dupes [slashdot.org].
  - Re: (Score:2)
    
    by account_deleted ( 4530225 ) writes:
    
    Comment removed based on user account deletion
Not a dupe? (Score:4, Informative)

by billybob2001 ( 234675 ) writes: on Tuesday October 29, 2024 @09:19AM (#64902547)

This is not a dupe, it's a transcription of https://tech.slashdot.org/stor... [slashdot.org]

- Re: (Score:2)
  
  by Culture20 ( 968837 ) writes:
  
  This is not a dupe, it's a transcription of https://tech.slashdot.org/stor... [slashdot.org]
  Might be interesting to play a telephone game with these LLM transcription services. See what every new hallucination brings. Then perform a triple modular redundancy transcription and see if that can succeed without error.
A possible mitigation (Score:1)

by Anonymous Coward writes:

Privacy rises above the other considerations. While the problem exists, it can't be solved by keeping the original audio.
Traditional TTS can be used to provide a transcription with more obvious failures. What they should do is produce both the LLM and TTS transcriptions, and then compare the two and highlight differences so that they can be made known.
At that point we won't have solved the issue, but we will be one step closer to the solution, and we will know where likely errors in the transcription proces
- Re: (Score:2)
  
  by viperidaenz ( 2515578 ) writes:
  
  TTS is Text To Speech, the opposite of dictation/speech-to-text
Readers Say AI Editors Used In /. Reposts Things (Score:4, Informative)

by TigerPlish ( 174064 ) writes: on Tuesday October 29, 2024 @09:22AM (#64902561)

Seriously, it hasn't been even 3 days.

Dupe Dupe Dupe (Score:3)

by JustAnotherOldGuy ( 4145623 ) writes: on Tuesday October 29, 2024 @09:24AM (#64902567) Journal

It's like deja vu all over again....
https://tech.slashdot.org/stor... [slashdot.org]

You don't say? (Score:2)

by CEC-P ( 10248912 ) writes:

We tested Copilot for some reason. It mishears one word and goes off to some weird places in the transcription of meetings, which is typically longer than the meeting itself. It has no idea what's important or what we're talking about. It pretty much just makes every sentence a bullet point then invents a bunch of BS we didn't even say.
medical people are captive of asinine procedures (Score:2)

by Big Hairy Gorilla ( 9839972 ) writes:

Medical people are well educated idiots. They spend all their time on ipads and computer screens trying to figure out which button to press or which field fo fill in.
Doctors don't think, they just follow procedures now.

The medical administrators have "MBA'd" the operation: They outsource their brain and all operational functions to these all-in-one corporate systems, and when they get cryptojacked, all hospital staff are slack jawed wondering what to do.

Best advice is don't get sick, because your needs are
- Re: (Score:2)
  
  by nightflameauto ( 6607976 ) writes:
  
  Most doctors and nurses aren't all that happy about this situation either. The last time my mom went in for a surgery was a day that the board was going to walk through the hospital and investigate procedure. The doctors were absolutely puckered. Every little thing had to be perfect, or else. It was ridiculous, seeing a hospital run like any other big business. It wasn't about taking care of the people that day. It was about presenting well for the board. We're well past the point where patient care takes p
  - Re: (Score:2)
    
    by Bongo ( 13261 ) writes:
    
    It's been said for a long time that machine-like thinking drains us of our intuition and other intelligences, especially the ones which are more in touch with contextual realities. Many things which are in essence good, like DEI movements, are done in a machine-like, blind, robocop "put down the weapon", self defeating way, because people aren't allowed to express intuitive contextual perceptions.
    - Re: (Score:2)
      
      by Big Hairy Gorilla ( 9839972 ) writes:
      
      for sure, I'm saying " the smarter the tool, the dumber the operator".. feel free to quote me...
      but it seems clear you're correct that people don't have intuition or creativity.. it's a type of learned helplessness...
      a guy at my health club says his kids have no idea where they live, you couldn't give them directions, because they rely on google maps for instance.
      Smart tools make you dumber... and create a dependency.
      That's why I sometimes joke I program with sticks and stones ... I only use bluefish and na
  - Re: (Score:2)
    
    by Big Hairy Gorilla ( 9839972 ) writes:
    
    I think you are quite right about the profit motive overriding all other reasonable concerns. I can also accept that the front line health care workers in general are dismayed at having to work within these systems. We all are trapped in someone else's maze, not saying I'm exempt.
    
    But since when did outsourcing your whole operation make sense? Not much need for management when someone else is doing all the thinking. What these educated morons need to see is the downsides of the monoculture. They are supposed
    - Re: (Score:2)
      
      by nightflameauto ( 6607976 ) writes:
      
      MBAs will be the death of us all. Some of us more quickly than others. They'll kill the world for one more quarter if increased profits if we don't curb-stomp them out of our system. But apparently we're stuck in full worship mode when it comes to the worthless bastards. Profit above all. Greed is God.
      - Re: (Score:2)
        
        by Big Hairy Gorilla ( 9839972 ) writes:
        
        ha ha! That's inflation for ya...
        the saying use to be "they'll kill you for a dime" :-)
Corporate BS Generator (Score:2)

by devslash0 ( 4203435 ) writes:

What did we expect? When the sound recording quality drops, the model just wants to continue going with the usual corporate BS narrative because that's what it was trained on/for.
duplicate news (Score:1)

by Scythal ( 1488949 ) writes:

It's not interesting, and it has already been posted. https://tech.slashdot.org/stor... [slashdot.org]
Malfunction (Score:2)

by StormReaver ( 59959 ) writes:

LLMs do not hallucinate, as that is something that requires some kind of intelligence. LLMs malfunction, which is what this is doing.
- Re: (Score:2)
  
  by account_deleted ( 4530225 ) writes:
  
  Comment removed based on user account deletion
I prefer hallucinations vs doctor’s handwrit (Score:1)

by denisko ( 5946738 ) writes:

Manual inputs will also be prone to mistakes, especially since time per patient has been shrinking constantly with no improvement in sight. Freeing up medics’ attention to do other things might help correct such mistakes. Medical services is at least 50% bureaucracy, any help there will do miracles.
Re: (Score:2)

by account_deleted ( 4530225 ) writes:

Comment removed based on user account deletion
Researchers = people who want attention (Score:2)

by WaffleMonster ( 969671 ) writes:

I still can't get over this exert:
"Researchers aren't certain why Whisper and similar tools hallucinate, but software developers said the fabrications tend to occur amid pauses, background sounds or music playing."
Did these "researchers" just ignore confidence scores and turn up the temperature of the model to 11? It is after all one of those articles cheerleading regulation. I'm sure that will lead to perfect STT.
"The prevalence of such hallucinations has led experts, advocates and former OpenAI employee
The "AI randomly starts to bull**** problem" (Score:2)

by 93 Escort Wagon ( 326346 ) writes:

[ ] Solved [X] Not Solved

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

Researchers Say AI Transcription Tool Used In Hospitals Invents Things (apnews.com) 33

Researchers Say AI Transcription Tool Used In Hospitals Invents Things More Login

Researchers Say AI Transcription Tool Used In Hospitals Invents Things

Testing Methodology? (Score:5, Insightful)

Validation Methodolgy. (Score:3, Insightful)

Not news. (Score:5, Funny)

setting a low bar (Score:2)

Re: (Score:2)

What, again? (Score:1)

Re: (Score:2)

Re: (Score:2)

Not a dupe? (Score:4, Informative)

Re: (Score:2)

A possible mitigation (Score:1)

Re: (Score:2)

Readers Say AI Editors Used In /. Reposts Things (Score:4, Informative)

Dupe Dupe Dupe (Score:3)

You don't say? (Score:2)

medical people are captive of asinine procedures (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Corporate BS Generator (Score:2)

duplicate news (Score:1)

Malfunction (Score:2)

Re: (Score:2)

I prefer hallucinations vs doctor’s handwrit (Score:1)

Re: (Score:2)

Researchers = people who want attention (Score:2)

The "AI randomly starts to bull**** problem" (Score:2)

Related Links Top of the: day, week, month.

Slashdot Top Deals

Slashdot