AI Science

Research Summaries Written By AI Fool Scientists (scientificamerican.com) 59

An anonymous reader quotes a report from Scientific American: An artificial-intelligence (AI) chatbot can write such convincing fake research-paper abstracts that scientists are often unable to spot them, according to a preprint posted on the bioRxiv server in late December. "I am very worried," says Sandra Wachter, who studies technology and regulation at the University of Oxford, UK, and was not involved in the research. "If we're now in a situation where the experts are not able to determine what's true or not, we lose the middleman that we desperately need to guide us through complicated topics," she adds. Researchers are divided over the implications for science. The chatbot, ChatGPT, creates realistic and intelligent-sounding text in response to user prompts. It is a 'large language model', a system based on neural networks that learn to perform a task by digesting huge amounts of existing human-generated text. Software company OpenAI, based in San Francisco, California, released the tool on November 30, and it is free to use.

Since its release, researchers have been grappling with the ethical issues surrounding its use, because much of its output can be difficult to distinguish from human-written text. Scientists have published a preprint and an editorial written by ChatGPT. Now, a group led by Catherine Gao at Northwestern University in Chicago, Illinois, has used ChatGPT to generate artificial research-paper abstracts to test whether scientists can spot them. The researchers asked the chatbot to write 50 medical-research abstracts based on a selection published in JAMA, The New England Journal of Medicine, The BMJ, The Lancet and Nature Medicine. They then compared these with the original abstracts by running them through a plagiarism detector and an AI-output detector, and they asked a group of medical researchers to spot the fabricated abstracts.

The ChatGPT-generated abstracts sailed through the plagiarism checker: the median originality score was 100%, which indicates that no plagiarism was detected. The AI-output detector spotted 66% of the generated abstracts. But the human reviewers didn't do much better: they correctly identified only 68% of the generated abstracts and 86% of the genuine abstracts. They incorrectly identified 32% of the generated abstracts as being real and 14% of the genuine abstracts as being generated. Wachter says that, if scientists can't determine whether research is true, there could be "dire consequences". As well as being problematic for researchers, who could be pulled down flawed routes of investigation because the research they are reading has been fabricated, there are "implications for society at large because scientific research plays such a huge role in our society". For example, it could mean that research-informed policy decisions are incorrect, she adds.
On the contrary, Arvind Narayanan, a computer scientist at Princeton University in New Jersey, says: "It is unlikely that any serious scientist will use ChatGPT to generate abstracts." He adds that whether generated abstracts can be detected is "irrelevant."

"The question is whether the tool can generate an abstract that is accurate and compelling. It can't, and so the upside of using ChatGPT is minuscule, and the downside is significant," he says.

Comments Filter:
  • by AmazingRuss ( 555076 ) on Friday January 13, 2023 @10:35PM (#63207446)
    ... adequate levels of bullshit. Now supercomputers can make it 24/7! We don't have to bullshit anymore.
    • Trump election stolen / Moon made of green cheese / flat earth / affordable housing, and junk food causes weight loss - I can't wait to try out hard questions. I have tabloids from 1980 saying that aliens have landed; I guess that's why newspaper circulation went down. Undoubtedly some politicians' 'pitches' will be ridiculed.
  • Bad writing (Score:5, Interesting)

    by Retired Chemist ( 5039029 ) on Friday January 13, 2023 @10:36PM (#63207450)
    This is partly a result of how bad most humans are at writing. Scientific papers seem to be intentionally written in the most obscure style possible. Most scientific papers are frankly gibberish to anyone outside the particular field of study, and sometimes to those in it. I cannot begin to count the number of times I have read a paper in my field and had no idea whether the work was of any value, and that is counting only the papers written in English by native speakers.
    • by Anonymous Coward
      A combination of gatekeeping and chest-beating. Because "waste management technician" sounds a lot better than 'garbage man' just as 'ornithological genetic survey specialist' sounds a lot better than 'bird masturbator'
      • Re:Bad writing (Score:4, Interesting)

        by test321 ( 8891681 ) on Friday January 13, 2023 @11:59PM (#63207538)

        A combination of gatekeeping and chest-beating. Because "waste management technician" sounds a lot better than 'garbage man'

        The real goal is to use accurate, factual, non-emotional wording. If I had to write about a study on, say, rats and disease spread in the city based on interviews with night workers and their practices, I'd write "solid waste truck operator". This is actually how these jobs are advertised, and it describes them very well. I would also avoid the word "prostitute" (it has become emotionally loaded, therefore an absolute no-go for a science discussion), and also "sex worker" (if it is not a legally recognized concept in the area in question), but rather "people who use paid sex as a source of income". It's not about making it sound better, but about keeping it plain, factual, descriptive and unemotional.

        I don't see it as gatekeeping, because there is no intention of preventing understanding. Scientific publication was simply never intended for laymen. It's a recent side effect of the global internet that people can now read scientific publications and have come to expect to be able to understand them. But these papers are not written for them. When a big paper is published, universities ask the researchers to also provide a lay summary or an interview that is distributed to sites like phys.org.

        Complaining that scientific papers are difficult to read is like complaining that aeroplane controls are not easy to understand from the passenger's point of view. (sarcasm:) Pilots are gatekeeping, they make the readings difficult to understand so we don't challenge their authority. While in reality the readings were just made to be efficient in the context of flying aircraft. There is no deliberate obfuscation; making them intelligible to laymen was simply never considered, and would be counterproductive.

        • Yes, papers need to be as neutral & scientific as possible. The bad writing that springs to mind for me is the papers that use overly vague & jargon-loaded rhetoric to obfuscate what Daniel Willingham calls "Bubba wisdom," i.e. platitudes & stuff your grandma might say paraphrased into fancy overly elaborate pseudo-sciency jargon. There's a lot of it around. In such cases, ChatGPT is actually quite good at converting the elaborate jargon into plain language.
    • Re:Bad writing (Score:4, Interesting)

      by test321 ( 8891681 ) on Friday January 13, 2023 @11:13PM (#63207482)

      This is partly a result of how bad most humans are at writing.

      Also, partly the result of how poorly natural languages express logical statements. Accuracy in logical statements leads to complicated formulations. There have been debates regarding the COVID vaccine that could be traced to ordinary people genuinely not understanding what scientists had written, even though it was the most accurate formulation possible. The problem is made worse by most scientific papers being written in English, a language known for favouring compactness at the expense of accuracy. Idiomatic English is terribly ambiguous compared to any of the Latin-based languages. Scientific papers, and even more so patents, need to be accurate, leading to unnatural formulations that regular people find obscure.

      • I know you're trying to get a point thru but I just can't see it. Maybe if you rephrase it as a metaphor?
        • Thanks for the opportunity to clarify. I'll try bullet points. OP said that some scientific texts are indistinguishable from AI gibberish because humans are bad at writing (in their own language, even native speakers).

          I venture another possible explanation for the same observation, following this rationale:
          * conclusions of scientific articles often require complex logical connections;
          * natural languages are not well equipped to express these complex logical connections;
          * natural languages favour constru

          • I think the exception is Linguistics... those folks really write in a manner that exacerbates the language's ambiguity. It's almost like reading a bible, you either believe in it or not.
    • It's like that because vernacular languages are filled with biases, emotionally laden words, multiple possible interpretations, colloquialisms, and god knows what else.

      Subtract the cruft and you get difficult, stilted prose that, while problematic to read, is more solid when dissected.

      Demanding that it read easier is demanding it be dumbed down.

      • by HiThere ( 15173 )

        Sorry, but no. SOMETIMES one can't write things that are at once accurate, objective, and intelligible. Usually one can. But doing that is not easy, and people aren't taught how to do it. (And it will be different depending on exactly what one is being precise about. E.g. it's POSSIBLE to write clearly, accurately, and precisely about tensor calculus, but just try to find a text. For one thing, it would likely triple the length of the book, perhaps more. I've occasionally translated individual paragraphs fo

    • Scientific papers seem to be intentionally written in the most obscure style possible.

      No, they are written in a technical style using specialized vocabulary that allows them to communicate complex concepts in a concise and accurate manner. Yes, the use of this technical and specialized vocabulary means that they can be hard to interpret for a random member of the public but they are not written for public consumption but rather for other researchers in the field.

      That's also likely what will trip up ChatGPT since it probably has not got a training database deep enough to allow it to use t

    • If you mean social sciences papers, I agree that a lot of authors do obfuscate weak ideas & poor reasoning with overly complex language.

      But it doesn't matter if a paper was written with AI, typewriter, or a quill. This is science; it's the content of the writing that matters. If AI can write easier to understand abstracts & point summaries of researchers' work, where's the problem?
      • by HiThere ( 15173 )

        The problem is that ChatGPT isn't accurate. It invents sources, it invents data.

        If ChatGPT were accurate and honest, it would be a great application for turning lab notes into both research papers and popularizations. But it's not.

        • Yeah, it'll invent stuff if you tell it to/don't tell it not to. If you determine the info to be reported, it sticks to that. You just have to design the prompts & give the context, genre, info, etc.. Why are some people so singularly focused on only using ChatGPT for cheating/making stuff up?
    • Interesting, I am in neuroscience and I rarely run into that situation, and when I do, I know that the authors are not good scientists. Obviously you can't write highly technical papers for a general, non-science audience, but I find most science articles to be relatively easy to read and comprehend. There is of course the situation where English is not the first language of the authors, so the writing is awkward, but usually still quite understandable. I rarely see articles in good journals that are writte

  • by gweihir ( 88907 ) on Friday January 13, 2023 @10:40PM (#63207456)

    Abstracts do not contain proofs, full or longer argumentations, reference works, etc. Abstracts merely serve as a sort of extended keyword list so people can find out fast whether they want to invest time in a paper or not. Hence recognizing from an abstract whether something is real or fake is pretty much impossible, unless you are deep enough into a field that you can spot claims that do not make sense or are very unrealistic. Even that does not make the paper fake; many low-quality publications do this as well. The things you see as a paper reviewer are often quite horrible, and some of them look like nobody with an actual clue was involved anywhere in the writing. Incidentally, anybody competent will want to write the abstract themselves, because why invest weeks or months into writing a paper only to have it end up with a bad abstract, when writing a good one yourself would only have cost a few hours?

    So this basically means nothing. Just more really low-hanging fruit used to bolster a fake narrative about what Artificial Ignorance supposedly can do. It cannot. It can fake some things, but they are about as good as other fakes. They are not the real thing.

    • It's actually troubling that the person quoted in the summary thinks this is problematic, because it means they think the conclusions of a paper should be trusted depending on how well-written the abstract is. That's not how genuine science works at all.
      • by gweihir ( 88907 )

        I completely agree with that.

      • by SETY ( 46845 )

        Agreed.
        I would add that if you publish a paper in a reputable journal, no one in the field knows who you are, and your paper is “important”, it will be scrutinized. ChatGPT doesn’t change this.

    • Yup. This is the equivalent of "... in mice" medical research, it's low-hanging fruit that avoids addressing any of the difficult issues. Every abstract ever is basically "this is a problem we've invented...uh, identified, we claim no-one else has solved it even if they have, our solution is X, it works really well (in mice)". OK, that's a bit cynical but the less cynical version is pretty much every academic paper abstract ever. Getting ChatGPT to spit this stuff out isn't a big deal.
  • by davide marney ( 231845 ) on Friday January 13, 2023 @10:41PM (#63207458) Journal

    AI generates nonsense. It's high-sounding nonsense that looks good, but isn't good. ChatGPT, especially, just assembles statements from all over, with the end result that it looks like it's just making stuff up. In reality, we're "making stuff up" and ChatGPT is just monkey-see, monkey-do copying it.

    The solution is to recreate the experiment, which is what scientists are supposed to do anyway. Not just "peer review" it, but replicate it. That's something no AI will be able to do.

    But even if it happened to produce something actually verifiable, then all the better. Again, it's not authoring, it's mining.

    • The solution is to recreate the experiment, which is what scientists are supposed to do anyway. Not just "peer review" it, but replicate it. That's something no AI will be able to do.

      What experiment? This is a threat to authors of such seminal papers as Emo-Cognitive Explorations of White Neurosis and Racial Cray-Cray [semanticscholar.org]. It used to be that making up papers required some effort [wikipedia.org] but now anyone can be a brilliant scholar like Ibram X. Kendi.

      Troubling times.

      • by xevioso ( 598654 )

        On the contrary, authors like the pre-eminent Doug Zonker, who has written a seminal paper on the lowly chicken, have nothing to fear from AI authors.

        Behold:
        https://isotropic.org/papers/c... [isotropic.org]

      • This kind of thing isn't at all new, and isn't confined to academentia. As an example, look at Naked Came the Stranger [wikipedia.org], published in 1969. I'm sure that there are other, earlier examples, but that's the only one I can think of at the moment.
    • For many types of science, that isn't easy. The experiments at CERN / ATLAS / CMS require billions of $ of hardware and a decade of beam time. Similar for astrophysics experiments. Even drug trials can require years and very large costs.


      that isn't happening yet, but the potential is there
  • "Scientists unable to tell previously published abstracts from AI abstracts."
    Yes, that's what a large language model like GPT3 was built for. That's what it's supposed to do, get so good at predicting text it can create texts indistinguishable from the ones it was trained on.

    This is like being shocked at www.thispersondoesnotexist.com
  • "An anonymous reader" posted the article? You can't fool me SlashDot. That really was from a bot ... wasn't it?

  • I've been posting as a robot for decades! Dark Energy!

    --
    Sensual pleasures have the fleeting brilliance of a comet; a happy marriage has the tranquillity of a lovely sunset. - Ann Landers

  • by xevioso ( 598654 ) on Friday January 13, 2023 @11:41PM (#63207524)

    ""The question is whether the tool can generate an abstract that is accurate and compelling. It can't, and so the upside of using ChatGPT is minuscule, and the downside is significant," he says."

    Well, it can't yet. But this is an early iteration of the tool. And it will just continue to be iterated upon, and *that* is what should worry scientists.

    • ""The question is whether the tool can generate an abstract that is accurate and compelling.

      Well, it can't yet. But this is an early iteration of the tool. And it will just continue to be iterated upon, and *that* is what should worry scientists.

      Why should scientists be worried by this? If the AI can generate a compelling summary more expressive and accurate than what the authors could write by themselves, all the better.

      What matters is that the claims made in the paper are true, and for that, the paper needs peer review and replication, anyway; so using AI for copywriting creates no problem on its own.

      • by HiThere ( 15173 )

        They should worry because it fabulates. It really can't do anything else. If it could take lab notes as input and turn them into a paper, it would be useful. If it could take those same notes and also write a popular version, it would be extremely useful. But it can't.

        When it can, then it will have a useful place in society, even though that's still a long way from being a "true AI", much less an AGI.

        • If you're publishing under your name a text written by an AI without reviewing it first, of course you should be worried; same as if you published what an unreliable intern wrote for you.

          Doesn't mean that delegating part of the work either to an AI or an intern is a bad idea, as long as you verify what was written. The problem is created by your sloppiness and carelessness, not the tool.

  • Scientists could go down rabbit holes of fake AI-generated papers?

    How about scientists going down rabbit holes of fake p-hacked human-generated papers? My wife wasted a third of her time in grad school chasing down something that turned out to be not reproducible. No robots involved, just desperate academics publishing anything that pops half a hair above a deliberately depressed noise floor. She also had a hell of a time checking the "I've published" checkbox with her null result because, surprise surprise
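    A minimal simulation of the noise-floor effect described above, assuming Python with numpy and scipy (the group sizes and number of experiments are arbitrary): even with no real effect anywhere, roughly 5% of tests clear p < 0.05 by chance, which is exactly the kind of result a desperate lab can publish.

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(0)
      n_experiments = 100  # hypothetical number of null experiments
      false_hits = 0
      for _ in range(n_experiments):
          a = rng.normal(size=30)  # two groups drawn from the same distribution,
          b = rng.normal(size=30)  # so any "effect" found is pure noise
          _, p = stats.ttest_ind(a, b)
          if p < 0.05:
              false_hits += 1
      print(f"{false_hits} of {n_experiments} null experiments look 'significant'")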

    • by HiThere ( 15173 )

      AIs won't be free of biases. I believe that's theoretically impossible, though I wouldn't know how to prove it. AIs are based around heuristics, and all heuristics are biased. They are biased in a way that is deemed to more frequently produce the desired result, but that's still bias. And that is the precise bias that you are describing. Journals aim to optimize their "status to publish in", and what you describe is a necessary result of aiming for that goal.

  • Currently, the output uses human-generated language models. So they reflect our biases. What happens when we get to the point - and it's probably just a few years away - when most of the content online is "AI" generated? And used to generate further LLMs?

    My conclusion: everybody will return to /.

    • Most of the content online is already generated by bots and such. But generated content isn't high quality content.

      If you currently have a business model that depends on other people generating content for you (i.e., 80% of the Internet), then your value is going to take a nose dive as these types of things get better.

  • by dixonpete ( 1267776 ) on Saturday January 14, 2023 @12:56AM (#63207590)
    When I ask questions now, invariably at least several answers will have been generated by ChatGPT. How do I know? Simple. I put my question into ChatGPT before posting on Quora. There's some variation, but the first lines of ChatGPT answers tend to be identical no matter how many times you regenerate.
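    A minimal sketch of that regenerate-and-compare check, assuming the openai Python package (v1.x) with an API key in the environment; the model name is an assumption, and the interface available when this story ran was different:

      from openai import OpenAI

      client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

      def first_line(text):
          return text.strip().splitlines()[0]

      question = "Why is the sky blue?"  # hypothetical Quora-style question
      openings = []
      for _ in range(5):
          resp = client.chat.completions.create(
              model="gpt-3.5-turbo",  # assumed model name
              messages=[{"role": "user", "content": question}],
          )
          openings.append(first_line(resp.choices[0].message.content))

      # If the regenerated openings keep repeating, a matching opening in a posted
      # answer is a hint (not proof) that the answer was generated.
      for opening in openings:
          print(opening)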
    • by quenda ( 644621 )

      There's some variation but the first lines of ChatGPT answers tend to be identical no matter how many times you regenerate.

      I am not sure about the specific numbers, but it is possible that some answers on Quora are generated by models like ChatGPT. Quora is a platform where users can ask and answer questions, and it is possible that some of the answers come from AI-based systems like language models, which can generate human-like text. However, it's important to note that Quora is also heavily moderated by human editors, so not all answers are generated by AI.

    • They won't tell us what bulk text ChatGPT was trained on, so it might well have been trained on Quora.

    • When I ask questions now invariably at least several answers will have been generated by ChatGPT.

      Right now, ChatGPT is free. Wait until they charge what it costs them to train and run the system and this particular problem will go away all by itself.

  • There is a risk here that Chatbot will be abused; the question is whether existing checks can prevent a serious problem. In theory the authors of a paper and their institutions should be guaranteeing some degree of accountability. Sadly the pressure to publish and a breakdown of meaningful oversight means that this is not anything like as sure as it should be. Part of the solution to the present rash of dubious material therefore lies in making the authors' institutions suffer a clear penalty for their asso

    • by HiThere ( 15173 )

      That's not a risk, that's a certainty, because it's already happening. There's a question as to how serious that problem will become.
      In the particular domain referenced by the article this may not be happening yet, but one can be certain that it will be. Whether it will become a serious problem in that domain isn't clear.

        It's also not clear in what way ChatGPT will be evolved. It would be nice if it could stop fabulating while still being able to generate fiction, but that may be impossible wit

  • It is unlikely that any serious scientist will use ChatGPT to generate abstracts.

    It is however extremely likely that those with a particular axe to grind will use it to generate large amounts of "academic research" to quote from in order to promote and support their conspiracy theories -- from political posturing through to anti-vax, chem-trails, 5G ...

    It's getting to the point where people are being swamped with "information" of varying quality and are becoming less discriminating in assessing it - preferring to go with gut

  • by Anonymous Coward

    The fact that AI is doing it is pretty much irrelevant. It would be the exact same issue if you trained a bunch of kids to write scientific papers in exchange for sweets.

    • Ok, but if you instead paid them minimum wage or college credit, it would be perfectly legal. The vast majority of scientific research is published under the names of people who had nothing to do with the research. The industry widely accepts putting your name on someone else's work.

  • by namgge ( 777284 ) on Saturday January 14, 2023 @06:22AM (#63207906)
    The examples from ChatGPT I've seen contain poor science written about in good English. Scientific papers written by humans typically have good science written about in poor English. It's not difficult to tell the difference if you know anything about the field that's being written about.
  • ChatGPT is not running around writing abstracts on its own.

    It is used as a tool by actual people - some of those might be scientists. Like with any other tool, it is used to speed up work - and like with any other tool (say a lathe in a machining shop), if the output doesn't fulfill your quality requirements, you fine tune it yourself or generate a new output.
  • Journals shouldn't be publishing until the results have been reproduced by another team. It doesn't matter if the AI wrote the paper. If the results are reproducible, it's good science.

  • A person who has only read the abstract and accepts it unquestioningly without knowing any details about the methodology or data is not a scientist.

  • There is no reason why you should be able to determine whether research is correct based on reading the abstract.
