'ChatGPT Detector' Catches AI-Generated Papers With Unprecedented Accuracy (nature.com)

Posted by msmash on Monday November 06, 2023 @02:00PM from the closer-look dept.

A machine-learning tool can easily spot when chemistry papers are written using the chatbot ChatGPT, according to a study published on 6 November in Cell Reports Physical Science. From a report: The specialized classifier, which outperformed two existing artificial intelligence (AI) detectors, could help academic publishers to identify papers created by AI text generators. "Most of the field of text analysis wants a really general detector that will work on anything," says co-author Heather Desaire, a chemist at the University of Kansas in Lawrence. But by making a tool that focuses on a particular type of paper, "we were really going after accuracy."

Desaire and her colleagues first described their ChatGPT detector in June, when they applied it to Perspective articles from the journal Science. Using machine learning, the detector examines 20 features of writing style, including variation in sentence lengths, and the frequency of certain words and punctuation marks, to determine whether an academic scientist or ChatGPT wrote a piece of text. The findings show that "you could use a small set of features to get a high level of accuracy," Desaire says. The findings suggest that efforts to develop AI detectors could be boosted by tailoring software to specific types of writing, Desaire says. "If you can build something quickly and easily, then it's not that hard to build something for different domains."
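
To make the idea of "features of writing style" concrete, here is a minimal sketch of how a few such features could be computed from a piece of text. It is not the authors' code: the summary names only the categories (variation in sentence lengths, frequency of certain words and punctuation marks), so the word list, the punctuation list, the naive sentence splitter and the names `style_features`, `MARKER_WORDS` and `PUNCTUATION` are all assumptions chosen for illustration.

```python
import re
import statistics

# Hypothetical lists, chosen for illustration only; the summary does not
# say which words or punctuation marks the real detector uses.
MARKER_WORDS = ("however", "although", "because")
PUNCTUATION = (";", ":", "?", "(")


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def style_features(text):
    """Return a small dict of stylometric features for one piece of text."""
    sentences = split_sentences(text)
    lengths = [len(s.split()) for s in sentences]
    words = re.findall(r"[a-z']+", text.lower())
    n_words = max(len(words), 1)  # avoid division by zero on empty input

    features = {
        "n_sentences": len(sentences),
        "mean_sentence_len": statistics.mean(lengths) if lengths else 0.0,
        # Population standard deviation as a simple measure of variation.
        "stdev_sentence_len": statistics.pstdev(lengths) if lengths else 0.0,
    }
    for word in MARKER_WORDS:
        features[f"freq_{word}"] = words.count(word) / n_words
    for mark in PUNCTUATION:
        features[f"count_{mark}"] = text.count(mark)
    return features


if __name__ == "__main__":
    sample = "I ran it. However, the result (oddly) failed; why? Because it did."
    for name, value in style_features(sample).items():
        print(f"{name}: {value}")
```

A feature dictionary like this would then be fed to an ordinary classifier trained on labelled human and ChatGPT text; which classifier the authors used is not stated in the summary.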


Download any model you want... (Score:4)

by Rei ( 128717 ) writes: on Monday November 06, 2023 @02:02PM (#63984966) Homepage

... on HuggingFace and use that instead locally. There are thousands of them.
Training ChatGPT detectors will only deal with low-effort students. Heck, not even all, because there's a wide range of public-facing LLMs out there already.

- Re: (Score:3)
  
  by Tony Isaac ( 1301187 ) writes:
  
  The reason everyone is jumping on the ChatGPT bandwagon is that the quality of its training is vastly better than anybody else's. For evidence, compare answers from ChatGPT against Google Bard; you'll quickly see the difference. Sure, there might be all kinds of LLMs out there that you can run yourself, but just because it's an LLM doesn't mean it was trained well.

false gods (Score:2)

by groobly ( 6155920 ) writes:

Do not put your faith in false gods. "small set of features." Harrumph.
- Re: (Score:2)
  
  by account_deleted ( 4530225 ) writes:
  
  Comment removed based on user account deletion

False positives - 21% (Score:5, Informative)

by Dan East ( 318230 ) writes: on Monday November 06, 2023 @02:13PM (#63984992) Journal

According to the article referenced in the story, it misclassified 21% of human-generated paragraphs in one set. Thus it attributed roughly 1 out of every 5 paragraphs written by humans to AI. Out of 50 entire documents written by humans, it misclassified 6 of them, or 12% false positives. That's pretty bad given such a large corpus of text to analyze (an entire document).
With such massive false positives I really don't know what the point in this is, or how useful it can be.
https://www.sciencedirect.com/... [sciencedirect.com]
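
For anyone who wants to check the arithmetic, here is a small sketch. The 6-out-of-50 figure is the one quoted above; the rest is an illustration under a loud assumption, namely that every human-written document is flagged independently with the same probability, which a real detector will not satisfy exactly. The function names are made up for this example.

```python
def false_positive_rate(false_positives, total_human):
    """Share of human-written items that were wrongly flagged as AI."""
    return false_positives / total_human


def prob_at_least_one_false_flag(rate, n_documents):
    """Chance that at least one of n honest documents gets flagged.

    Assumes each document is flagged independently with the same rate.
    """
    return 1.0 - (1.0 - rate) ** n_documents


if __name__ == "__main__":
    doc_rate = false_positive_rate(6, 50)
    print(f"document-level false positive rate: {doc_rate:.0%}")

    # How quickly false accusations pile up over a batch of 100 honest papers.
    for rate in (0.01, 0.10, doc_rate):
        p = prob_at_least_one_false_flag(rate, 100)
        print(f"rate {rate:.0%}: P(at least one false flag in 100) = {p:.5f}")
```

The point of the second function is that even a rate that sounds small per document becomes a near certainty of at least one false accusation once enough honest papers pass through the tool.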

- Re:False positives - 21% (Score:5, Insightful)
  
  by HiThere ( 15173 ) writes: <charleshixsn.earthlink@net> on Monday November 06, 2023 @02:21PM (#63985016)
  
  And it's aiming at the wrong target. It should be targeting fraudulent papers. If they're accurate, what does it matter who wrote them?
  
  - Re: (Score:2)
    
    by jacks smirking reven ( 909048 ) writes:
    
    True. At first glance I thought this was about papers written by students, in which case it makes sense, since the point is a student's grasp of the material. But in the case of peer-reviewed scientific literature, I think it's good to disclose that AI was used, but as you said, the facts are the facts, and this could be a tool to help format, fill in, and extrapolate. At the end of the day it's the author(s)' reputation at stake when they sign off on the final product.
    - Re: (Score:2)
      
      by quantaman ( 517394 ) writes:
      
      True. At first glance I thought this was about papers written by students, in which case it makes sense, since the point is a student's grasp of the material. But in the case of peer-reviewed scientific literature, I think it's good to disclose that AI was used, but as you said, the facts are the facts, and this could be a tool to help format, fill in, and extrapolate. At the end of the day it's the author(s)' reputation at stake when they sign off on the final product.
      Yeah, this seems like an odd target.
      I suppose they're worried about journals getting flooded with AI-generated papers, but if your journal has even an outside chance of accepting such a paper, maybe your journal needs higher standards altogether (and what's the motive for submitting if it's always rejected?).
      And I think that scientific literature is one of the places where LLMs have a legitimate important use case. A lot of researchers don't speak English as a first language, and even for some who do you couldn't
  - Re: (Score:2)
    
    by account_deleted ( 4530225 ) writes:
    
    Comment removed based on user account deletion
    - Re: (Score:2)
      
      by narcc ( 412956 ) writes:
      
      That's more than a little optimistic. There are very real limits to what these things can do that no amount of training, regardless of quality, can overcome. Unrealistic expectations can be dangerous, as we've already seen.
      If you're improving at all, it's because you're being more deliberate about your code and comments. You're trying to use your code and comments to communicate an idea, which is absolutely the right attitude to have, though it's not one you often see.
      If that's what you need, that's fine
- Re: (Score:3)
  
  by Chelloveck ( 14643 ) writes:
  
  Exactly my thought as well. They have the criteria for this backwards. The question isn't, "Is this written by an AI?" The question is, "Is this written by a human?" Their own numbers say they're getting that wrong about 10% of the time.
  Why is that important? Because if you're using it to detect fraud it's going to sound the alarm 10% of the time, even when there's absolutely zero fraud. That's way too many false positives when you consider people could lose their careers or grades based on this outpu
  - Re: (Score:3)
    
    by ewibble ( 1655195 ) writes:
    
    It's even worse than that. Say only 1% of honest papers are picked up as fraud, but you write, say, 100 papers: your chances of being accused of cheating at least once are 63%. Or, put another way, out of every 100 honest papers submitted, there is a 63% chance you are going to falsely accuse someone.
    If it's 10%, your chances are 99.997% that you are going to be accused of cheating on one of your papers.
    Sure, you can make the consequences of "cheating" low, but then why bother at all? Submit your AI-generated paper and they s
  - Re: (Score:2)
    
    by narcc ( 412956 ) writes:
    
    Algorithms must halt in finite time.
- some things never change (Score:2)
  
  by hawk ( 1151 ) writes:
  
  I guess it's a couple of decades ago that the university I was at got excited and told us to use this wondrous new program to check for plagiarism.
  I tried it.
  Every single thing that it flagged was wrong.
  Most of what it flagged were properly done citations!
  In disbelief, I fed it a working paper of my own that had been on the web for several years, on multiple websites.
  It saw no problem . . .
  Moving to today, I see no reason to expect this type of check to work. Rather, it will start an arms race as we saw fou
  - Re: (Score:2)
    
    by Potor ( 658520 ) writes:
    
    Turnitin allows you to exclude citations. I find turnitin useful for detecting recycled papers, which I would miss. But I don't need it to detect plagiarism from professional sources. Those stick out like a sore thumb.
    LLMs provide a different challenge, but so far I have done alright in detecting them. The thing is, I can grade these as I want, without calling out the students. LLM papers are well written but with repetitive grammatical structures, start with a very clear thesis statement, and use the phras
    - Re: (Score:2)
      
      by Shadow of Eternity ( 795165 ) writes:
      
      If you find turnitin useful you're at absolutely minimum willfully and maliciously negligent towards your students. Their entire business model relies on manufacturing accused cheaters even where there are none in order to justify their continued expense and the paranoia of teachers. Every time it flagged one of my students' papers I would feed it something I made up on the spot and get flagged too.
      - Re: (Score:2)
        
        by Potor ( 658520 ) writes:
        
        If you find turnitin useful you're at absolutely minimum willfully and maliciously negligent towards your students. Their entire business model relies on manufacturing accused cheaters even where there are none in order to justify their continued expense and the paranoia of teachers. Every time it flagged one of my students' papers I would feed it something I made up on the spot and get flagged too.
        Did you see what I wrote? I use it primarily to catch students who recycle papers among one another.
        Also your anecdote is just that. If a student fails to use my essay template, and does not quote sources, Turnitin generally does not flag much if anything as plagiarism. I tell you that with years of experience.
        Also, if Turnitin flags something, you do know that it is within your agency to check if it is correct, and disregard it - non?
        Also, Turnitin accuses nobody of plagiarism. Instructors do, and hopeful
        
        - Re: (Score:2)
          
          by Shadow of Eternity ( 795165 ) writes:
          
          So not only are you willfully and maliciously negligent in the duty owed towards your students, you also willfully and maliciously feign ignorance about the nature of turnitin's business model and how it's used by teachers.
          Also, talk to one of your colleagues in the English department about the appropriate use of "also".
    - Re: (Score:2)
      
      by hawk ( 1151 ) writes:
      
      I *want* to say that it was turnitin, but this was twenty years ago . . .
      I don't think it had much, if anything, in the way of options. For that matter, I'm not sure it had *any*, and it may well have been in beta.
      And I *definitely* caught things that it didn't. Not (quite) as flagrant as a friend who found "elsewhere in this issue", but still.
      Generally, I tossed suspect phrases into google with quotes around them, and *bang!*
      - Re: (Score:2)
        
        by Potor ( 658520 ) writes:
        
        Generally, I tossed suspect phrases into google with quotes around them, and *bang!*
        My trick was to google sentences with well-used semicolons, a skill American undergrads in general have never mastered.
    - Re: (Score:2)
      
      by WaffleMonster ( 969671 ) writes:
      
      LLMs provide a different challenge, but so far I have done alright in detecting them. The thing is, I can grade these as I want, without calling out the students. LLM papers are well written but with repetitive grammatical structures, start with a very clear thesis statement, and use the phrase "delve into" quite a bit.
      Remember a few months ago reading LLM generated sci-fi stories and couldn't get over how much generic phrases akin to "unlike anything we have encountered" appeared. Then I re-watched the first few seasons of STNG.. Started getting on my nerves to see the same language used so often.
    - Textbook planned obsolescence treadmill (Score:2)
      
      by tepples ( 727027 ) writes:
      
      However, they are also superficial, and rarely quote the authors I require. And if they do, they don't quote the right editions.
      Who pays for the right editions? There's a long-term swindle in the United States textbook market in which publishers seek an excuse to publish a new edition of a nonfree textbook solely to deter resale of used copies of old editions.[1] The usual tactic to render old editions useless is to reformat the text a bit, so as to invalidate page numbers, and reorder the exercises.
      [1] Samuel T. Loch and Joshua D. Van Mater. "The Efficacy of Planned Obsolescence Strategies in the College Textbook Market" [unc.edu]. Universit
      - Re: (Score:2)
        
        by Potor ( 658520 ) writes:
        
        That's hilarious. You have no idea what I teach, and then imply that I am part of the textbook swindle. My 100 level courses pay about $15 USD for texts, and I provide a few for free. My up-level courses come to about $30-50 USD, depending on the course.
        But you hit on an interesting point: it's not the administrators and loan guarantees that have caused the price of university education to sky-rocket; no, it's those evil professors and their demand for standard and usable editions.
- Re: (Score:2)
  
  by suutar ( 1860506 ) writes:
  
  When I saw the headline, my first thought was that "unprecedented success" is a pretty low bar in this field.
- Re: (Score:2)
  
  by Shadow of Eternity ( 795165 ) writes:
  
  That's a feature, not a bug. This and other frauds like "turnitin" both fundamentally rely on the boogeyman always existing. If you don't catch enough cheaters you need to manufacture them.
- Re: (Score:2)
  
  by stikves ( 127823 ) writes:
  
  What is worse?
  It is very likely to punish better works. The GPT training depended on high quality writing, and admittedly it shows in the output. But who else writes high quality English with proper grammar and a large vocabulary? Those who are actually good in their field.
  Those who write sloppily, and make a lot of mistakes will be spared.
  And those otherwise A+ students will have to take on legal fights to prove innocence.
  Just to be complete, the GPT "conversion" of my own writing (fortunately I won't be mist

Spot the Bot (Score:2)

by Press2ToContinue ( 2424598 ) writes:

Good thing they didn't test it on code comments or half the open-source community would be flagged. Now, if only we had a detector for spotting genuine human errors in academic papers.

Training against an adversary. (Score:2)

by MIPSPro ( 10156657 ) writes:

Isn't that how many of the LLMs are trained in the first place? They use an "adversary" [wikipedia.org] to spot errors or point out inconsistencies. The AI then has to regenerate the response to pass muster.
- Re: (Score:2)
  
  by canajin56 ( 660655 ) writes:
  
  No, as far as I know none of them are. The big players are all Transformer models of some variety. Now I suppose you could have negative examples where during training you flip the signs during backpropagation (you're trying to make the tokens less probable rather than more) but I don't know that that would work. You'd be making everything about it less likely, including grammatical correctness etc. Now what you're talking about with regeneration, that might work. If you have a discriminator that can say "T

ML detecting AI ?!? (Score:3)

by LordHighExecutioner ( 4245243 ) writes: on Monday November 06, 2023 @03:22PM (#63985170)

"When a shepherd goes to kill a wolf, and takes his dog to see the sport, he should take care to avoid mistakes. The dog has certain relationships to the wolf the shepherd may have forgotten." R. M. Pirsig, Zen and the Art of Motorcycle Maintenance

Write an AI to defeat the AI detector (Score:2)

by RogueWarrior65 ( 678876 ) writes:

That's going to be the next thing: an AI that will alter your paper so it passes the AI detector AI.
AIIIIIEEEEEEE!!!

Useless (Score:2)

by Bahbus ( 1180627 ) writes:

It doesn't matter if you can detect AI generated text 100% of the time if you also falsely catch ANY completely human text. Doesn't matter if they get it down to 1% false positives - any false positive makes this tool 100% unreliable. Not to mention, who cares if an AI wrote it as long as it is accurate? Because we have to give some specific person credit? Fuck them. Fuck their credit. Fuck people's entitledness. If it's true, accurate, and real, then move the fuck on.

I feel sorry for all the students (Score:2)

by WaffleMonster ( 969671 ) writes:

Accused of all manner of cheating by fundamentally flawed snake oil.

So not much better than educated guessing... (Score:2)

by JasterBobaMereel ( 1102861 ) writes:

False positive rate is all you need to pay attention to. It indicates how many people will be falsely accused, with their paper scored at 0% and their qualification refused, when they did nothing wrong...

Arms Race (Score:1)

by ddavisie ( 8163550 ) writes:

What happens when ChatGPT is trained against detectors to avoid detection?
