Science

How Big is Science's Fake-Paper Problem?

The scientific literature is polluted with fake manuscripts churned out by paper mills -- businesses that sell bogus work and authorships to researchers who need journal publications for their CVs. But just how large is this paper-mill problem? From a report: An unpublished analysis shared with Nature suggests that over the past two decades, more than 400,000 research articles have been published that show strong textual similarities to known studies produced by paper mills. Around 70,000 of these were published last year alone. The analysis estimates that 1.5-2% of all scientific papers published in 2022 closely resemble paper-mill works. Among biology and medicine papers, the rate rises to 3%.

Without individual investigations, it is impossible to know whether all of these papers are in fact products of paper mills. But the proportion -- a few per cent -- is a reasonable conservative estimate, says Adam Day, director of scholarly data-services company Clear Skies in London, who conducted the analysis using machine-learning software he developed called the Papermill Alarm. In September, a cross-publisher initiative called the STM Integrity Hub, which aims to help publishers combat fraudulent science, licensed a version of Day's software for its set of tools to detect potentially fabricated manuscripts.

Paper-mill studies are produced in large batches at speed, and they often follow specific templates, with the occasional word or image swapped. Day set his software to analyse the titles and abstracts of more than 48 million papers published since 2000, as listed in OpenAlex, a giant open index of research papers that launched last year, and to flag manuscripts with text that very closely matched known paper-mill works. These include both retracted articles and suspected paper-mill products spotted by research-integrity sleuths such as Elisabeth Bik, in California, and David Bimler (also known by the pseudonym Smut Clyde), in New Zealand.
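The summary does not say how the Papermill Alarm compares texts, only that it flags titles and abstracts that very closely match known paper-mill works. As a rough illustration of that general idea, and not of Day's actual software, here is a minimal sketch that assumes a simple word-shingle Jaccard similarity. The function names, the shingle size `k`, and the `threshold` value are all hypothetical choices made for this example.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word runs) in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        # Text too short for a full shingle: use all words as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def flag_similar(candidates, known_mill_texts, threshold=0.5):
    """Return (index, best_score) for candidates that closely match any known text.

    The threshold is an arbitrary illustrative value, not one taken from the report.
    """
    known = [shingles(t) for t in known_mill_texts]
    flagged = []
    for idx, text in enumerate(candidates):
        s = shingles(text)
        best = max((jaccard(s, k) for k in known), default=0.0)
        if best >= threshold:
            flagged.append((idx, best))
    return flagged


# Invented example titles: a template with one word swapped, plus an unrelated title.
known = ["Long noncoding RNA ABC promotes proliferation and migration "
         "in gastric cancer cells via the XYZ pathway"]
papers = [
    "Long noncoding RNA ABC promotes proliferation and migration "
    "in lung cancer cells via the XYZ pathway",
    "A survey of superconductivity measurements at high pressure",
]
flagged = flag_similar(papers, known)
print([idx for idx, _ in flagged])
```

A templated title with one word swapped still shares most of its shingles with the original, which is why this kind of measure picks it up. As the summary notes, a match is only a flag: without individual investigation it does not show that a paper came from a paper mill.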
This discussion has been archived. No new comments can be posted.

  • Doesn't help (Score:5, Informative)

    by backslashdot ( 95548 ) on Wednesday November 08, 2023 @12:51PM (#63990181)

    It doesn't help that top science journals repeatedly accept papers from known frauds, who then raise millions of dollars based on it. For example they retracted false superconductivity claims from the same author multiple times.

    2022: https://science.slashdot.org/s... [slashdot.org]

    AND THEN yesterday:

    2023: https://science.slashdot.org/s... [slashdot.org]

    Note ..in 2017, the same dude also made a claim that they made metallic hydrogen and "lost the sample" and still got published in Science .. that paper is NOT retracted.

    • by CEC-P ( 10248912 ) on Wednesday November 08, 2023 @01:09PM (#63990243)
      Hey, I read a paper once that clearly proved that generating hype and generating superconducting materials at room temperature are basically the same thing.
      • And this account is a sock-puppet of CEC-not(P)'s account?

        (Q.V. the mathematician's "P=not(P)" conjecture - which I think is still awaiting proof, or disproof. Nothing "personal".)

    • Re:Doesn't help (Score:4, Insightful)

      by jacks smirking reven ( 909048 ) on Wednesday November 08, 2023 @01:21PM (#63990281)

      Would this not be some evidence that the system does work itself out though? Like these papers were reviewed and then rejected and retracted, isn't that what is supposed to happen without a central authority system, which is not something we would want?

      • Re:Doesn't help (Score:5, Insightful)

        by Ed Tice ( 3732157 ) on Wednesday November 08, 2023 @04:31PM (#63990913)
        The papers were reviewed, accepted, disseminated to a large audience with a stamp of credibility, and then later retracted. They were never rejected. But those aren't really the problem, because everybody knows they are fake. The hard part is that if you are legitimately trying to learn more about something and possibly do original research, the first step is to figure out what's already known and then try to build on it. But if the papers you find are fake, you'll end up spending time and effort on investigations with no hope of achieving anything due to false premises.

        Honest mistakes are made all the time. The Wright Brothers were set back significantly due to some wrong airflow equations that they trusted because DaVinci came up with them and was very confident in the results.

        If the academic journals are going to be flooded with fake papers, it makes any further progress in those fields very difficult.

        • Re:Doesn't help (Score:4, Informative)

          by ceoyoyo ( 59147 ) on Wednesday November 08, 2023 @06:33PM (#63991391)

          Academic journals aren't supposed to be oracles of truth. If you're trying to learn about a subject, get a (reputable) textbook. Once you get beyond that you need to read the literature. Not a paper or two, but the literature. If you find a result in one paper without any confirmation, it's suspicious. The rate of fraud is minuscule compared to the rate of just being wrong.

          Journals are supposed to disseminate interesting stuff in a timely manner, and it's critical they do so that people actually working in the field can keep up. When stuff gets nailed down enough that it's pretty reliable, it makes it into a textbook.

          Different sources for different purposes.

          • by jythie ( 914043 )
            Yeah, I've noticed the idea of 'published papers' has taken on a rather significant mythology in the public's mind, so people get really worked up over the authority they think it wields.
            • by ceoyoyo ( 59147 )

              Yes, and it's a problem. Science relies on efficient communication, and the public getting all excitable about things they don't understand is almost always detrimental to that. The replies here range from the death penalty for anyone who turns out to be wrong on down to a mere beating.

              Most of the paper mill papers are confined to paper mill journals. Scientists know which ones those are, and you're never going to get rid of them because anybody can set up a web page that serves PDFs.

              Unfortunately, the publ

          • by Potor ( 658520 )
            I always thought that the article was the currency of promotion in science, like the book is in the arts. Is that wrong?
            • by ceoyoyo ( 59147 )

              No, that's correct. Science is about trying things and reporting your results, so other people can test them. It's not about being right. Publishing should be an important part of assessment in science.

              Unfortunately, administrators rarely have the time or knowledge to assess the quality of those articles so they usually just count them. That introduces quite a bit of incentive to try and publish things you know are crap. At the same time, the push for "open" journals has led to thousands of predatory operat

              • Quality can't easily be assessed by administrators alone, but in the long term the number of citations is a reasonable metric. It can lag significantly, although if you publish a new paper and it gets over 100 citations in a few months you might want to make space on your mantelpiece for something shiny.
                • by ceoyoyo ( 59147 )

                  Ah, the paper mills have that figured. They cite each other like mad.

                  Metrics are pretty much all gameable, especially the easy cheap ones.

                  • by q_e_t ( 5104099 )
                    Measures of citation quality can be used based on journal quality, although such judgements would be subjective and controversial. It's very hard to measure citation quality by the use of it in the paper. Where it gets trickier is if the citation is "whatever you do, don't do what this lot did [12]". You could argue that demonstrating that there is a way not to do something is actually a good contribution to science and there are also good arguments that clinical trials that show no efficacy for a drug for
                    • by ceoyoyo ( 59147 )

                      Sure, but you're just back at peer review. You can review the profs or you can review the journals, but if you want to do it right you need reviewers who know what they're doing, will take the time to do it right, and whose motivations align with yours.

                      I saw a post a while ago pointing out that science used to be a community, but now it's an industry. And it has all of the perverse incentives that come when you confuse those two things.

                      Journal impact factors are actually based on how many citations the papers

      • Re:Doesn't help (Score:4, Insightful)

        by martin-boundary ( 547041 ) on Wednesday November 08, 2023 @04:50PM (#63990977)
        That's not evidence of the system working itself out, it's actually evidence that the system has no memory. A functioning system learns from its mistakes, the history of prior rejections always needs to be taken into account. This guy should basically be finding it harder and harder to publish anything over time (in a functioning system).
      • You're not even supposed to get to publishing stage with peer review supposedly happening before/during publication. The reality is however that it is a who knows who and both credentialing (eg. content from Harvard vs some public college, only one is getting pushed with minimal review) and the informal social credit score (what is your level of perceived oppression) is the only thing that counts these days. I have been in academia going on 20 years, and the content across the board is getting wors

    • Re:Doesn't help (Score:5, Interesting)

      by groobly ( 6155920 ) on Wednesday November 08, 2023 @01:46PM (#63990393)

      There are basically 2 "top science" journals: Science and Nature. They are not specialty journals, but rather cherry pick the most "impactful" results. Maybe, maybe, Nature can be trusted with abstruse biology results. Other than that, I would trust neither journal, as the choice of what to publish is essentially political, not scientific.

      Actual top specialty journals do not have this kind of problem at anywhere near the same level. When was the last time a paper in JACM had to be retracted for example?

    • Re:Doesn't help (Score:4, Interesting)

      by gtall ( 79522 ) on Wednesday November 08, 2023 @02:17PM (#63990521)

      Science is not a top science journal. If you want to see top science journals, you have to go to journals that specialize in a particular field, such as physics or chemistry and even then you must go to specialized subareas. Getting a paper in Science is not going to impress anyone in the actual sciences.

      • Re:Doesn't help (Score:4, Insightful)

        by Phillip2 ( 203612 ) on Wednesday November 08, 2023 @03:34PM (#63990749)

        This is not really true. A "Science" or "Nature" paper is still considered to be an extremely good thing. There are a few areas where this isn't true -- I think that neither of them publishes mathematics, per se. But most people would consider these journals to be a "better" publication than most subject-specific publications. In this sense, Nature and Science remain fairly unique.

        Whether this is a good thing or not is a different question.

    • Calling these companies journals is a stretch. They exist solely to make a buck on papers that researchers will often send you for free.

  • by MpVpRb ( 1423381 ) on Wednesday November 08, 2023 @12:53PM (#63990185)

    .. when quotas or something like quotas are used
    If a researcher is required by the system to produce papers, they produce papers. There seems to be no requirement to produce quality papers.
    It reminds me of a story I once read about the USSR. Steel mills were required to meet a quota that was measured in tons of steel, so they made heavy steel. Not good steel, not strong steel, heavy steel.

  • Why do people study fake, clickbait, trash science? Because their field of study is worthless but they know it will get grant money from naive people. This fake paper problem is just another layer on top of the charade. The problem will stop just as soon as dumb rich people who don't know science stop giving grant money to scientists who do know science but misrepresent it for money and attention.
    • Who determines which science is "fake, clickbait" and what is valid? Who is the science authority figure in this scenario?

      • For papers, it's the reviewers. The quality of the review determines the quality of the journal. I think this research would be much more interesting if they rated the frequency of phony papers in individual journals and then correlated that to impact factor.
  • by Anonymous Coward

    Paper mills are not the core issue. The problem is our current academic model where the number of papers published is important.

    Even without paper mills, colleagues of mine publish nearly exactly the same paper, with permuted author list, across multiple journals. It poisons the literature.

    I have higher standards, and don't publish that sort of regurgitation. Although my career has suffered as a result, my reputation in the field is strong, and stronger than theirs.

    • by ceoyoyo ( 59147 )

      I was part of a large collaboration. The bosses required that all the rest of the bosses be "invited" to be authors on every paper. The answers gave an instant reflection of the quality of scientist. One PI I respect a lot would always say "no thanks, I didn't contribute." Another would always insist on being an author.

      The latter was in a meeting one time with a visiting researcher and hyping up his work, telling us all we should be implementing his technique right away. The visitor looked a little confused

  • by Opportunist ( 166417 ) on Wednesday November 08, 2023 @01:41PM (#63990369)

    The quality of the papers you publish should matter, not the quantity. I could see some kind of ELO rating for science, where papers that get referenced and republished give you a positive standing while papers that get refuted and that you had to retract will cause a serious hit to your "science rating".

    I'd guess that would clear this up pretty fucking quickly.

    • by Pascoea ( 968200 ) on Wednesday November 08, 2023 @02:03PM (#63990445)
      I would tend to agree, that some sort of rating/karma system would be great. As long as you can figure out how to get past the same problems we have here: People tend to treat the moderation as "-1: Disagree", and (occasionally inappropriately) tank a poster's rating/rep just because they disagree with their position.
      • Disagree with me, I got Karma to burn.

        The problem with "unpopular" opinions doesn't really apply with science, because you would probably not even get the grant money for your research to take off if nobody is interested in your research.

        • by Pascoea ( 968200 )

          Disagree with me, I got Karma to burn.

          Same here. But that's because I generally don't make a lot of waves on here, and try (sometimes successfully) to not be an asshole.

          The problem with "unpopular" opinions doesn't really apply with science

          This I disagree with. Take any controversial topic and you can find valid credible science coming to differing conclusions. Or new science that "goes against" the current best knowledge. Or just valid science that some entrenched interest with power to wield doesn't like it. I'd hate to see those opinions "silenced" just because someone doesn't like what they discovered. As f

      • by UpnAtom ( 551727 )

        The US political system has created massive division. Science is much less divided.
        We can have written critiques with reasoning. Moderators would need to be scientists. I think something would have to stop Bad Pharma dominating -- maybe just exclude anyone working for a pharmaceutical company.

        Could run the system in parallel, see if it works.

    • The quality of the papers you publish should matter, not the quantity. I could see some kind of ELO rating for science, where papers that get referenced and republished give you a positive standing

      That's the publication citation index.

      A slightly more sophisticated rating method is the H index. An H index of, say, 9 means that you have published 9 papers that have each been cited 9 (or more) times.
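
      To make that definition concrete, here is a minimal sketch that computes the index from a list of per-paper citation counts. The function name and the sample numbers are made up for illustration.

```python
def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank   # the top `rank` papers all have at least `rank` citations
        else:
            break      # counts only fall from here, so no larger h is possible
    return h


print(h_index([9] * 9))            # nine papers cited nine times each
print(h_index([25, 8, 5, 3, 3]))   # one heavily cited paper barely moves the index
```

      Note that a single paper with a huge citation count raises the index by at most one, which is the main way it differs from a raw citation total.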

      • by reg ( 5428 )

        The problem with this is that there is no tracking of reasons for citation, like +1,-1 here. If you get a junk paper published, it will often be highly cited, all saying you're an idiot... Some fields in the social sciences use a "for/against" style in their citations, but the h-index does not capture those. A lot of very highly cited papers are also "method" papers. If ASTM/IEEE standards had an h-index, it would be huge and meaningless. And then there's the issue of missing citations (very convenient

      • by andi75 ( 84413 )

        > That's the publication citation index.

        That is a terrible system because it is easily gamed. A group of researchers get together, agreeing to cite (for no particular reason) a certain number of papers from the other members, and they in turn will cite your paper, inflating everyone's citation index by a large number. H-index suffers from the same problem.
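
        A toy model of that effect, under loudly simplified assumptions: every member of a hypothetical ring of `ring_size` researchers writes one new paper that cites every existing paper by every other member, so each existing paper gains `ring_size - 1` citations. The numbers are invented and the new citing papers themselves are ignored.

```python
def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    ranked = sorted(citations, reverse=True)
    # With counts sorted in descending order, count >= rank holds for exactly
    # the first h ranks, so counting those ranks gives the h-index.
    return sum(1 for rank, count in enumerate(ranked, start=1) if count >= rank)


def ring_inflated(citations, ring_size):
    """Citation counts after each of the other ring members cites every paper once."""
    return [c + (ring_size - 1) for c in citations]


baseline = [1, 1, 0, 0, 0]   # one member's honest citation counts (invented)
before = h_index(baseline)
after = h_index(ring_inflated(baseline, ring_size=6))
print(before, after)         # prints "1 5"
```

        In this toy setup a six-person ring lifts an index of 1 to 5 without anyone outside the ring reading a single paper.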

  • Now I kind of want to publish a vanity research paper, in which I'd make some ridiculous physics claims that sound just good enough to fool the lay person, and then cite my published paper in random arguments. The arguments wouldn't even be about physics.

  • That's been the mantra of university since forever.

    • And, to be honest, you can see why the alternative is unacceptable. No one will ever be willing to pay you to do a job with no expectation of ever having any work output.

      That effectively means that "scientist" would not be a career and science would only ever be performed by amateurs in other jobs, say, patent clerk.

  • I wish language was less confusing or that people would specify things more clearly.

    • The term has been in American usage for decades. Derision by conflating a smelly bulk industrial process with a source of academically fraudulent papers, is the entire point. Even in the early 1970s, there were several "research service" storefronts near my university, that sold term papers from filing cabinets. Casual conversation frequently referred to these as "paper mills".
  • Do "publish or perish", get tons of crappy papers and some outright fakes. What do you expect? I would expect frigging scientists to understand that. But apparently not.

    • I think this phenomenon shows that they understand it perfectly. They know that "perish" puts them at a McDonalds drive-through asking if you want fries with that.

    • by hey! ( 33014 )

      Potboiler papers are mainly a problem for the institutions that judge researchers by a crude publication quantity metric. They break an already broken evaluation scheme for researcher productivity. But otherwise, they sink quietly into obscurity doing little harm. Dishonest papers are a different kettle of fish.

      It's probably worth keeping in mind that many perfectly honest papers are wrong for completely innocuous reasons. Researcher fallibility is always the first thing everyone assumes when someone ann

      • Worldwide online access has brought another dimension to the problem. Several years ago I was looking for information on Single Event Upset ("cosmic ray" disruption of electronics). I searched and located one paper that looked intriguing, because it appeared to report results from a device from the same chip manufacturer and not-too-far-off process generation as I was working with. The "journal" and university were unfamiliar and located in India. Reading the paper: nothing fit, apart from a photograph
  • Goodhart's Law (Score:5, Insightful)

    by John Allsup ( 987 ) <<ten.euqsilahc> <ta> <todhsals>> on Wednesday November 08, 2023 @02:21PM (#63990533) Homepage Journal

    Quantity of publication and reputation of journals have become targets which researchers are forced to aim at.

    "When a measure becomes a target, it ceases to be a good measure."

    Honest researchers are outcompeted by frauds (or at least less honest researchers) when it comes to these measures.

  • by OldMugwump ( 4760237 ) on Wednesday November 08, 2023 @03:23PM (#63990723) Homepage

    Can journals not create a shared blacklist of authors who've been proven to do this?

    Seems to me that would be a pretty effective deterrent.

    (at least the corresponding authors; co-authors may not have even known their names were used)

  • by bradley13 ( 1118935 ) on Wednesday November 08, 2023 @04:06PM (#63990841) Homepage

    Publish-or-perish at its best. I'm a prof for a state teaching college. If we do "research", it is almost always a practical project with industry, usually as a basis for student projects.

    A few years ago, upper management decided we needed to be accredited (beyond the state accreditation that we have to maintain). One of the main requirements is...you guessed it, publications. Idiotic for us.

    Cynically, the accreditation did achieve an important goal: the massive expansion of the school administration required to manage it.

  • It would just take association with one of these bullshit papers to tank one's scientific career.
  • Yeah. Right.

  • Oh wait, these journals are already run by a bunch of AI bots.
