
Study of Massive Preprint Archive Hints At the Geography of Plagiarism

sciencehabit writes with this excerpt from Science Insider: New analyses of the hundreds of thousands of technical manuscripts submitted to arXiv, the repository of digital preprint articles, are offering some intriguing insights into the consequences, and geography, of scientific plagiarism. It appears that copying text from other papers is more common in some nations than others, but the outcome is generally the same for authors who copy extensively: Their papers don't get cited much. The system attempts to rule out certain kinds of innocent copying: "It's a fairly sophisticated machine learning logistic classifier," says arXiv founder Paul Ginsparg, a physicist at Cornell University. "It has special ways of detecting block quotes, italicized text, text in quotation marks, as well as statements of mathematical theorems, to avoid false positives."


  • The game is to find a unique angle to approach your research that's essentially clickbait, then produce some results, and figure out some way you can claim victory and go home.

    If you're just doing this to get on to the next stage, it makes sense to plagiarize and get it out of the way. You can get to the nice fat yearly income that way without having to know much of anything.

    Do we have a quality of scientists problem because science is such an esteemed (and often well-paid, in private practice at least) career that people who should not be scientists are trying to be scientists?

    • The game is to find a unique angle to approach your research that's essentially clickbait, then produce some results, and figure out some way you can claim victory and go home.

      If you're just doing this to get on to the next stage, it makes sense to plagiarize and get it out of the way. You can get to the nice fat yearly income that way without having to know much of anything.

      Do we have a quality of scientists problem because science is such an esteemed (and often well-paid, in private practice at least) career that people who should not be scientists are trying to be scientists?

      © CaptainDork November 16, 1960

      You bastard/bitch as applies.

  • For example, more than 20% (38 of 186) of authors who submitted papers from Bulgaria were flagged, more than eight times the proportion from New Zealand (five of 207). In Japan, about 6% (269 of 4759) of submitting authors were flagged, compared with over 15% (164 out of 1054) from Iran.

    I suspect that the ratio in countries where the motivation could -literally- be publish or perish, will be consistently higher than those where the saying is figurative.
    • I suspect that the ratio in countries where the motivation could -literally- be publish or perish, will be consistently higher than those where the saying is figurative.

      Interestingly, using the word 'perish' to mean, 'lose one's job' is also figurative.
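
The percentages quoted at the top of this thread can be checked with a few lines of Python. This is only a sketch: the counts are the ones given in the quote, while the names `counts` and `flagged_rate` are illustrative and do not come from the study.

```python
# (flagged authors, total submitting authors), as quoted in the comment above.
counts = {
    "Bulgaria": (38, 186),
    "New Zealand": (5, 207),
    "Japan": (269, 4759),
    "Iran": (164, 1054),
}


def flagged_rate(flagged, total):
    """Fraction of submitting authors who were flagged."""
    return flagged / total


rates = {country: flagged_rate(*pair) for country, pair in counts.items()}

for country, rate in rates.items():
    print(f"{country}: {rate:.1%}")

# The quote says Bulgaria's proportion is "more than eight times" New Zealand's.
ratio = rates["Bulgaria"] / rates["New Zealand"]
print(f"Bulgaria / New Zealand: {ratio:.1f}x")
```

The ratio compares proportions, not raw counts, which is why two countries with similar numbers of submitting authors (186 and 207) can still differ so sharply.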

  • I wonder how many of the people flagged as plagiarizing in countries with a low rate are originally from countries with a higher rate.

  • by Anonymous Coward on Thursday December 11, 2014 @01:14PM (#48573663)

    And, I have found that copying text from other papers is more common in some nations than others, but the outcome is generally the same for authors who copy extensively: Their papers don't get cited much.

    • And, I have found that copying text from other papers is more common in some nations than others, but the outcome is generally the same for authors who copy extensively: Their papers don't get cited much.

      Funny. That's exactly what TFA said.

      It's almost as if you plagiarized it.

  • about one in 16 arXiv authors were found to have copied long phrases and sentences from their own previously published work

    OK, sometimes quoting your own work may be legit, but this sounds more like simple boilerplate cut and paste

    • by starless ( 60879 )

      I work a lot with data from astronomy satellites. A lot of the first steps of the analysis, and describing the spacecraft
      and its instruments are very close to the same from paper to paper of mine. (And similarly for other people doing similar
      work.) This results in a lot of near (and sometimes exact) duplication of text. However, I believe this is still valid
      and necessary. The heart of the paper - i.e. the new results and conclusions - does still differ of course!

  • I don't have their whole data set, but they did put up 10 or so on their nice little map. It seems that the fewer papers a country has, the higher its percentage of plagiarism. However, the US has so many papers in this study that it should be divided into smaller regions.
  • Gaming the system (Score:2, Informative)

    by Anonymous Coward

    I wonder how much these disparities are due to western researchers knowing how to game the system. Some 10 years ago I received a warning related to "self-plagiarism" because I had copied the definition of a problem from one of my previous papers (one column, the rest of the paper was completely new). Since then, I know I have to change the text of the problem definition between two papers, even if it is the same. In the meantime, I have seen people submit the same work to two different conferences after ch

    • by tlhIngan ( 30335 )

      Some 10 years ago I received a warning related to "self-plagiarism" because I had copied the definition of a problem from one of my previous papers (one column, the rest of the paper was completely new). Since then, I know I have to change the text of the problem definition between two papers, even if it is the same.

      So why not just quote yourself then? I mean, self-plagiarism is just like plagiarism (except you're presenting existing ideas as new, rather than others' ideas as yours).

      Is it too hard to cite o

  • by phantomfive ( 622387 ) on Thursday December 11, 2014 @01:44PM (#48574011) Journal
    If you're going to plagiarize, don't upload your paper to arXiv.
  • by hey! ( 33014 ) on Thursday December 11, 2014 @02:45PM (#48574641) Homepage Journal

    Some countries place a high premium on memorizing and repeating back the teacher's words. These countries still produce their share of good and bad engineers, but they're sometimes bad in unrecognizable ways.

    I once hired a software engineer from a third world country who had an encyclopedic knowledge of design patterns. You could name any pattern in the GoF *Design Patterns* book and he could reel off the UML without hesitation and give a convincing sounding explanation of how the pattern worked. But when I started inspecting his code, I quickly realized he had no understanding of what any of it meant. It was just pictures and words he'd memorized, an impressive and prodigious feat, but ultimately useless to me.

    Now I should say I've hired some very good software engineers from this country; it's not that they don't make good engineers over there. For most people the discipline to absorb a lot of information yields many benefits. But this guy was an outlier; he managed to get a master's degree over there in a subject he had no practical understanding of whatsoever.

  • Breakdown by region doesn't mean anything. IMHE, India should be a top offender. Maybe the fails in the US are all from native Indians and Bulgarians....we don't know.
  • I saw this same exact post over on reddit yesterday, but it was posted by a different user ...
  • by Required Snark ( 1702878 ) on Thursday December 11, 2014 @08:46PM (#48578081)
    There is an intrinsic problem with the map presentation: it ignores the relative number of papers from each country. This can lead to a distorted perception for countries with a small number of papers in the data set.

    To quote the article: "It shows only the incidence of flagged authors for the 57 nations with at least 100 submitted papers, to minimize distortion from small sample sizes." If a country's total number of papers is in the hundreds, the number of authors is likely low as well. Therefore, a small number of authors who routinely plagiarize can have a major effect.

    It's analogous to a small town with a very low crime rate. All it takes is a few significant incidents to cause a huge jump in the statistics.

    For comparison, it would be interesting to see the rates for other kinds of text reuse. From the article:

    After filtering out review articles and legitimate quoting, about one in 16 arXiv authors were found to have copied long phrases and sentences from their own previously published work that add up to about the same amount of text as this entire article.

    For comparison it would be useful to see the percentage of this reuse displayed on another map. I have a strong suspicion that countries that look good on the presented map would not look nearly as good by this measure.
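
The small-sample concern raised in the last comment can be made concrete with a confidence interval on each flagged rate. The sketch below uses the Wilson score interval with z = 1.96 (roughly 95%); that method is an assumption chosen for illustration, not something the article says the study used, and `wilson_interval` is a hypothetical helper name. The counts are the ones quoted earlier in the thread.

```python
import math


def wilson_interval(flagged, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    p = flagged / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


# (flagged authors, total submitting authors), as quoted earlier in the thread.
samples = {
    "Bulgaria": (38, 186),
    "New Zealand": (5, 207),
    "Iran": (164, 1054),
    "Japan": (269, 4759),
}

widths = {}
for country, (flagged, total) in samples.items():
    low, high = wilson_interval(flagged, total)
    widths[country] = high - low
    print(f"{country}: {flagged / total:.1%} (interval {low:.1%} to {high:.1%})")
```

Countries with only a couple of hundred submitting authors get a much wider interval than Japan, so their position on the map is less certain than a single percentage suggests.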
