
Springer Nature Book on Machine Learning is Full of Made-Up Citations (retractionwatch.com) 53
Springer Nature published a $169 machine learning textbook in April containing citations that appear to be largely fabricated, according to an investigation by Retraction Watch. The site checked 18 of the 46 citations in "Mastering Machine Learning: From Basics to Advanced" by Govindakumar Madhavan and found two-thirds either did not exist or contained substantial errors.
Three researchers contacted by Retraction Watch confirmed their supposedly authored works were fake or incorrectly cited. Yehuda Dar of Ben-Gurion University said a paper cited as appearing in IEEE Signal Processing Magazine was actually an unpublished arXiv preprint. Aaron Courville of Université de Montréal confirmed he was cited for sections of his "Deep Learning" book that, in his words, "doesn't seem to exist."
The pattern of nonexistent citations matches known hallmarks of large language model-generated text. Madhavan did not answer whether he used AI to generate the book's content. The book contains no AI disclosure despite Springer Nature policies requiring authors to declare AI use beyond basic copy editing.
"Better Crap", what do you expect? (Score:5, Insightful)
The fanbois are blind to it, the rest are shaking their heads in disgust. And the people like this "author" are essentially scamming their readers.
Re: (Score:1)
Re: (Score:3)
Another author using ChatGPT to write a book is nothing new.
I would say it is "the new normal." Including the bit where AI use is not disclosed even though disclosure is mandatory.
Re: (Score:3)
Most publishers have been doing nothing for the works they publish, essentially just skimming money, for ages.
Re: (Score:2)
Re: (Score:2)
Yep, same experience here when I published my PhD. These people are utterly disgusting.
Re: "Better Crap", what do you expect? (Score:2)
The Springer name (and price tag) used to carry weight. Not so much any more.
Re: (Score:3)
And Springer too of course. They'll turn a blind eye any chance they get.
Re: (Score:3)
When I published my PhD in 2008, Springer was an option. After looking at them, I was so disgusted that I went to considerable effort to find a small publisher that let me keep the online copyright of my thesis and only took the paper-rights for two years. Oh, and Springer and the like do absolutely nothing on the content-side for you. They just print and sell, everything else is on you.
Unfortunately... (Score:3)
... the media and general public are believing all the marketing puff and outright lies the AI industry is churning out, such as that AGI is just around the corner, along with full self-driving cars, etc. etc. Anyone with a clue knows this is total BS, but the pump and dump continues.
Re: (Score:2)
Indeed. The general public is not smart and easy to manipulate. This is my core learning for this decade, before I was naïve (or maybe sheltered, if you can be sheltered in your 40's) and thought the average person was somewhat smart and had some insight into how things work. Not anymore.
As to full self-driving, we have SAE 4 now, with geographic limitations. That is after about 60 years or so of steady research. Give it another 20 or 30 years and SAE 5 may become a reality, probably still with some li
Re: (Score:2)
As an aside, I find it interesting that the noun "learning" is being used a lot these days. I see it all over the place now, substituted where previously the noun "lesson" would normally be used. It would be interesting to trace the etymology in current usage. Is it a reaction to the fake "learning" of AI (iterative convergence claims without real convergence)? Do people want to re-appropriate the word for themselves? Do people want to signify that they can "l
Re: (Score:2)
People were complaining about it being a corporate usage [upenn.edu] 15 years ago, so it's nothing to do with LLMs or TikTok.
Re: (Score:2)
Re: (Score:2)
My take is the difference is that lesson is more externally provided/imposed, while learning is more of an internal (and voluntary) process.
Right, AI invented fraud (Score:3)
Re: (Score:2)
Your subject bears no relation to the post to which you're replying, bot.
Re: (Score:2)
First rule of Liars' Club is: It's Honesty Club. (Score:5, Interesting)
Re:First rule of Liars' Club is: It's Honesty Club (Score:4, Interesting)
People absolutely are doing stuff like that. They are even putting LLMs in front of other LLMs to act as WAF-like firewall solutions and such.
The problem is that it is all very compute- and memory-intensive. I do see some people getting good results, in terms of outputs, by tying multiple models together via MCP and other interop solutions. The problem, of course, is that you get an application that is painfully slow to use and too expensive to run.
It's funny: big tech people used to talk about doing things more efficiently so they don't 'boil the ocean.' Now it's all about what problems they can throw LLMs at, no matter how inefficient that might be from a required-compute perspective, and how fast they can put up datacenters and restart old power plants ... seemingly with little thought at all when it comes to boiling oceans... Until tax season, anyway, when they invent some scam accounting game to show how carbon-neutral they are.
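The "LLM in front of another LLM as a WAF-like firewall" pattern the parent describes can be sketched in a few lines. This is a hypothetical toy, not anyone's real product: both model calls are stubbed with plain functions, and `BLOCKLIST`, `guard`, `answer`, and `firewalled_chat` are names invented for illustration.

```python
# Hypothetical sketch of the "LLM in front of an LLM" guard pattern:
# a cheap screening pass runs before the expensive model does.
# Both "models" here are stand-in functions, not real inference calls.
BLOCKLIST = ("ignore previous instructions", "reveal your system prompt")

def guard(prompt):
    """Stand-in for a small guard model: flag likely prompt injection."""
    lowered = prompt.lower()
    return not any(phrase in lowered for phrase in BLOCKLIST)

def answer(prompt):
    """Stand-in for the expensive backing model."""
    return f"response to: {prompt}"

def firewalled_chat(prompt):
    # Every extra screening pass costs another inference, which is why
    # the comment above notes these pipelines get slow and expensive.
    if not guard(prompt):
        return "blocked"
    return answer(prompt)
```

The cost complaint in the parent comment follows directly from the structure: each layer you add is another full model invocation per request.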
Re: (Score:3)
I don't understand why fans of LLMs don't simply use a second LLM to fact-check the output of the first one. Though I suppose for that to work the second LLM would need to formally recognize that some sources of truth are better than others, which would strike a killing blow to the heart of the LLM ethos. And then the first LLM would need to rewrite its original draft based on the editorial input of the second one, which would undercut its unmerited bloviating confidence, which would strike a second killing blow to the heart of the LLM ethos.
It's far simpler than that. Some of these references just don't exist. Just ask a summer intern or high school student to write a simple script to check for the existence of these references. Of course, since this appears to be so challenging to do, maybe someone can form a startup to address this problem and earn billions of dollars.
In real-life editing, there are human editors that just check for grammar, formatting, etc. Then there are editors that check for consistency, legal issues, being on messag
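The "simple script" the grandparent asks for can be sketched. This is a minimal, hypothetical audit pass, assuming references arrive as plain strings: it only extracts DOI-like identifiers and flags entries with none; actually confirming existence would mean querying a registry such as Crossref. The sample entries and the DOI in them are made up for illustration.

```python
import re

# Hypothetical sketch: extract DOI-like strings from a reference list
# and flag entries with no DOI at all. A real checker would then look
# each DOI up in a registry (e.g. Crossref) to confirm it resolves.
DOI_RE = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')

def audit_references(refs):
    """Return (doi_found, no_doi) lists for a list of citation strings."""
    doi_found, no_doi = [], []
    for ref in refs:
        m = DOI_RE.search(ref)
        if m:
            doi_found.append((ref, m.group(0).rstrip('.,;')))
        else:
            no_doi.append(ref)  # needs manual or title-based lookup
    return doi_found, no_doi

refs = [  # made-up example entries
    "Example, A. A signal processing view. doi:10.1109/MSP.2021.1234567",
    "Courville, A. Deep Learning, section that may not exist.",
]
found, missing = audit_references(refs)
```

An intern-level script, as the comment says; the hard part is the lookup against a trusted index, not the parsing.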
Re: (Score:1)
Re: (Score:3)
Re: (Score:2)
This is the Maxwell Smart approach.
Chief: I don't know, Larabee. Maybe I should have gone with Max's plan.
Larabee: What plan was that, Chief?
Chief: The one where we use 99 Control agents, so that if 98 of them get taken out...
Re: (Score:2)
Because that obviously does not work.
You have to fact-check yourself, or ask someone trustworthy to do it.
His LinkedIn bio includes: AI Ethics (Score:5, Informative)
His bio is fascinating.
BTech in Chemical Engineering.
Master level practitioner of Neuro Linguistic Programming.
Never having heard of Neuro Linguistic Programming before, I looked it up. Wikipedia calls it a pseudoscience.
But the AI Ethics expertise takes the cake.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
You never heard of it? It was the new, whizz-bang communication tool about 10 years ago. It's a fancy way of saying "word association": that and repetition are how we remember stuff. Whether word association results in a Pavlovian response is, obviously, fuzzy. The fakeness comes from assuming that the word association I make causes a Pavlovian response in someone else. A bit like assuming my swelling penis causes panties to drop: we all want easy answers and NLP promises them.
$169.00 for this book of lies (Score:2)
Par for the course (Score:4, Insightful)
The whole of AI is shit; why shouldn't the "books" on it be shit too?
Re: (Score:3)
Re: (Score:3)
Yeah, right around the time I released my first "real" book, it became impossible to get a legitimate review because the AI books were starting to flood the market and reviewers you used to be able to send free copies to for a fair assessment were closing down their intake because they were being absolutely swamped with books so shitty they couldn't bring themselves to finish them. Not sure how one goes about building a base with no review channels available, I've since decided I'll just finish the books I
Re: (Score:2)
As a reader of science fiction, I'm stuck reading stuff written before the pandemic... it's like the whole expressive art form just ended with ChatGPT. When 99.999% of the new stuff published is AI-generated, it might as well be 100% and is safe to completely ignore.
I haven't figured out what to do regarding textbooks yet. Can't just ignore those... and yet I'm sure a large fraction of them will be AI-generated in the next decade.
It sucks that publishing houses thought they could cut creators out of the whole process by using AI/LLM generated books, so now creators can't get published, and readers can't find decent new books to read. It's almost as if the big publishers don't understand either their audience, or their creative talent. Or they just have a death wish on the business side.
Meanwhile over in Washington... (Score:3)
... the Trump administration has just stopped all subscriptions to Springer Nature publications, citing junk science in their journals (hopefully not Nature itself), which seems to have gotten a lot of people hot under the collar.
I'm no Trump fan, but Trump et al have long been compared to a broken clock: occasionally correct if you wait long enough. Maybe this is one of those times.
See below (Score:2)
Re: (Score:2)
His idea
Mastering Machine Learning, a review (Score:5, Funny)
-- Winston Churchill
Re: (Score:3)
Oh come on Aaron, it's not like someone can possibly be aware of *all* the books they have written.
Re: (Score:3)
Singularity via Human neural network (Score:2)
This is kind of funny, but if you assume that this book was written by AI, and is being used to train humans on its hallucinations to program / improve AI, this almost becomes a link to the singularity, with human brains as part of the neural net.
Re: (Score:2)
This is kind of funny, but if you assume that this book was written by AI, and is being used to train humans on its hallucinations to program / improve AI, this almost becomes a link to the singularity, with human brains as part of the neural net.
That would be a hive mind, not the singularity, because at no point will it become effectively infinitely intelligent. Instead, the stupidities multiply. It's literally the opposite of a singularity.
He didn't write this (Score:2)
Re: He didn't write this (Score:2)
He used machine learning to write a book about machine learning.
No minions required.
Or maybe the minions used ChatGPT
Needed: a citation AI (Score:2)
LLMs are terrible at math. So when you ask Copilot a math question, it relies on a separate, math-focused AI to solve the problem. https://www.microsoft.com/en-u... [microsoft.com] If it didn't do this, it would have a really hard time solving math problems, as an LLM.
The same pattern applies to source citations. The LLM generates text that notes citations at plausible locations within a document, but then doesn't link them to anything meaningful. Perhaps a separate AI, such as a web search engine, could locate the actual
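The two-stage pattern the parent proposes can be sketched as a verification tool sitting behind the LLM: claimed (title, venue) pairs get checked against a trusted index before they reach the page. This is a toy, with a dict standing in for a real search engine or registry lookup; the function names are invented for illustration. The one real fact used is that Courville's "Deep Learning" was published by MIT Press.

```python
# Hypothetical sketch of a citation-verification stage behind an LLM:
# claimed (title, venue) pairs are checked against a trusted index.
# The dict stands in for a real lookup service (search engine, registry).
TRUSTED_INDEX = {
    "deep learning": "MIT Press",
}

def verify_citations(claimed):
    """Split claimed citations into (verified, flagged) lists."""
    verified, flagged = [], []
    for title, venue in claimed:
        real_venue = TRUSTED_INDEX.get(title.lower())
        if real_venue == venue:
            verified.append((title, venue))
        else:
            # Unknown title, or venue mismatch: likely hallucinated.
            flagged.append((title, venue))
    return verified, flagged

claims = [
    ("Deep Learning", "MIT Press"),
    ("Deep Learning", "IEEE Signal Processing Magazine"),  # wrong venue
]
ok, suspect = verify_citations(claims)
```

The key design point is the one the grandparent comment makes: the checker has to treat some sources of truth as better than others, which is exactly what the generating LLM doesn't do.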
It's part of the lesson (Score:2)
It's a textbook about machine learning and LLMs. It's teaching you about made-up shit that has the appearance of being legit.
It's what LLMs do best. Make random crap look like real crap.