An Amateur Just Solved a 60-Year-Old Math Problem - by Asking AI (scientificamerican.com)
Slashdot reader joshuark writes: Scientific American reports that ChatGPT has proved a conjecture with a method no human had developed. A 23-year-old student, Liam Price, just cracked a 60-year-old problem that world-class mathematicians have tried and failed to solve.
The new solution, which Price got in response to a single prompt to GPT-5.4 Pro, was posted on www.erdosproblems.com, a website devoted to the Erdős problems. The question Price solved (or prompted ChatGPT to solve) concerns special sets of whole numbers, where no number in the set can be evenly divided by any other...
Price sent it to his occasional collaborator Kevin Barreto, a second-year undergraduate in mathematics at the University of Cambridge. The duo had jump-started the AI-for-Erdős craze late last year by prompting a free version of ChatGPT with open problems chosen at random from the Erdős problems website. Reviewing Price's message, Barreto realized what they had was special, and experts whom he notified quickly took notice.
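For readers unfamiliar with the term, sets in which no element evenly divides another are called primitive sets. Here is a minimal Python sketch of that property (an illustration of the definition only, not anything from the proof):

```python
from itertools import combinations

def is_primitive(s):
    """Return True if no element of s evenly divides a different element."""
    return all(a % b != 0 and b % a != 0 for a, b in combinations(s, 2))

# The primes form a primitive set: no prime divides another.
print(is_primitive({2, 3, 5, 7, 11}))   # True
# Adding 6 breaks the property, since both 2 and 3 divide 6.
print(is_primitive({2, 3, 5, 6}))       # False
```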
The new wave (Score:4, Insightful)
This is what a lot of people get wrong about "AI". The AI itself isn't the special thing on its own. It may not be real AGI but honestly you don't want that because then it wouldn't be a tool. It's the way it can wrap huge volumes of information in a way that is easily manipulated by a human. It's a search engine connected directly to your brain. With a little creativity and an AI you would be astounded as to what's possible with careful use.
It's not that different from any other work, just more advanced. Like all advancements, you're simply working at a different level, and that may require new skills.
The printing press->radio->TV->Internet->AI. This is the future. The world is changing.
Re: (Score:2)
Future generations will look back at these times and how "simple" and "awesome" they were. You're part of something bigger and you have a front row seat.
Not sure future generations will look back at all (Score:3, Insightful)
I don't see a third option and I hope I'm dead before the worst of it.
There is a huge automation push going on right now; it is almost guaranteed to cause a minimum of 25% permanent unemployment, and our civilization is still hung up on if you don't
Re: (Score:1)
Correct, those in charge seek to turn us all into mulch, they view people as property that they haven't yet stolen. They seek to replace you, deny you an ability to earn a wage, take everything you own, enslave you and have you starve to death homeless, voteless, and without rights or healthcare. This is capitalism in its purest form, only with a few people capable of taking everything with the aid of machines that we collectively pay for. And they are in league with religious freaks that seek to bring ab
That's not quite right (Score:1)
When somebody asks who is going to buy their products the Epstein class is completely aware of that dependency and they are working to use automation in order to break the dependency.
Right now some of us have a little bargaining power for wages and the ruling class doesn't want to pay wages at all. They don't want to have to do anything with filthy humans that are
All right the shitty llm is back (Score:1)
Re: All right the shitty llm is back (Score:1)
I think I can say for everyone here: stfu
Re: Not sure future generations will look back at (Score:1)
Don't be so defeatist. Even if it's all true, there's always a way towards a better world, despite the appearance that a cabal of malevolent monsters are locked into a global lattice of evil, having sex with children, hunting and eating them, committing genocide and engaging in wars, corrupting every institution which runs societies...
There's always a way to turn it around.
Re: Not sure future generations will look back at (Score:1)
When Jesus returns (in response to the ItsNotRealies initiating the end times - global nuclear war) he will sort out the ItsNotRealis (any still alive) - this time he's probably prepared.
OR they're nuts and the whole thing is not going to end well.
Time will tell.
Levels of abstraction matter (Score:3)
[Overlooking the nameless BF.]
Most relevant recent citation is Stolen Focus by Johann Hari, but my use of "level of abstraction" goes back many years and the more modern label is probably "reference frame". Also related to contextual meaning.
Fundamental problem is the accumulation of too much information, so we have to attack problems by reframing them at the right level and by using the correct tools to manipulate them within that appropriate frame. Not at all surprised that someone who hasn't mastered a
Erdős (Score:2)
Re: (Score:2)
Yeah, it's funny on its face, but when I asked a generative AI about "AI for Erdős", it hallucinated an answer involving ye olde Entity Relationship Diagrams.
Is there a new job category for people who are good at asking questions the generative AIs can't answer properly? I'm pretty sure my batting average against Gemini is way over .300. Combination of nasty questions and wording questions in ways that suggest I might be expecting a particular wrong answer. I'm also considering the possibility that the AI
Re: (Score:3)
Re: (Score:2)
Given sets of X articles for which Slashdot editors can screw up the summaries, the number of errors E(X) will never converge to zero for X>0.
Re: (Score:2)
Re: (Score:1, Troll)
Whilst you're almost certainly correct (AI would be unlikely to conquer a problem requiring any meaningful original thinking, even with help), this gives the aforementioned student an Erdos number (which is not quite as exciting as a Fields medal, but nothing to sneeze at either) and it's entirely possible that the conjecture will turn out to actually be useful in some area.
Re: (Score:2)
"AI would be unlikely to conquer a problem requiring any meaningful original thinking, even with help"
You have no reason to believe that, and I don't think it's true. The human brain does not work on magic, "original thinking" may be beyond us currently but there's no reason to think it will remain unsolved.
Re: (Score:1)
Ah, yes, the deranged claim that we know how the human mind works and it is purely mechanistic. You just excluded yourself from rational discussion by pushing a quasi-religious dogma with no supporting scientifically sound evidence.
Seriously, you "AI believers" are not one bit smarter than the Jesus-freaks.
Re: Just means none of the experts cared enough (Score:2)
They think that "intelligence" is nothing more than an emergent property of scale (of your neurons), and therefore scaling AI will inevitably and spontaneously result in AGI.
Re: (Score:2)
Yes, they apparently really think that. Makes me think they have no general intelligence themselves.
Re: (Score:2)
Neural nets are classifiers. They cannot create, they can only distinguish.
Re:Just means none of the experts cared enough (Score:5, Informative)
Re: (Score:2)
My Erdos number was something I felt put me in a very exclusive club (my field is CS, btw)! Only a select few ... wait a second. The whole thing about Erdos numbers is that they're the expression of how common those interconnections are :-)
Like the six degrees of Kevin Bacon, the net is huge!
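An Erdős number is just shortest-path distance in the co-authorship graph, the same computation as six degrees of Kevin Bacon. A toy breadth-first-search sketch with entirely made-up collaborators (nothing here reflects real authorship data):

```python
from collections import deque

# Hypothetical co-authorship graph: an edge means "wrote a paper together".
coauthors = {
    "Erdos": ["A", "B"],
    "A":     ["Erdos", "C"],
    "B":     ["Erdos"],
    "C":     ["A", "D"],
    "D":     ["C"],
}

def erdos_number(author):
    """Breadth-first search: shortest co-authorship distance from Erdos."""
    dist = {"Erdos": 0}
    queue = deque(["Erdos"])
    while queue:
        cur = queue.popleft()
        for nxt in coauthors.get(cur, []):
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    return dist.get(author)  # None if unreachable

print(erdos_number("D"))  # 3: D -> C -> A -> Erdos
```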
Re: (Score:1)
Actually the last two generations of LLMs do reason.
Perhaps you should google how it works.
The internet is full of educational videos and texts on how LLMs work, how agents are configured, and how their reasoning process is concretely implemented with concrete algorithms.
It is pretty astonishing how many people who are into programming disregard one of the biggest topics in computer science of the past 30 years.
Re: (Score:2)
False.
They call them "reasoning models" but that is pure marketing. What a fucking joke.
Re: (Score:2)
They call them "reasoning models" but that is pure marketing.
The data doesn't agree with you. Looking at improvement in benchmarks, the reasoning models show faster improvement on benchmarks than non-reasoning models did. See https://epoch.ai/blog/have-ai-capabilities-accelerated [epoch.ai] for a summary. This is hard to explain as pure marketing.
Re: (Score:2)
You have been conned. It starts with careful creation of benchmarks. It continues with providing stunt after stunt and evermore grand claims. And you fell for it.
Re: Just means none of the experts cared enough (Score:2)
Re: (Score:2)
Re:Just means none of the experts cared enough (Score:5, Informative)
Re:Just means none of the experts cared enough (Score:4, Informative)
If you are a mathematician, you should be able to see the difference between "nobody cared enough" (my claim) and "no one cared" (your gross mis-statement of my claim).
Sigh. Why am I not surprised that you've responded this way. Let's be clear then: You can replace my comment I wrote earlier with the word "enough" added just at the end and everything I wrote would still be true. Your statement is just wrong, and it is wrong for exactly the reasons I outlined.
Re:Just means none of the experts cared enough (Score:5, Insightful)
It is fascinating which people get math degrees these days. Apparently logical thinking is not a requirement anymore.
"Logical thinking" is not synonymous with "agrees with gweihir" as much as you would like them to be.
Ok, let me spell it out for you: The information was out there and could be combined in a purely mechanical, no-insight-required way to provide the answer. Nobody cared enough to find it and try that. Is that clear enough or are you still bereft of understanding?
What you are spelling out and claiming here is just not true. The approach the AI took was *different* than the approach in the literature. You don't need just my opinion on this. Terry Tao (who is a Fields Medalist) and Jared Lichtman (who is one of the best of the young new number theorists and had previously published work on this and related problems) both disagree with you https://www.scientificamerican.com/article/amateur-armed-with-chatgpt-vibe-maths-a-60-year-old-problem/ [scientificamerican.com]. If the people who have looked at this in detail and are subject matter experts disagree with you, maybe it should occur to you that you are wrong here?
Re: (Score:2)
"Logical thinking" is not synonymous with "agrees with gweihir" as much as you would like them to be.
If the people who have looked at this in detail and are subject matter experts disagree with you, maybe it should occur to you that you are wrong here?
I had an interesting exchange with Gweihir recently. I don't want to put words in his mouth, but he seems to be of the opinion that it is not proven that the human brain is responsible for human intelligence, that there is something special about human intelligence (that he also believes cannot be simulated or reproduced), and that understanding human intelligence is potentially impossible.
He says "Ah, yes, the deranged claim that we know how the human mind works and it is purely mechanistic. You just exclu
Re: (Score:1)
I merely state the actual scientific state of the art. And that is that we have no clue how the human mind works. Believing it is all just known (!) Physics at work is called "Physicalism" and it is a quasi-religious belief, not Science. But a lot of people have trouble with unknowns and hence make it either-or. As you just did and which I did not.
So let me repeat, if you insist that the human mind is based on purely known physical effects, please provide scientifically sound evidence for it. I am still wai
Re: (Score:3)
I have never seen anyone on slashdot claim that "it is all just known" when it comes to human intelligence or the brain. Literally, never. It is also absolutely incorrect to say that we have no clue how the human mind works; we have many clues. I have said repeatedly that we don't know it all. What we do know is that humans exist, and humans have human intelligence.
I posited earlier that if it exists, it can be built.
If you want to change the conversation to a metaphysical conversation about the nature of reality and
Re: (Score:1)
So, I am not a mathematician, I am a CS PhD. It looks to me as if you have no clue how an LLM works and what it can and cannot do. And some others (the ones you quote) seem to be subject to the same limitations. Note that even a Fields Medal does not prevent you from cluelessly shooting your mouth off about a topic you are not an expert in.
Re: (Score:2)
Hey, I'm a CS PhD too! And in a relevant field even. Any of the recent "LLMs" aren't LLMs. They contain them as components, among other things.
Not having one doesn't either, evidently.
Re: (Score:2)
Re: (Score:2)
How easy is it to back-track through the model to find the main source(s) of the answer?
No, that's not how gen AI works (Score:1)
But I don't have the math skills to explain properly, so I'll resort to a kind of metaphor. The real "talent" of AIs is in sounding plausible based on lots of examples of "what sounds good". In this case, it turned out that Occam's Razor worked, and an explanation that sounded plausible turned out to be a valid proof.
But I speculate there were many failures, some of them hilarious, though the human "author" only published the good guess.
Re: (Score:2)
There are no "main source(s) of the answer", so not easy at all. However, perhaps you could ask the AI how the answer was generated.
The human mind does not arrive at answers purely by applying previously learned answers, neither does generative AI. AI makes predictions based on COLLECTIVE previous experience, not with boolean comparisons to collections of "main sources". AI remembers no "main sources" at all, it can, however, predict what they were with remarkable accuracy when asked to do so. This is w
Re:Ahem (Score:4, Informative)
Re: (Score:3)
Mathematician here. I doubt that "we" would have found an existing solution (or elements of it) on the internet if it exists. The Internet is vast. You just won't believe how vastly, hugely, mind bogglingly big it is (to paraphrase someone). And working mathematicians are busy enough not to go looking in unlikely places.
I know from experience that solving problems opens a vast fan-out of possible approaches, recursively. Most often, the issue with problems isn't finding all the ways forward and picking the
Re: (Score:3)
We know from Terry Tao's interest in AI that previous attempts at claiming successful AI-generated novelty with Erdos problems were wrong.
Two of the previous examples turned out to be in the literature. We've had other examples where this was not the case. This example is more noteworthy not because it is the first where this is the case but because it is the first substantial one. (I think of Erdős problems as coming in three categories. First, pretty obscure ones which almost no one has heard of. Second, ones where subject matter experts have heard of them even if they aren't that famous. Third, things like the Erdős–Straus conjecture which
Re: (Score:2)
Except that is false.
We know (Score:2)
Somebody already told us.
Unfortunately we still don't have an anti-dupe AI.
"Do you like apples?" (Score:1)
"I solved it. How you like them apples?"
Re:"Do you like apples?" (Score:4, Funny)
If they write a script for searching and trying to solve mathematical problems, they should call it Math Daemon.
Pardon my mathematical ignorance, but (Score:1)
how is this problem different from creating sets of prime numbers? I'd truly appreciate knowledgeable answers; thank you.
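One way to see the difference: the primes are a primitive set (no prime divides another), but so are many other sets, such as the squares of primes, and the conjecture is about primitive sets in general. A small illustrative Python sketch (my own example values, not from the article):

```python
from itertools import combinations

def is_primitive(nums):
    """True if no element of nums evenly divides a different element."""
    return all(a % b != 0 and b % a != 0 for a, b in combinations(nums, 2))

primes = [2, 3, 5, 7, 11]
prime_squares = [p * p for p in primes]  # 4, 9, 25, 49, 121

print(is_primitive(primes))         # True
print(is_primitive(prime_squares))  # True: primitivity is not unique to the primes
print(is_primitive([2, 4, 5]))      # False: 2 divides 4
```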
Re:Pardon my mathematical ignorance, but (Score:5, Informative)
Never change /. (Score:2)
Multiple instances of a name with a non-ASCII character, and of course /. can't display it. Is it still 1999?
I'm referring to Erdos, of course, but I replaced the non-ASCII o (the one with the diacritical) with a plain o so it would be displayed in my message.
Re: Never change /. (Score:2)
You know what is funny? They could fix this using Claude.