An Amateur Just Solved a 60-Year-Old Math Problem - by Asking AI (scientificamerican.com) 64

Posted by EditorDavid on Saturday May 02, 2026 @02:34PM from the do-the-math dept.

Slashdot reader joshuark writes: Scientific American reports that a ChatGPT AI has proved a conjecture with a method no human had developed. A 23-year-old student Liam Price just cracked a 60-year-old problem that world-class mathematicians have tried and failed to solve.

The new solution that Price got in response to a single prompt to GPT-5.4 Pro was posted on www.erdosproblems.com, a website devoted to the Erds problems. The question Price solved — or prompted ChatGPT to solve—concerns special sets of whole numbers, where no number in the set can be evenly divided by any other...

Price sent it to his occasional collaborator Kevin Barreto, a second-year undergraduate in mathematics at the University of Cambridge. The duo had jump-started the AI-for-Erds craze late last year by prompting a free version of ChatGPT with open problems chosen at random from the Erds problems website. Reviewing Price's message, Barreto realized what they had was special, and experts whom he notified quickly took notice.

An Amateur Just Solved a 60-Year-Old Math Problem - by Asking AI

Post Load All Comments

Search 64 Comments Log In/Create an Account

Comments Filter:

The new wave (Score:4, Insightful)

by Anonymous Coward writes: on Saturday May 02, 2026 @02:53PM (#66124332)

This is what a lot of people get wrong about "AI". The AI itself isn't the special thing on its own. It may not be real AGI but honestly you don't want that because then it wouldn't be a tool. It's the way it can wrap huge volumes of information in a way that is easily manipulated by a human. It's a search engine connected directly to your brain. With a little creativity and an AI you would be astounded as to what's possible with careful use.
It's not that different from any other work, just more advanced. Like all advancements you're simply working at a different level and that may require new skills.
The printing press->radio->TV->Internet->AI. This is the future. The world is changing.

Reply to This Share
Flag as Inappropriate
- Re: (Score:2)
  
  by Cthefuture ( 665326 ) writes:
  
  Futures generations will look back at these times and how "simple" and "awesome" they were. You're part of something bigger and you have a front row seat.
  - Not sure future generations will look back at all (Score:3, Insightful)
    
    by rsilvergun ( 571051 ) writes:
    
    I think we are at the end of History. I think we're going to go into a permanent dark age in the form of techno feudalism. Other that or going to handle launch codes to some religious lunatics and they're going to wipe us all off the face of the Earth.
    
    I don't see a third option and I hope I'm dead before the worst of it.
    
    Like there is a huge automation push going on right now it is almost guaranteed the cause of minimum 25% permanent unemployment and our civilization is still hung up on if you don't
    - Re: (Score:1)
      
      by dfghjk ( 711126 ) writes:
      
      Correct, those in charge seek to turn us all into mulch, they view people as property that they haven't yet stolen. They seek to replace you, deny you an ability to earn a wage, take everything you own, enslave you and have you starve to death homeless, voteless, and without rights or healthcare. This is capitalism in its purest form, only with a few people capable of taking everything with the aid of machines that we collectively pay for. And they are in league with religious freaks that seek to bring ab
      - That's not quite right (Score:1)
        
        by rsilvergun ( 571051 ) writes:
        
        They view you as a dependency that they would like to break. They don't even want to have to use you as mulch they want to transcend that.
        
        When somebody asks who is going to buy their products the Epstein class is completely aware of that dependency and they are working to use automation in order to break the dependency.
        
        Right now some of us have a little bargaining power for wages and the ruling class doesn't want to pay wages at all. They don't want to have to do anything with filthy humans that are
        
        All right the shitty llm is back (Score:1)
        
        by rsilvergun ( 571051 ) writes:
        
        Let's see if we can get it to quote Trump fucks kids again. Because the thing is it's just a shitty llm running on a GTX 1080 or maybe even a 1050 so it's not very bright. So if you just repeat the same thing over again for example Trump fucks kids then it picks up on it and starts repeating it because Trump fucks kids. And as studies have shown with llms although Trump does fuck kids the llm is very poorly written and I'm not sure why this numbskull is training in llm on a dyed website like this but I'd li
        
        Re: All right the shitty llm is back (Score:1)
        
        by easyTree ( 1042254 ) writes:
        
        I think I can say for everyone here: stfu
      - Re: Not sure future generations will look back at (Score:1)
        
        by easyTree ( 1042254 ) writes:
        
        Don't be so defeatist. Even if it's all true, there's always a way towards a better world, despite the appearance that a cabal of malevolent monsters are locked into a global lattice of evil, having sex with children, hunting and eating them, committing genocide and engaging in wars, corrupting every institution which runs societies...
        There's always a way to make turn it around.
    - Re: Not sure future generations will look back at (Score:1)
      
      by easyTree ( 1042254 ) writes:
      
      When Jesus returns (in response to the ItsNotRealies initiating the end times - global nuclear war) he will sort out the ItsNotRealis (any still alive) - this time he's probably prepared.
      OR they're nuts and the whole thing is not going to end well.
      Time will tell.
- Levels of abstraction matter (Score:3)
  
  by shanen ( 462549 ) writes:
  
  [Overlooking the nameless BF.]
  Most relevant recent citation is Stolen Focus by Johann Hari, but my use of "level of abstraction" goes back many years and the more modern label is probably "reference frame". Also related to contextual meaning.
  Fundamental problem is the accumulation of too much information, so we have to attack problems by reframing them at the right level and by using the correct tools to manipulate them within that appropriate frame. Not at all surprised that someone who hasn't mastered a
Erds (Score:2)

by 602 ( 652745 ) writes:

lol
- Re: (Score:2)
  
  by shanen ( 462549 ) writes:
  
  Yeah, it's funny on its face, but when I asked a generative AI about "AI for Erds" and it hallucinated an answer involving ye olde Entity Relationship Diagrams.
  Is there a new job category for people who are good at asking questions the generative AIs can't answer properly? I'm pretty sure my batting average against Gemini is way over .300. Combination of nasty questions and wording questions in ways that suggest I might be expecting a particular wrong answer. I'm also considering the possibility that the AI
- Re: (Score:3)
  
  by mistergrumpy ( 7379416 ) writes:
  
  The best part is that the editors can't even figure out that this site (still!) doesn't do unicode.
- Re: (Score:2)
  
  by PPH ( 736903 ) writes:
  
  Given sets of X articles for which Slashdot editors can screw up the summaries, the number of errors E(X) will never converge to zero for X>0.
  - Re: (Score:2)
    
    by mistergrumpy ( 7379416 ) writes:
    
    You should write this up with a proof - this sounds like Fields Medal material!
- Re: (Score:1, Troll)
  
  by jd ( 1658 ) writes:
  
  Whilst you're almost certainly correct (AI would be unlikely to conquer a problem requiring any meaningful original thinking, even with help), this gives the aforementioned student an Erdos number (which is not quite as exciting as a Fields medal, but nothing to sneeze at either) and it's entirely possible that the conjecture will turn out to actually be useful in some area.
  - Re: (Score:2)
    
    by dfghjk ( 711126 ) writes:
    
    "AI would be unlikely to conquer a problem requiring any meaningful original thinking, even with help"
    You have no reason to believe that, and I don't think it's true. The human brain does not work on magic, "original thinking" may be beyond us currently but there's no reason to think it will remain unsolved.
    - Re: (Score:1)
      
      by gweihir ( 88907 ) writes:
      
      Ah, yes, the deranged claim that we know how the human mind works and it is purely mechanistic. You just excluded yourself from rational discussion by pushing a quasi-religious dogma with no supporting scientifically sound evidence.
      Seriously, you "AI believers" are not one bit smarter than the Jesus-freaks.
      - Re: Just means none of the experts cared enough (Score:2)
        
        by ByTor-2112 ( 313205 ) writes:
        
        They think that "intelligence" is nothing more than an emergent property of scale (of your neurons), and therefore scaling AI will inevitably and spontaneously result in AGI.
        
        Re: (Score:2)
        
        by gweihir ( 88907 ) writes:
        
        Yes, they apparently really think that. Makes me think they have no general intelligence themselves.
    - Re: (Score:2)
      
      by jd ( 1658 ) writes:
      
      Neural nets are classifiers. They cannot create, they can only distinguish.
  - Re:Just means none of the experts cared enough (Score:5, Informative)
    
    by JoshuaZ ( 1134087 ) writes: on Saturday May 02, 2026 @05:56PM (#66124586) Homepage
    
    Aside from them being almost certainly *not* correct here, given that there are a whole bunch of prior papers about this specific problem, you are confusing two different things. There's having solved an Erdos problem which is different than having an Erdos number. An Erdos number https://en.wikipedia.org/wiki/Erd%C5%91s_number [wikipedia.org] comes from having a chain of collaborators going back to Erdos. Erdos has Erdos number 0. Anyone who wrote a paper with Erdos has Erdos number 1. If someone else then writes a paper with someone with Erdos number 1 (and that person is not Erdos and does not have Erdos number 1) then that person now has Erdos number 3, and so on. And having an Erdos number is not a big deal. I for example, have an Erdos number. Most working mathematicians in number theory and graph theory have some Erdos number, and many in other subfields do as well. Having a *low* Erdos number though is more impressive, but even then people care much more about what results one has proven (with or without Erdos) than one's Erdos number. They really are a fun social thing and nothing more.
    
    Reply to This Parent Share
    Flag as Inappropriate
    - Re: (Score:2)
      
      by BenBoy ( 615230 ) writes:
      
      My Erdos number was something I felt put me in a very exclusive club (my field is CS, btw)! Only a select few ... wait a second. The whole thing about Erdos numbers is that they're the expression of how common those interconnections are :-)
      Like the six degrees of Kevin Bacon, the net is huge!
  - Re: (Score:1)
    
    by angel'o'sphere ( 80593 ) writes:
    
    Actually the last two generations of LLMs do reason.
    Perhaps you should google how it works.
    The internet is full with educational videos and texts how LLMs work, how agents are configured, and how their reasoning process is concrete implemented with concrete algorithms.
    It is pretty astonishing, how many people who are into programming, disregard one of the biggest topics in computer science of the resent 30 years.
    - Re: (Score:2)
      
      by narcc ( 412956 ) writes:
      
      False.
      They call them "reasoning models" but that is pure marketing. What a fucking joke.
      - Re: (Score:2)
        
        by JoshuaZ ( 1134087 ) writes:
        
        They call them "reasoning models" but that is pure marketing.
        
        The data doesn't agree with you. Looking at improvement in benchmarks, the reasoning models show faster improvement on benchmarks than non-reasoning models did. See https://epoch.ai/blog/have-ai-capabilities-accelerated [epoch.ai] for a summary. This is hard to explain as pure marketing.
        
        Re: (Score:2)
        
        by gweihir ( 88907 ) writes:
        
        You have been conned. It starts with careful creation of benchmarks. It continues with providing stunt after stunt and evermore grand claims. And you fell for it.
        
        Re: Just means none of the experts cared enough (Score:2)
        
        by LindleyF ( 9395567 ) writes:
        
        Reasoning models just mean the weights get converted back into natural language at intermediate steps. The training can then reward "good" reasoning paths and the prompts can focus on narrower increments of the problem. They can also use tools to error check and course correct and whatnot. It's a strategy. It works well. Is it reasoning in the human sense? Hard to say. It does look similar because it's trained to.
        
        Re: (Score:2)
        
        by JoshuaZ ( 1134087 ) writes:
        
        The benchmarks used in those metrics were all created *before* the reasoning models were introduced. It is hard to see how multiple different organizations could have carefully introduced benchmarks before the tech in question existed. You don't need to think that these models are "reasoning" in any deep sense to recognize that the models labeled as such do better than mere scaling would suggest and improve also faster when scaled then the models which don't. That means that "reasoning model" is not just ma
- Re:Just means none of the experts cared enough (Score:5, Informative)
  
  by JoshuaZ ( 1134087 ) writes: on Saturday May 02, 2026 @04:45PM (#66124492) Homepage
  
  Mathematician here, and in the same area of research (number theory). This is not a problem where no one one cared. While there are some Erdos problems in this category, this problem is one which was well known enough that I was already familiar with. This is also a problem where multiple people, including Jared Lichtman, who is an up and coming well respected young number theorist, have thought about. And if you go to the page for problem 1196 on the general Erdos Problem data base, you'll see three references all of which include references to further papers which thought about this problem. https://www.erdosproblems.com/1196 [erdosproblems.com].
  
  Reply to This Parent Share
  Flag as Inappropriate
  - - Re:Just means none of the experts cared enough (Score:4, Informative)
      
      by JoshuaZ ( 1134087 ) writes: on Saturday May 02, 2026 @06:05PM (#66124602) Homepage
      
      If you are a mathematician, you should be able to see the difference between "nobody cared enough" (my claim) and "no one cared" (your gross mis-statement of my claim).
      Sigh. Why am I not surprised that you've responded this way. Let's be clear then: You can replace my comment I wrote earlier with the word "enough" added just at the end and everything I wrote would still be true. Your statement is just wrong, and it is wrong for exactly the reasons I outlined.
      
      Reply to This Parent Share
      Flag as Inappropriate
      - Re:Just means none of the experts cared enough (Score:5, Insightful)
        
        by JoshuaZ ( 1134087 ) writes: on Saturday May 02, 2026 @07:38PM (#66124750) Homepage
        
        It is fascinating which people get math degrees these days. Apparently logical thinking is not a requirement anymore.
        "Logical thinking" is not synonymous with "agrees with gweihir" as much as you would like them to be.
        Ok, let me spell it out for you: The information was out there and could be combined in a purely mechanical, no-insight-required way to provide the answer. Nobody cared enough to find it and try that. Is that clear enough or are you still bereft of understanding?
        What you are spelling out and claiming here is just not true. The approach the AI took was *different* than the approach that the literature did. You don't need just my opinion on this. Terry Tao (who is a Fields Medalist) and Jared Lichtman (who is one of the best of the young new number theorists who had previously published work on this and related problems) both disagree with you https://www.scientificamerican.com/article/amateur-armed-with-chatgpt-vibe-maths-a-60-year-old-problem/ [scientificamerican.com]. If the people who have looked at this in detail and are subject matter experts disagree with you, maybe you it should, possibly occur to you that are wrong here?
        
        Reply to This Parent Share
        Flag as Inappropriate
        
        Re: (Score:2)
        
        by Moridineas ( 213502 ) writes:
        
        "Logical thinking" is not synonymous with "agrees with gweihir" as much as you would like them to be.
        If the people who have looked at this in detail and are subject matter experts disagree with you, maybe you it should, possibly occur to you that are wrong here?
        I had an interesting exchange with Gweihir recently. I don't want to put words in his mouth, but he seems to be of the opinion that it is not proven that the human brain is responsible for human intelligence, that there is something special about human intelligence (that he also believes cannot be simulated or reproduced), and that understanding human intelligence is potentially impossible.
        He says "Ah, yes, the deranged claim that we know how the human mind works and it is purely mechanistic. You just exclu
        
        Re: (Score:1)
        
        by gweihir ( 88907 ) writes:
        
        I merely state the actual scientific state of the art. And that is that we have no clue how the human mind works. Believing it is all just known (!) Physics at work is called "Physicalism" and it is a quasi-religious belief, not Science. But a lot of people have trouble with unknowns and hence make it either-or. As you just did and which I did not.
        So let me repeat, if you insist that the human mind is based on purely known physical effects, please provide scientifically sound evidence for it. I am still wai
        
        Re: (Score:3)
        
        by Moridineas ( 213502 ) writes:
        
        I have never seen anyone on slashdot claim that "it is all just known" when it comes to human intelligence or the brain. Literally, never. It is also absolutely incorrect to say that we have no clue how the human--we have many clues. I have said repeatedly that we don't know it all. What we do know is that humans exist, and humans have human intelligence.
        I posited earlier that if it exists, it can be built.
        If you want to change the conversation to a metaphysical conversation about the nature of reality and
        
        Re: (Score:1)
        
        by gweihir ( 88907 ) writes:
        
        So, I am not a mathematician, I am a CS PhD. It looks to me as if you have no clue how an LLM works and what it can and cannot do. And some others (the ones you quote) seem to be subject to the same limitations. Note that even a Fields Medal does not prevent you from cluelessly shooting your mouth off about a topic you are not an expert in.
        
        Re: (Score:2)
        
        by ceoyoyo ( 59147 ) writes:
        
        Hey, I'm a CS PhD too! And in a relevant field even. Any of the recent "LLMs" aren't LLMs. They contain them as components, among other things.
        Note that even a Fields Medal does not prevent you from cluelessly shooting your mouth off about a topic you are not an expert in.
        Not having one doesn't either, evidently.
- - Re: (Score:2)
    
    by JoshuaZ ( 1134087 ) writes:
    
    Yes, that's valid. It does suggest that at least for current AI capabilities, this problem was by itself an outlier.
- Re: (Score:2)
  
  by Tablizer ( 95088 ) writes:
  
  How easy is it to back-track through the model to find the main source(s) of the answer?
  - No, that's not how gen AI works (Score:1)
    
    by shanen ( 462549 ) writes:
    
    But I don't have the math skills to explain properly, so I'll resort to a kind of metaphor. The real "talent" of AIs is in sounding plausible based on lots of examples of "what sounds good". In this case, it turned out that Occam's Razor worked, and an explanation that sounded plausible turned out to be a valid proof.
    But I speculate there were many failures, some of them hilarious, though the human "author" only published the good guess.
  - Re: (Score:2)
    
    by dfghjk ( 711126 ) writes:
    
    There are no "main source(s) of the answer", so not easy at all. However, perhaps you could ask the AI how the answer was generated.
    The human mind does not arrive at answers purely by applying previously learned answers, neither does generative AI. AI makes predictions based on COLLECTIVE previous experience, not with boolean comparisons to collections of "main sources". AI remembers no "main sources" at all, it can, however, predict what they were with remarkable accuracy when asked to do so. This is w
- Re:Ahem (Score:4, Informative)
  
  by JoshuaZ ( 1134087 ) writes: on Saturday May 02, 2026 @04:50PM (#66124498) Homepage
  
  Mathematician here. This is highly unlikely to be the case. This was a moderately well known Erdos problem (not one of the famous ones but well known enough that I had seen it before this). If there were a solution on the internet we would likely have already found it, especially because there's been a concerted effort to track down what is happening with all the Erdos problems in the last few years. Moreover, even after this problem was solved, people then went and tried hard to find a copy of the solution somewhere on the internet, and have all failed. Taken together with the fact that the solution uses a novel technique which is not used in the literature on this problem (well, a novel direction to go in even as it starts with the same basic starting point) that looks highly unlikely to be anything else. Furthermore, with similar prompting, another copy of the AI was able to make multiple different valid proofs of the result as discussed by Terry Tao here https://www.erdosproblems.com/forum/thread/1196#post-5565 [erdosproblems.com]. The chance that there were multiple missed copies of different proofs of this result is extremely small.
  
  Reply to This Parent Share
  Flag as Inappropriate
  - Re: (Score:3)
    
    by martin-boundary ( 547041 ) writes:
    
    Mathematician here. I doubt that "we" would have found an existing solution (or elements of it) on the internet if it exists. The Internet is vast. You just won't believe how vastly, hugely, mind bogglingly big it is (to paraphrase someone). And working mathematicians are busy enough not to go looking in unlikely places.
    I know from experience that solving problems opens a vast fan-out of possible approaches, recursively. Most often, the issue with problems isn't finding all the ways forward and picking the
    - Re: (Score:3)
      
      by JoshuaZ ( 1134087 ) writes:
      
      We know from Terry Tao's interest in AI that previous attempts at claiming successful AI-generated novelty with Erdos problems were wrong.
      
      Two of the previous examples turned out to be in the literature. We've had other examples where this was not the case. This example is more noteworthy not because it is the first where this is the case but because it is the first substantial one. (I think of Erdos problems as coming in three categories. First, pretty obscure ones which almost no one has heard of. Second, ones where subject matter experts have heard of them even if they aren't that famous. Third, thinks like the Erdos-Straus conjecture which
- Re: (Score:2)
  
  by dfghjk ( 711126 ) writes:
  
  Except that is false.
We know (Score:2)

by nospam007 ( 722110 ) * writes:

Somebody already told us.
Unfortunately we still don't have an anti-dupe AI.
"Do you like apples?" (Score:1)

by realmojo ( 62898 ) writes:

"I solved it. How you like them apples?"
- Re:"Do you like apples?" (Score:4, Funny)
  
  by EnsilZah ( 575600 ) writes: <.moc.liamG. .ta. .haZlisnE.> on Saturday May 02, 2026 @05:06PM (#66124518)
  
  If they write a script for searching and trying to solve mathematical problems, they should call it Math Daemon.
  
  Reply to This Parent Share
  Flag as Inappropriate
Pardon my mathematical ignornance, but (Score:1)

by EightBells ( 715154 ) writes:

how is problem different from creating sets of prime numbers?. I'd truly appreciate knowldegable answers; thank you.
- Re:Pardon my mathematical ignornance, but (Score:5, Informative)
  
  by JoshuaZ ( 1134087 ) writes: on Saturday May 02, 2026 @06:49PM (#66124658) Homepage
  
  So, the problem is about primitive sets, sets where no element of the set is a multiple of another element. You do have a partially correct intuition here. The canonical example of a primitive set is the set of primes. Buy you can give other examples of primitive sets. For example, you could take the set of primes, remove 2 and 3, and then throw in 4, 6 and 9 into the set. Notice that if I compare this to the set of primes less than 10 which are just 2, 3, 5 and 7, whereas this new set has 4, 5, 6, 7, 9 and so has one additional small element. But the problem in question is one of a series of conjectures which all together say in a certain sense that primitive sets cannot end up being much denser than the set of primes.
  
  Reply to This Parent Share
  Flag as Inappropriate
Never change /. (Score:2)

by ConceptJunkie ( 24823 ) writes:

Multiple instances of a name with a non-ASCII character and of course /. can't display it. Is it still 1999?
I'm referring to Erdos, of course, but I replaced the non-ASCII o with a diacritical so it would be displayed in my message.
- Re: Never change /. (Score:2)
  
  by junglee_iitk ( 651040 ) writes:
  
  You know what is funny? They could fix this using claude

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

An Amateur Just Solved a 60-Year-Old Math Problem - by Asking AI More | Reply Login

The new wave (Score:4, Insightful)

Re: (Score:2)

Not sure future generations will look back at all (Score:3, Insightful)

Re: (Score:1)

That's not quite right (Score:1)

All right the shitty llm is back (Score:1)

Re: All right the shitty llm is back (Score:1)

Re: Not sure future generations will look back at (Score:1)

Re: Not sure future generations will look back at (Score:1)

Levels of abstraction matter (Score:3)

Erds (Score:2)

Re: (Score:2)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:1, Troll)

Re: (Score:2)

Re: (Score:1)

Re: Just means none of the experts cared enough (Score:2)

Re: (Score:2)

Re: (Score:2)

Re:Just means none of the experts cared enough (Score:5, Informative)

Re: (Score:2)

Re: (Score:1)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: Just means none of the experts cared enough (Score:2)

Re: (Score:2)

Re:Just means none of the experts cared enough (Score:5, Informative)

Re:Just means none of the experts cared enough (Score:4, Informative)

Re:Just means none of the experts cared enough (Score:5, Insightful)

Re: (Score:2)

Re: (Score:1)

Re: (Score:3)

Re: (Score:1)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

No, that's not how gen AI works (Score:1)

Re: (Score:2)

Re:Ahem (Score:4, Informative)

Re: (Score:3)

Re: (Score:3)

Re: (Score:2)

We know (Score:2)

"Do you like apples?" (Score:1)

Re:"Do you like apples?" (Score:4, Funny)

Pardon my mathematical ignornance, but (Score:1)

Re:Pardon my mathematical ignornance, but (Score:5, Informative)

Never change /. (Score:2)

Re: Never change /. (Score:2)

Related Links Top of the: day, week, month.

Slashdot Top Deals