



Could a 'Math Genius' AI Co-author Proofs Within Three Years? (theregister.com)
A new DARPA project called expMath "aims to jumpstart math innovation with the help of AI," writes The Register. America's "Defense Advanced Research Projects Agency" believes mathematics isn't advancing fast enough, according to their article...
So to accelerate — or "exponentiate" — the rate of mathematical research, DARPA this week held a Proposers Day event to engage with the technical community in the hope that attendees will prepare proposals to submit once the actual Broad Agency Announcement solicitation goes out...
[T]he problem is that AI just isn't very smart. It can do high school-level math but not high-level math. [One slide from DARPA program manager Patrick Shafto noted that OpenAI o1 "continues to abjectly fail at basic math despite claims of reasoning capabilities."] Nonetheless, expMath's goal is to make AI models capable of:
- auto decomposition — automatically decompose natural language statements into reusable natural language lemmas (a proven statement used to prove other statements); and
- auto(in)formalization — translate the natural language lemma into a formal proof and then translate the proof back to natural language (a hypothetical illustration in Lean follows below).
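As a purely hypothetical illustration (not taken from DARPA's materials) of what such an autoformalization target could look like, here is the natural-language lemma "the sum of two even integers is even" stated and proved in Lean 4, using the omega arithmetic tactic built into recent Lean versions:

-- Hypothetical autoformalization target: "the sum of two even integers is even."
theorem even_add_even (m n : Int)
    (hm : ∃ k, m = 2 * k) (hn : ∃ k, n = 2 * k) :
    ∃ k, m + n = 2 * k := by
  obtain ⟨a, ha⟩ := hm  -- unpack the witness for m
  obtain ⟨b, hb⟩ := hn  -- unpack the witness for n
  exact ⟨a + b, by omega⟩  -- a + b witnesses the sum; omega closes the arithmetic

The "informalization" half of the task would translate such a proof term back into the English sentence above.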
"How must faster with technology advance with AI agents solving new mathematical proofs?" asks former DARPA research scientist Robin Rowe (also long-time Slashdot reader robinsrowe): DARPA says that "The goal of Exponentiating Mathematics is to radically accelerate the rate of progress in pure mathematics by developing an AI co-author capable of proposing and proving useful abstractions."
Rowe is cited in the article as the founder/CEO of an AI research institute named "Fountain Adobe". (He tells The Register that "It's an indication of DARPA's concern about how tough this may be that it's a three-year program. That's not normal for DARPA.") Rowe is optimistic. "I think we're going to kill it, honestly. I think it's not going to take three years. But I think it might take three years to do it with LLMs. So then the question becomes, how radical is everybody willing to be?"
"We will robustly engage with the math and AI communities toward fundamentally reshaping the practice of mathematics by mathematicians," explains the project's home page. They've already uploaded an hour-long video of their Proposers Day event.
"It's very unclear that current AI systems can succeed at this task..." program manager Shafto says in a short video introducing the project. But... "There's a lot of enthusiasm in the math community for the possibility of changes in the way mathematics is practiced. It opens up fundamentally new things for mathematicians. But of course, they're not AI researchers. One of the motivations for this program is to bring together two different communities — the people who are working on AI for mathematics, and the people who are doing mathematics — so that we're solving the same problem.
At its core, it's a very hard and rather technical problem. And this is DARPA's bread-and-butter, is to sort of try to change the world. And I think this has the potential to do that.
Hallucinations (Score:3, Interesting)
Re: (Score:3)
I think that's why they wrote "co-author." The proofs probably still need to be accepted by an expert reader.
Note that there are already a lot of computer-aided proofs out there. Often they formalize the proof in Coq. Traditionally, the proofs were computer-generated by combinatorial explosion or by reinforcement learning. But an LLM conceptualizing a proof and writing parts of it in Coq seems quite possible.
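For a flavor of what that looks like (a minimal sketch in Lean 4 rather than Coq, since the idea is the same): the checker accepts a claim whether the proof came from exhaustive search, a hand-written tactic, or an LLM's suggestion.

-- Exhaustive check of a finite, decidable claim: `decide` simply tries all cases.
example : ∀ a b : Bool, (a && b) = (b && a) := by decide

-- The same fact as a named, reusable lemma (the kind of artifact an LLM
-- might draft), proved by case analysis instead of search:
theorem and_comm' (a b : Bool) : (a && b) = (b && a) := by
  cases a <;> cases b <;> rfl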
Re:Hallucinations (Score:4, Insightful)
Then there's no co-authoring. It's just an author using a tool. A tool that's still stupidly being referred to as AI.
Re: (Score:2)
Re: (Score:2, Insightful)
AI makes up things 100% of the time. That is literally its purpose. A hallucination is when you don't like the answer.
"It seems likely that an AI faced with a problem and not having the required tools would simply hallucinate them."
So long as there's no one around to know the difference, that's good enough. -Sam Altman
Re: (Score:1)
AI makes up things 100% of the time. That is literally its purpose. A hallucination is when you don't like the answer.
Perhaps you should take at least a beginner's course on "AI"?
"It seems likely that an AI faced with a problem and not having the required tools would simply hallucinate them." /FACEPALM
That is nonsense. It will tell you that it does "not know it".
P.S. why do people who hate AI - but have never used it - feel competent to discuss "hallucination"?
Hallucination:
The AI talks about stuff that simply is not true or does not exist.
Re: (Score:2)
Here's a fact that does advance the conversation: even when an LLM tells you it does "not know it", you can't be sure if that's actually true or not. Like the rest of the sentences it produces, that answer is just plausible-sounding text, not a reliable report of what the model actually knows.
Re: (Score:3)
That's just a feature of Large Language Models.
Game-playing AI instead tries to figure out the "best" solution based on what inputs it has. If it tries making something up (e.g. attempts an illegal move in chess), the game server will catch it and simply reject the move, while other AIs will have already looked at more valid options (a toy sketch of this validate-and-reject loop follows below).
No, a hallucination is the AI generating something completely false, as demonstrated by ye [futurism.com]
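A toy sketch of that validate-and-reject loop (all names and moves are hypothetical, not any real engine's API):

# Toy stand-ins: a "model" that ranks candidate moves, one of which is
# illegal, and a "server" check that is the ground truth on legality.
LEGAL_MOVES = {"e4", "d4", "Nf3"}

def ranked_candidate_moves():
    return ["Ke8", "e4", "d4"]  # best-first proposals; "Ke8" is illegal here

def is_legal(move):
    return move in LEGAL_MOVES  # the environment, not the model, decides

def choose_move():
    # Play the first legal proposal, silently rejecting anything the
    # server would refuse, just as a game server rejects illegal moves.
    for move in ranked_candidate_moves():
        if is_legal(move):
            return move
    raise RuntimeError("model produced no legal move")

print(choose_move())  # -> e4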
Re:Hallucinations (Score:4, Informative)
By running a theorem prover over the list of formulas they spit out. /FACEPALM
That is simple math.
Or in other words, in case you really are out of the loop: finding a proof might need a lot of creativity. Checking the proof is like executing a computer program: either it runs to the end and halts, or there are instructions in the middle that cannot be followed.
We have had math theorem provers for 50 years or more.
Example:
a := true;
b := false;
c := a && b
I claim in my proof that c is false, since the terms a and b combined with a logical AND yield false.
Any theorem prover will follow that logic, evaluate the formulas (after normalizing them, reducing common terms, etc.), and come to the same conclusion.
As long as you can formalize the chain of reasoning, it does not matter if an AI or a human constructed that chain. Just like a compiler does not care if the code was written by a human or not.
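The commenter's toy example, machine-checked (a minimal Lean 4 sketch):

-- The kernel evaluates the boolean term exactly like running a program:
-- true AND false reduces to false, so the claim checks out.
example : (true && false) = false := rfl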
Not LLMs (Score:3)
Re: (Score:2)
Re: (Score:2)
Hallucinate enough times and you're bound to have one correct proof. ;)
Re: Hallucinations (Score:2)
Re: (Score:2)
Absolutely no problem. Proof-checkers are a solved problem; several well-working tools exist and are in use. The problem is that "AI" cannot find proofs for anything nontrivial. It is likely to be far worse than most of the other approaches that have been tried in the last 50 years, all of which essentially failed. Nontrivial mathematical proofs require insight and one or several new elements and approaches. It is not just assembling existing components or variants thereof. It is far, far harder.
Hence, u
Too bad Ted Kaczynski passed away in 2023 (Score:1)
Too bad Ted Kaczynski passed away in 2023; I would have been curious to ask him the question. /s
https://en.wikipedia.org/wiki/... [wikipedia.org]
He was a mathematics prodigy, but abandoned his academic career in 1969 to pursue a reclusive primitive lifestyle and, eventually, a lone-wolf bombing campaign.
So sad...
Re: (Score:2)
Why can't everyone be more like Grigori Perelman?
" in 2006 stated that he had quit professional mathematics, owing to feeling disappointed over the ethical standards in the field. He lives in seclusion in Saint Petersburg"
Headline asks a question (Score:2)
The answer is no.
Re: (Score:1)
The answer is no.
I agree. But not just because of Betteridge's law. https://en.wikipedia.org/wiki/... [wikipedia.org]
An AI can't author a mathematical proof any more than an Excel spreadsheet can. The "math genius" in the AI was brought to it by humans; the software just ran through the information in a methodical manner, as dictated by the programming. The computers and software were a mechanical aid for the heavy lifting, just as a calculator is for a math problem or a forklift is for a stack of bricks.
Could we see some "true genius" from an AI some day?
what does he mean by radical? (Score:2)
"So then the question becomes, how radical is everybody willing to be?"
Is that really the question, though? AI is deterministic software; would we ask this question about any other deterministic software?
Or maybe he means: how far are you willing to go with the lies?
The anthropomorphizing of software needs to end.
Re: (Score:3, Informative)
AI is deterministic software; would we ask this question about any other deterministic software?
Depends what you call AI.
At the moment we are talking about LLMs ... large language models. So no, it is not deterministic software.
Which you can easily try yourself: ask the exact same "creative" question to the same model from two different computers.
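A minimal sketch of why that is, with a toy distribution standing in for a real model: sampled generation draws each token at random from the model's next-token probabilities, so identical prompts can produce different outputs.

import random

vocab = ["proof", "lemma", "theorem"]  # toy stand-in for a model's vocabulary
weights = [0.5, 0.3, 0.2]              # toy next-token probabilities

def sample_token():
    # Sampling (rather than always taking the most likely token) is what
    # makes "creative" output differ between runs and between machines.
    return random.choices(vocab, weights=weights, k=1)[0]

print([sample_token() for _ in range(5)])  # differs from run to run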
Re: (Score:2)
They professed optimism that it would be done in less than three years, but that it would probably take three years to get there with an LLM, and then asked how radical people were willing to be; which sounds like a relatively polite phrasing of the idea that someone who wants to make real progress is going to need to take an approach that isn't a me-too LLM attempt.
Re: (Score:2)
There is nothing "radical" here. This is just another lying asshole scammer trying to sound bombastic to push LLMs. If software could find mathematical proofs, _nobody_ would have any problems with these proofs or with them getting published. Nobody. But, guess what, intense efforts in that direction have now consistently failed for something like 50 years, because this is not something that can be solved efficiently by software of any kind with any known approach, including LLMs. What we have is proof-checkers, and those work fine.
The title of the thread seems like nonsense to me. (Score:2)
This is all great and stuff, but. (Score:3)
Re: (Score:2)
There was a great post ages ago from someone here. I cannot remember who said it so I'll paraphrase without attribution in the hope that it will be forthcoming.
The poster pointed out that 1950s sci-fi revealed biases about how hard tasks are (with women's tasks being easy and men's tasks being hard). In the future, domestic chores are done by robot, but astronavigation is done by smart men with a pencil and paper.
Turns out astronavigation is much better done by computer, and domestic chores are relegated to humans.
Not a chance. (Score:2)
Not a chance of this happening, because the current state of AI has no concept that a popular, high-probability next step may not be the correct one simply because it happens to be false.
Re: (Score:2)
That actually is not a problem. Proof _checking_ tools are well established and work really well. The problem is that AI will run into state-space explosion, the same one that makes any other proof-finding tool almost useless.
1 is not a prime number (Score:2)
Is what Gemini tried to convince me of a few weeks ago.
So yes, the AIs can write all the proofs they want. Correct ones? I have my doubts.
Re: (Score:2)
Well, to be honest, this depends on the definition of prime number. According to the most widespread one, the quality of being prime or not applies only to numbers higher than 1, so in this particular case, the AI was right. Maybe it took the answer directly from Wikipedia, but still right. AI can easily be right when the answer to the question it's being asked can be found verbatim in sites like Wikipedia. Ask it a question whose answer is not commonly found in its data loot^H^H^H^Hstash, and watch hallucinations ensue.
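To make the definition the reply relies on precise, a minimal Lean 4 sketch (isPrime is a name chosen here for illustration, not a standard library definition):

-- Primality requires a number of at least 2 with no nontrivial divisors,
-- so 1 fails the very first condition by definition.
def isPrime (n : Nat) : Prop := 2 ≤ n ∧ ∀ m, m ∣ n → m = 1 ∨ m = n

example : ¬ isPrime 1 := fun h => absurd h.1 (by decide)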
Co-author? (Score:2)
CO-author? Sure. A human author can use any minimal amount of LLM-generated text and put the LLM's name on the paper to get attention. Not sure why that would be notable, though...
No, and. (Score:2)
If you're asking this question... a device for doing arithmetic already exists; it's called a calculator.
If you know what a math proof is, why would you be asking this question?
Re: (Score:2)
If you know what a math proof is, why would you be asking this question?
Indeed. Most people do not have any idea, though, and have no clue what makes finding math proofs difficult. Or that automating this has been tried, and has failed, for something like 50 years now, due to combinatorial explosion. Proof-checking, on the other hand, is well established, and there is zero need for any LLM tools.
Could a 'Math Genius' AI Co-author Proofs Within Three Years? (Score:3)
Of course it could. The more interesting question is how it will be acknowledged, by mathematicians, by the AI interests, and by the media.
As with many other fields, computers offer the opportunity to search vast mathematical spaces that are beyond the ability of humans to investigate.
But the as-yet-unanswered question is how to tell a computer to look for something "interesting". In searches for drugs, for example, we might ask the computer to find a drug that will bind to a particular site or otherwise disrupt a metabolic pathway, but asking it to find *new* interesting ideas is beyond description, let alone execution.
This isn't particularly surprising, it's a vanishingly rare skill even amongst humans skilled in the field.
And so it will be with AI proofs. There will be great fanfare the first time an AI proves, with or without guidance from top mathematicians, some previously unsolved conjecture. "ChatGPT proves the Goldbach Conjecture! AI surpasses human intelligence!" But assuming the Goldbach conjecture can be proved using already-known mathematics, then all the hard work is done and all that remains is searching, something that AI is particularly good at. The search space is vast, however, and it's just as likely that the only way an AI can find a proof is with the intuition of genius mathematicians stopping it from going off down unproductive routes. The AI will find the proof, but it may or may not be something it could have done alone.
And, should it happen, it's quite likely that the AI proof will spur countless interesting discoveries by the mathematicians skilled in the field once they see the "trick" to solving that problem. Possibly half a dozen other unsolved problems might fall, not due to AI but due to *understanding*.
The second thing that will happen, but probably will slip most of us by because it won't make the mainstream press and we don't read mathematics journals, is some deep and profound new conjectures (with or without proof) that will be mostly the work of mathematicians but with a good dose of AI assist. This may, or may not, get credited to AI but will involve the sort of intelligent searching that AI can deliver on a scale that isn't possible by humans due to the limited number of them, their low speed relative to computers, and their propensity to make mistakes, particularly when doing boring grunt work.
The final thing that probably will happen eventually but doesn't appear to be on the horizon, is AIs coming up with new and profound conjectures absent the guidance of mathematicians. This is going to be hard to identify, not least because there's a vast difference between a computer coming up with 200 "conjectures", 198 of which are "completely wrong headed", 1 of which is already known and 1 of which the mathematicians looking at its output say "Ohhh, that's interesting" and an AI coming up with 3 conjectures, 2 of which are already known/proved and one is new, novel and interesting.
Re: (Score:2)
Not! It's nothing more than a tool used by an author, like reference textbooks. Citing the LLM's sources would be a good idea, of course.
Of course. (?) (Score:2)
Math is strict rules, applied. Of course an AI could, and very likely will, co-author proofs and test them, if only by brute force. If you build a solid and well-constrained math AI and run it against more or less random math hypotheses, you're likely to come up with new proofs and even theorems by mere chance. And pretty quickly, too. A dedicated high-school kid could probably do this.
Having the AI describe the proof in human-readable form is basically a solved problem by now.
We're talking about AI here. The su
Re: (Score:2)
Math is strict rules, applied. Of course an AI could and very likely will co-author proofs and test them, if only by brute force.
Nonsense. Nothing you can publish is even remotely within reach of brute-force approaches.
Re: (Score:2)
Good thing verifying formal proofs is basically mechanical work and the tools to do it automatically exist already.
Forget it (Score:2)
At least for the math. Proof-checking is not an LLM task, and well-established tools exist. Proof generation is far, far outside what LLMs can do, except for toy examples. This area has been the subject of intense research for 50 years, and all approaches die early due to combinatorial explosion. LLMs will do _worse_ than what has been tried so far, as they are even less able to target their next steps and are considerably slower when searching.
This is just another idiotic AI-hype story that is completely ignorant of the actual state of the art.