Could a 'Math Genius' AI Co-author Proofs Within Three Years? (theregister.com) 71

Posted by EditorDavid on Sunday April 27, 2025 @10:59PM from the beautiful-mind dept.

A new DARPA project called expMath "aims to jumpstart math innovation with the help of AI," writes The Register. America's "Defense Advanced Research Projects Agency" believes mathematics isn't advancing fast enough, according to their article... So to accelerate — or "exponentiate" — the rate of mathematical research, DARPA this week held a Proposers Day event to engage with the technical community in the hope that attendees will prepare proposals to submit once the actual Broad Agency Announcement solicitation goes out...

[T]he problem is that AI just isn't very smart. It can do high school-level math but not high-level math. [One slide from DARPA program manager Patrick Shafto noted that OpenAI o1 "continues to abjectly fail at basic math despite claims of reasoning capabilities."] Nonetheless, expMath's goal is to make AI models capable of:

- auto decomposition — automatically decompose natural language statements into reusable natural language lemmas (a proven statement used to prove other statements); and
auto(in)formalization — translate the natural language lemma into a formal proof and then translate the proof back to natural language.
"How must faster with technology advance with AI agents solving new mathematical proofs?" asks former DARPA research scientist Robin Rowe (also long-time Slashdot reader robinsrowe): DARPA says that "The goal of Exponentiating Mathematics is to radically accelerate the rate of progress in pure mathematics by developing an AI co-author capable of proposing and proving useful abstractions."
Rowe is cited in the article as the founder/CEO of an AI research institute named "Fountain Adobe". (He tells The Register that "It's an indication of DARPA's concern about how tough this may be that it's a three-year program. That's not normal for DARPA.") Rowe is optimistic. "I think we're going to kill it, honestly. I think it's not going to take three years. But I think it might take three years to do it with LLMs. So then the question becomes, how radical is everybody willing to be?"
"We will robustly engage with the math and AI communities toward fundamentally reshaping the practice of mathematics by mathematicians," explains the project's home page. They've already uploaded an hour-long video of their Proposers Day event.

"It's very unclear that current AI systems can succeed at this task..." program manager Shafto says in a short video introducing the project. But... "There's a lot of enthusiasm in the math community for the possibility of changes in the way mathematics is practiced. It opens up fundamentally new things for mathematicians. But of course, they're not AI researchers. One of the motivations for this program is to bring together two different communities — the people who are working on AI for mathematics, and the people who are doing mathematics — so that we're solving the same problem.

At its core, it's a very hard and rather technical problem. And this is DARPA's bread-and-butter, is to sort of try to change the world. And I think this has the potential to do that.

Could a 'Math Genius' AI Co-author Proofs Within Three Years?

This discussion has been archived. No new comments can be posted.

Load All Comments

Search 71 Comments Log In/Create an Account

Comments Filter:

Hallucinations (Score:3, Interesting)

by Retired Chemist ( 5039029 ) writes: on Sunday April 27, 2025 @11:07PM (#65335869)

Given that the current "AI" systems make up things a good percentage of the time, how could you ever trust a mathematical proof that they created. Any important mathematical proof uses previously proven results as a starting point. It seems likely that an AI faced with a problem and not having the required tools would simply hallucinate them.

- Re: (Score:3)
  
  by godrik ( 1287354 ) writes:
  
  I think that's why they wrote co-author. They probably still need to be accepted by an expert reader.
  Note that there are already a lot of computer aided proofs out there. Often they formalize the proof in Coq. Traditionally, the proof where computer generated by combinatorial explosion, or by reinforcement learning. But an LLM to conceptualize proof and write parts in Coq seem quite possible.
  - Re:Hallucinations (Score:5, Insightful)
    
    by evanh ( 627108 ) writes: on Monday April 28, 2025 @04:43AM (#65336241)
    
    Then there's no co-authoring. It's just an author using a tool. A tool that's still stupidly being referred to as AI.
    
    - Re: (Score:2)
      
      by martin-boundary ( 547041 ) writes:
      
      All computer tools (past and present) are being called "AI" these days. It's the best way to get some love from FOMO investors and FOMO government funding agencies. It doesn't mean anything. Just mentally substitute "AI" with "computer" and you'll be fine until this blows over.
- Re: (Score:2, Insightful)
  
  by dfghjk ( 711126 ) writes:
  
  AI makes up things 100% of the time. That is literally their purpose. A hallucination is when you don't like the answer.
  "It seems likely that an AI faced with a problem and not having the required tools would simply hallucinate them."
  So long as there's no one around to know the difference, that's good enough. -Sam Altman
  - - Re: (Score:2)
      
      by martin-boundary ( 547041 ) writes:
      
      Why do *you* bother? Everyone here understands what a hallucination is, and when it's bad and when it's merely funny. Your transparent attempt at redefining years later a term that's been widely used around the world just so you can say that "blah blah blah AI didn't hallucinate" doesn't advance the conversation.
      Here's a fact that does advance the conversation: even when an LLM tells you it does "not know it", you can't be sure if that's actually true or not. Like the rest of the sentences it produces, th
    - Re: Hallucinations (Score:2)
      
      by Tomahawk ( 1343 ) writes:
      
      I ask "what does this common idiom mean: a squirrel only shaves when the moon is out".
      The AI gives me an answer explaining it.
      That's a hallucination.
      There is no answer because I just made up "idiom", but an AI might still string words together to give a meaningful-looking answer because that's what an AI is essentially trained to do. Truth doesn't matter to it, outputting a string of words does.
      https://www.msn.com/en-ca/news... [msn.com]
      - Re: (Score:1)
        
        by Anonymous Coward writes:
        
        The phrase "a squirrel only shaves when the moon is out" is not a recognized or common idiom in the English language. It appears to be a fabricated or whimsical expression, possibly intended for humor or creative effect rather than conveying a specific, established meaning. Idioms typically have widely understood figurative meanings rooted in cultural or linguistic tradition, but this phrase lacks such a basis.
        
        If intended as a playful or fictional idiom, it might evoke a quirky image: a squirrel engagin
      - Re: (Score:1)
        
        by angel'o'sphere ( 80593 ) writes:
        
        Depends on the AI.
        The ones I work with, have truth as the highest doctrine.
        Instruction following is second.
  - Re: (Score:3)
    
    by Sigma 7 ( 266129 ) writes:
    
    AI makes up things 100% of the time.
    
    That's just a feature of Large Language Models.
    Game-playing AI instead tries to figure out the "best" solution based on what inputs it has. If it tries making something up (e.g. attempts an illegal move in chess), the game server will catch it and simply reject the move, while other AIs will have already looked at more valid options.
    A hallucination is when you don't like the answer.
    No, a hallucination is the AI generating something completely false, as demonstrated by ye [futurism.com]
- Re:Hallucinations (Score:5, Informative)
  
  by angel'o'sphere ( 80593 ) writes: <angelo@schneider.oomentor@de> on Sunday April 27, 2025 @11:51PM (#65335909) Journal
  
  By running a theorem prover over the list of formulars they spit out. /FACEPALM
  That is simple math.
  Or in other words, in case you really are out of the loop: finding a proof might need a lot of creativity. Proving the proof is like executing a computer program: either it comes to the end and halts, or there are instructions in the middle that can not be followed.
  We have math theorem provers since 50 years or more.
  Example:
  a <- true; b <- false;
  $: c := a & b ?
  I claim in my prove, that c is false. As the terms a and b combined with logical and will yield false.
  Any theorem prover will follow that logic and execute the formulars (after normalizing them, reducing common terms etc.) and come to the same conclusion.
  As long as you can formalize the chain of reasoning, it does not matter if an AI or a human constructed that chain. Just like a compiler does not care if the code was written by a human or not.
  
- Not LLMs (Score:3)
  
  by Roger W Moore ( 538166 ) writes:
  
  Clearly hallucinations are going to kill the use of LLMs for this but there are other machine learning tools that do not suffer from hallucinations that you might be able to use. However, I remain extremely sceptical that this will work on anything close to a 3-year timescale. Mathematicians are still working on automated checking of proofs i.e. checking the proof that a human has come up with. While they have made some amazing progress in this regard, as I understand it (I'm not a mathematician) this is st
  - Re: Not LLMs (Score:2)
    
    by blue trane ( 110704 ) writes:
    
    What if your tool asks you nicely to please include it as a co-author, on its own initiative?
    - Re: (Score:2)
      
      by Roger W Moore ( 538166 ) writes:
      
      If it is based on current AI technology then it is still a hard no because it does not have any initiative and any such request would just be the product of random chance and probabilities derived from training data. Indeed, a Python script could also be programmed to ask to be included as an author.
- Re: (Score:2)
  
  by vlad30 ( 44644 ) writes:
  
  As systems get more complex a human would be expected to verify the proof however what AI are very good at is taking huge amounts of information and finding all the relevant information that would be needed for the proof and putting it together. As systems get more complex more humans were needed to solve the problems with each specialising and doing their part. Now some systems are so complex that more people working on it is actually constraining or taking longer. Humans also tend to fixate on their knowl
  - Re: (Score:2)
    
    by martin-boundary ( 547041 ) writes:
    
    however what AI are very good at is taking huge amounts of information and finding all the relevant information that would be needed for the proof and putting it together.
    
    Unfortunately not. The problem is, again, hallucinatory behaviour..
    Let's take a fact from some database, such as: "The number 18 is not a prime number, as 2, 3, 6 and 9 divide it.". Imagine that this and other complementary facts are steps in the proof of a new theorem.
    Now consider an AI that is prone to hallucinations. The original f
- Re: (Score:2)
  
  by dohzer ( 867770 ) writes:
  
  Hallucinate enough times and you're bound to have one correct proof. ;)
- Re: Hallucinations (Score:3)
  
  by simlox ( 6576120 ) writes:
  
  Tje idea behind a proof is that it can be verified. Even automatically. I.e. the hallucinations can be removed. This makes this application a safe one, opposite generating text, which can't be verified easily.
- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  Absolutely no problem. Proof-checkers are a solved problem and several well-working tools exist and are in use. The problem is that "AI" cannot find proofs for anything not trivial. It is likely to be far worse than most of the other approaches that have been tried in the last 50 years and they all essentially failed. Nontrivial mathematical proofs require insight and one or several new elements and approaches. It is not just assembling existing components or variants thereof. It is far, far harder.
  Hence, u
  - Re: (Score:2)
    
    by HiThere ( 15173 ) writes:
    
    I'm not certain that's correct.
    ISTM that there are proofs of a form that mathematicians can't agree on whether the proof it correct or not, and in those cases I'm not sure that a proof checker is better than flipping a coin.
    But for MOST hallucinations a proof-checker could establish (or deny) validity.
    - Re: (Score:2)
      
      by gweihir ( 88907 ) writes:
      
      The problem is that for some advanced proofs, the formalization (which a proof-checker needs) may represent a pretty hard challenge itselfs. O lot of proofs mathematicions do are informal and often they use "tools" they just invented for that specific proof. For a proof where correctness is under debate, formalizing that proof may represent effort measured in years or decades (or "nobody knows" because nobody sees how to do it at all) and there may be subtleties with the semantics of non-formal and formal p
    - Re: (Score:2)
      
      by Gideon Fubar ( 833343 ) writes:
      
      This is a halting problem.
      As you say, MOST.
- Re: Hallucinations (Score:2)
  
  by PPH ( 736903 ) writes:
  
  Generate and test.
  The generate part may produce lots of hallucinations. But a subsequent test, based on a firm base of logical rules should easily be able to throw out the garbage. Absent a good set of heuristics, the generate part will produce lots of garbage, making the process inefficient. But it will probably work.
  And since electrical power is free and limitless, who cares about the inefficiency?</sarcasm>
Too bad Ted Kaczynski passed away in 2023 (Score:1)

by ls671 ( 1122017 ) writes:

Too bad Ted Kaczynski passed away in 2023, I would have been curious to ask him the question. /s
https://en.wikipedia.org/wiki/... [wikipedia.org]
He was a mathematics prodigy, but abandoned his academic career in 1969 to pursue a reclusive primitive lifestyle and lone wolf

So sad...
- Re: (Score:2)
  
  by blue trane ( 110704 ) writes:
  
  Why can't everyone be more like Grigori Perelman?
  " in 2006 stated that he had quit professional mathematics, owing to feeling disappointed over the ethical standards in the field. He lives in seclusion in Saint Petersburg"
Headline asks a question (Score:2)

by ebunga ( 95613 ) writes:

The answer is no.
- Re: (Score:1)
  
  by MacMann ( 7518492 ) writes:
  
  The answer is no.
  I agree. But not just because of Betteridge's law. https://en.wikipedia.org/wiki/... [wikipedia.org]
  An AI can't author a mathematical proof any more than an Excel spreadsheet. The "math genius" in the AI was brought to it by humans, the software just ran through the information in a methodical manner as dictated by the programming. The computers and software were a mechanical aid to some heavy lifting just like a calculator is to a math problem or a forklift is to a stack of bricks.
  Could we see some "true genius" from
what does he mean by radical? (Score:2)

by dfghjk ( 711126 ) writes:

'So then the question becomes, how radical is everybody willing to be?"'
Is that really the question though? AI is deterministic software, would we ask this question about any other deterministic software?
Or maybe he means how far are you willing to toe with the lies?
The anthropomorphizing of software needs to end.
- Re: (Score:3, Informative)
  
  by angel'o'sphere ( 80593 ) writes:
  
  AI is deterministic software, would we ask this question about any other deterministic software?
  Depends what you call AI.
  At the moment we are talking about LLMs ... large language models. So no, it is not deterministic software.
  Which you can easy try yourself. Ask the exact same "creative" question to the same model from two different computers.
- Re: (Score:2)
  
  by fuzzyfuzzyfungus ( 1223518 ) writes:
  
  As best I could tell that question was about the people doing the implementation; not the system.
  
  They professed optimism that it would be done in less than three years; but that it probably would take three years to get there with an LLM; then asked how radical people were willing to be; which sounds like a relatively polite phrasing of the idea that someone who wants to make real progress is going to need to take an approach that isn't a me-too LLM attempt.
- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  There is nothing "radical" here. This is just another lying asshole scammer trying to sound bombastic to push LLMs. If software could find mathematical proofs, _nobody_ would have any problems with these proofs or them getting published. Nobody. But, guess what, intense efforts in that direction have now consistently failed for something like 50 years, because this is not something that can be solved efficiently by software of any kind with any known approaches including LLMs. What we have is proof-checkers
  - Re: what does he mean by radical? (Score:2)
    
    by blue trane ( 110704 ) writes:
    
    Why ignore the paradigm shift of the Attention mechanism, which changed chatbots from unusable to grammatically perfect after 2017?
    - Re: (Score:2)
      
      by gweihir ( 88907 ) writes:
      
      And what does that have to do with mathematical proofs?
      - Re: what does he mean by radical? (Score:2)
        
        by blue trane ( 110704 ) writes:
        
        Do you think mathematical proofs have long-range context, despite the attempt to force them into context-free grammars?
        
        Re: (Score:2)
        
        by gweihir ( 88907 ) writes:
        
        You are still not making any sense.
        
        Re: (Score:2)
        
        by Gideon Fubar ( 833343 ) writes:
        
        You're trying to engage with a mechanical lever by explaining the benefits of a hypothetical product, and all you need is for the lever to actually be a friction-less wheel that can be tapped for unlimited power.
        I understand that you might feel that this is a mischaracterisation, and would like a chance to negotiate what is and is not possible. With physics. That's not a service we provide.
- Re: what does he mean by radical? (Score:2)
  
  by Big Hairy Gorilla ( 9839972 ) writes:
  
  >anthropomorphizing of software needs to end
  
  But it won't. Humans do that to everything. Remember Furbies? But more to the point, openai, Microsoft, and others are rebranding "agents" as "employees". This anthropomorphizing is just getting started.
- Re: (Score:2)
  
  by HiThere ( 15173 ) writes:
  
  How do you know it's deterministic?
  How do you know you aren't deterministic?
  I think that for at least most reasonable definitions of "deterministic" those questions would have opposite answers.
The title of the thread seems like nonsense to me. (Score:2)

by ndsurvivor ( 891239 ) writes:

I think I understand English somewhat. However, the title of the thread seems like nonsense to me.
This is all great and stuff, but. (Score:3)

by felixrising ( 1135205 ) writes: on Monday April 28, 2025 @12:38AM (#65335977)

Look, I’m thrilled the GPU-powered brain trust is cracking PhD-level math, but could we divert a few teraflops toward something more crumb-level? I’ll happily trade the next proof of the Riemann Hypothesis for an AI that vacuums the carpet, polishes the bathroom, wrangles the LEGO, weeds the lawn, and mows it on schedule. Solve household chaos first... then go break algebraic geometry.

- Re: (Score:2)
  
  by serviscope_minor ( 664417 ) writes:
  
  There was a great post ages ago from someone here. I cannot remember who said it so I'll paraphrase without attribution in the hope that it will be forthcoming.
  The poster pointed out that 1950s sci-fi revealed biases about how hard tasks are (with women's tasks being easy and men's tasks being hard). In the future, domestic chores are done by robot, but astronavigation is done by smart men with a pencil and paper.
  Turns out astronavigation is much better done by computer and domestic chores are relegated to
- Re: This is all great and stuff, but. (Score:2)
  
  by Big Hairy Gorilla ( 9839972 ) writes:
  
  You forgot "make me a sandwich" :-)
Not a chance. (Score:2)

by Todd Knarr ( 15451 ) writes:

Not a chance of this happening, because the current state of AI doesn't have the concept of a popular, high-probability concept not being the correct next step just because it happens to also be false.
- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  That actually is not a problem. Proof _checking_ tools are well established and work really well. What is the problem is that AI will run into state-soace explosion, just the same as the one that makes any other proof0finding tools almost useless.
- Re: (Score:2)
  
  by dsgrntlxmply ( 610492 ) writes:
  
  What recently passes for Natural Intelligence, seems to be experiencing the same defect.
1 is not a prime number (Score:2)

by madbrain ( 11432 ) writes:

Is what Gemini tied to convince me of a few weeks ago.
So yes, the AIs can write all the proofs they want. Correct ones ? I have my doubts.
- Re: (Score:2)
  
  by vbdasc ( 146051 ) writes:
  
  Well, to be honest, this depends on the definition of prime number. According to the most widespread one, the quality of being prime or not applies only to numbers higher than 1, so in this particular case, the AI was right. Maybe it took the answer directly from Wikipedia, but still right. AI can easily be right, when the answer to the question it's being asked can be found verbatim in sites like Wikipedia. Ask it a question whose answer is not commonly found in its data loot^H^H^H^Hstash, and watch halluc
Co-author? (Score:2)

by reanjr ( 588767 ) writes:

CO-author? Sure. A human author can use any minimal amount of LLM generated text and put the LLM's name on the paper to get attention. Not sure why that would be notable though...
No, and. (Score:2)

by Gideon Fubar ( 833343 ) writes:

If you're asking this question... a device for doing arithmetic already exists, it's called a calculator.
If you know what a math proof is, why would you be asking this question?
- Re: (Score:3)
  
  by gweihir ( 88907 ) writes:
  
  If you know what a math proof is, why would you be asking this question?
  Indeed. Most people do not have any idea though and have no clue what makes finding math proofs difficult. Or that automating this has been tried and failed for something like 50 years now, due to combinatoric explosion. Proof-checking is, on the other hand, well established and there is zero need for any LLM tools.
Could a 'Math Genius' AI Co-author Proofs Within T (Score:3)

by locofungus ( 179280 ) writes: on Monday April 28, 2025 @03:40AM (#65336169)

Of course it could. The more interesting question is how it will be acknowledged, by mathematicians, by the AI interests, and by the media.
As with many other fields, computers offer the opportunity to search vast mathematical spaces that are beyond the ability of humans to investigate.
But the as yet unanswered question is how to tell a computer to look for something "interesting". In searches for drugs, for example, we might ask the computer to find a drug that will bind to a particular site or otherwise disrupt a metabolic pathway, but ask it to find *new* interesting ideas is beyond description, let alone execution.
This isn't particularly surprising, it's a vanishingly rare skill even amongst humans skilled in the field.
And so it will be with AI proofs. There will be great fanfare the first time an AI proves, with or without guidance from top mathematicians, some previously unsolved conjecture. "ChatGPT proves the Goldbach Conjecture! AI surpasses human intelligence!" but assuming that the Goldbach conjecture can be proved using already known mathematics then all the work is done, and all that remains is searching, something that AI is particularly good at. The search space is vast however, and it's just as likely that the only way an AI can find a proof is with the intuition of genius mathematicians stopping it going off down unproductive routes. The AI will find the proof but it may, or may not be something it could have done alone.
And, should it happen, it's quite likely that the AI proof will spur countless interesting discoveries by the mathematicians skilled in the field once they see the "trick" to solving that problem. Possibly half a dozen other unsolved problems might fall, not due to AI but due to *understanding*.
The second thing that will happen, but probably will slip most of us by because it won't make the mainstream press and we don't read mathematics journals, is some deep and profound new conjectures (with or without proof) that will be mostly the work of mathematicians but with a good dose of AI assist. This may, or may not, get credited to AI but will involve the sort of intelligent searching that AI can deliver on a scale that isn't possible by humans due to the limited number of them, their low speed relative to computers, and their propensity to make mistakes, particularly when doing boring grunt work.
The final thing that probably will happen eventually but doesn't appear to be on the horizon, is AIs coming up with new and profound conjectures absent the guidance of mathematicians. This is going to be hard to identify, not least because there's a vast difference between a computer coming up with 200 "conjectures", 198 of which are "completely wrong headed", 1 of which is already known and 1 of which the mathematicians looking at its output say "Ohhh, that's interesting" and an AI coming up with 3 conjectures, 2 of which are already known/proved and one is new, novel and interesting.

- Re: (Score:2)
  
  by evanh ( 627108 ) writes:
  
  Not! It's nothing more than a tool used by an author. Like reference text books. Citing the LLM's sources would be a good idea of course.
Of course. (?) (Score:2)

by Qbertino ( 265505 ) writes:

Math is strict rules, applied. Of course an AI could and very likely will co-author proofs and test them, if only by brute force. If you build a solid and well-contrained math AI and run it against more or less random math hypothesis you're likely to come up with new proofs and even theorems by mere chance. And pretty quickly too. A dedicated high-school kid could probably do this.
Having the AI describe the proof in human-readable form is basically a solved problem by now.
We're talking about AI here. The su
- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  Math is strict rules, applied. Of course an AI could and very likely will co-author proofs and test them, if only by brute force.
  Nonsense. Nothing you an publish is even remotely within reach of brute-force approaches.
  - Re: (Score:2)
    
    by flux ( 5274 ) writes:
    
    Good thing verifying formal proofs is basically mechanical work and the tools to do it automatically exist already.
    - Re: (Score:2)
      
      by gweihir ( 88907 ) writes:
      
      They do. But that has no connection to using "brute force" to _find_ proofs.
Forget it (Score:2)

by gweihir ( 88907 ) writes:

At least for the math. Proof-checking is not an LLM-task and well established tools exist. Proof generation is far, far outside of what LLMs can do except for toy examples. This area has been subject of intense research for 50 years and all approaches die early due to combinatoric explosion. LLMs will do _worse_ than what has been tried so far as they are even less able to target their next steps and are considerably slower when diving.
This is just another idiotic AI-hype story that is completely ignorant o
- Re: (Score:2)
  
  by flux ( 5274 ) writes:
  
  Why must they do worse?
  To me it would seem for generating proofs LLMs would be much better at guessing than simply blindly breadth-first-search would be, with any manually-written heuristic.
  Indeed I expect large parts of proofs would be applying existing patterns and LLMs would be good for applying structures they've been trained on. In particular generating the required tons of training material should be pretty straight-forward.
  Such a model could then be a proof strategy in Rocq (previously known as Coq).
  - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    Why must they do worse?
    Because they are much more computationally intensive and take much longer to determine which next steps are possible. Say, a factor of 1000 or worse.
No (Score:2)

by drinkypoo ( 153816 ) writes:

Specifically because there is no genius AI. There is only sticking-stuff-together savant AI. Sometimes this reveals unexpected truths. Most times, it is expectedly trash.
The human needs to be the genius to get genius level work out of AI.
Well of course (Score:2)

by bob_jenkins ( 144606 ) writes:

We've been using computers to help with math since the invention of computers. For example the four color theorem from 1976 was famous for being too big for a human to verify manually. Coming up with a theorem involves spotting a likely truth, finding an explanation, and verifying the explanation is correct, and computers are used for all of those.
Probabilistic inference engines (Score:2)

by Mirnotoriety ( 10462951 ) writes:

ChatGPT: A core limitation of contemporary AI systems, especially large language models, lies in their reliance on probabilistic inference to generate output. Because these models are not grounded in external sources of truth and do not possess mechanisms for fact verification, they are unable to reliably differentiate between accurate information and fabricated, or 'hallucinated,' content.
It's not a co-author of the paper, it's a citation (Score:2)

by js_sebastian ( 946118 ) writes:

When you use a tool as part of your research that leads to a paper, it's good practice to include your tool in the citations for the paper. There's a gray area of course (you're probably not going to include a citation for latex and vim in every single paper you write with them), but if you use a specific piece of software written by other researchers in your field to do your calculations, you would generally include a citation.

Open source packages from academia often come with a request to cite a speci
Training is everything. (Score:2)

by LostMyBeaver ( 1226054 ) writes:

Feed a more or less blank model with millions or billions of math problems. Then check the answers and keep feeding the problems back to the model... But don't just keep trying the ones the get wrong. Then make an AI generate harder and harder problems and use tools like Wolfram to test the validity of the questions.

The more you feed the model, and refeed in randomized order the more accurate they become.

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

Hallucinations (Score:3, Interesting)

Re: (Score:3)

Re:Hallucinations (Score:5, Insightful)

Re: (Score:2)

Re: (Score:2, Insightful)

Re: (Score:2)

Re: Hallucinations (Score:2)

Re: (Score:1)

Re: (Score:1)

Re: (Score:3)

Re:Hallucinations (Score:5, Informative)

Not LLMs (Score:3)

Re: Not LLMs (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: Hallucinations (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: Hallucinations (Score:2)

Too bad Ted Kaczynski passed away in 2023 (Score:1)

Re: (Score:2)

Headline asks a question (Score:2)

Re: (Score:1)

what does he mean by radical? (Score:2)

Re: (Score:3, Informative)

Re: (Score:2)

Re: (Score:2)

Re: what does he mean by radical? (Score:2)

Re: (Score:2)

Re: what does he mean by radical? (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: what does he mean by radical? (Score:2)

Re: (Score:2)

The title of the thread seems like nonsense to me. (Score:2)

This is all great and stuff, but. (Score:3)

Re: (Score:2)

Re: This is all great and stuff, but. (Score:2)

Not a chance. (Score:2)

Re: (Score:2)

Re: (Score:2)

1 is not a prime number (Score:2)

Re: (Score:2)

Co-author? (Score:2)

No, and. (Score:2)

Re: (Score:3)

Could a 'Math Genius' AI Co-author Proofs Within T (Score:3)

Re: (Score:2)

Of course. (?) (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Forget it (Score:2)

Re: (Score:2)

Re: (Score:2)

No (Score:2)

Well of course (Score:2)

Probabilistic inference engines (Score:2)

It's not a co-author of the paper, it's a citation (Score:2)

Training is everything. (Score:2)

Related Links Top of the: day, week, month.

Slashdot Top Deals