Google DeepMind's New AI System Can Solve Complex Geometry Problems (technologyreview.com) 10
An anonymous reader quotes a report from MIT Technology Review: Google DeepMind has created an AI system that can solve complex geometry problems. It's a significant step towards machines with more human-like reasoning skills, experts say. Geometry, and mathematics more broadly, have challenged AI researchers for some time. Compared with text-based AI models, there is significantly less training data for mathematics because it is symbol driven and domain specific, says Thang Wang, a coauthor of the research, which is published in Nature today. Solving mathematics problems requires logical reasoning, something that most current AI models aren't great at. This demand for reasoning is why mathematics serves as an important benchmark to gauge progress in AI intelligence, says Wang.
DeepMind's program, named AlphaGeometry, combines a language model with a type of AI called a symbolic engine, which uses symbols and logical rules to make deductions. Language models excel at recognizing patterns and predicting subsequent steps in a process. However, their reasoning lacks the rigor required for mathematical problem-solving. The symbolic engine, on the other hand, is based purely on formal logic and strict rules, which allows it to guide the language model toward rational decisions. These two approaches, responsible for creative thinking and logical reasoning respectively, work together to solve difficult mathematical problems. This closely mimics how humans work through geometry problems, combining their existing understanding with explorative experimentation.
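The division of labor described above can be sketched as a toy loop. This is a hypothetical simplification, not AlphaGeometry's actual implementation: a deterministic symbolic engine forward-chains from known facts, and when it gets stuck, a stand-in "language model" (here just a scripted list of guesses) proposes an auxiliary fact so deduction can resume.

```python
# Toy neuro-symbolic loop (hypothetical simplification of the idea):
# the symbolic engine deduces rigorously; the "LM" supplies creative hints.

RULES = [
    # (premises, conclusion): if all premises are known, add the conclusion.
    ({"AB = CD", "CD = EF"}, "AB = EF"),
    ({"AB = EF", "angle A = angle E"}, "triangle ABE is isosceles"),
]

def deduce(facts):
    """Forward-chain until no rule adds a new fact (the symbolic engine)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

def mock_language_model(facts):
    """Stand-in for the LM: propose one auxiliary fact not yet known."""
    for suggestion in ["CD = EF"]:  # scripted 'creative' guesses
        if suggestion not in facts:
            return suggestion
    return None

def solve(givens, goal, max_suggestions=5):
    facts = deduce(givens)
    for _ in range(max_suggestions):
        if goal in facts:
            return facts
        hint = mock_language_model(facts)
        if hint is None:
            break
        facts = deduce(facts | {hint})
    return facts if goal in facts else None

result = solve({"AB = CD", "angle A = angle E"}, "triangle ABE is isosceles")
print(result is not None)  # True: one suggested auxiliary fact unblocks the proof
```

The key property mirrored here is that the symbolic engine alone cannot reach the goal from the givens; only after the "creative" suggestion does rigorous deduction close the gap.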
DeepMind says it tested AlphaGeometry on 30 geometry problems at the same level of difficulty found at the International Mathematical Olympiad, a competition for top high school mathematics students. It completed 25 within the time limit. The previous state-of-the-art system, developed by the Chinese mathematician Wen-Tsun Wu in 1978, completed only 10. "This is a really impressive result," says Floris van Doorn, a mathematics professor at the University of Bonn, who was not involved in the research. "I expected this to still be multiple years away." DeepMind says this system demonstrates AI's ability to reason and discover new mathematical knowledge. "This is another example that reinforces how AI can help us advance science and better understand the underlying processes that determine how the world works," said Quoc V. Le, a scientist at Google DeepMind and one of the authors of the research, at a press conference.
Is this new work, or a mashup with Wolfram? (Score:2)
Wolfram Alpha https://www.wolframalpha.com/ [wolframalpha.com] has had specialized math reasoning (descended from Mathematica) for decades. A mashup is exactly what Stephen Wolfram wrote about some time back, and it is the obvious “next step” up from “simple” language models. If this is a mashup, it represents good progress, but that seems like it should be made explicit even in a high-level article such as this one.
If it isn’t, why is “alpha” in the name?
Re: (Score:2)
Re: (Score:2, Insightful)
If it isn’t, why is “alpha” in the name?
I don't think Stephen Wolfram has ownership over the first letter of the Greek alphabet, though no doubt he'd like to take credit for putting Alpha in a product name. And if he wasn't the first, well, he definitely thought of it decades before WolframAlpha was released to the public; and if that didn't happen, I'm sure he has a Greek ancestor.
Flawed definition of intelligence (Score:2)
Re: (Score:1)
Right, because when people think of AI they think of average human intelligence.
"Today we'd like to announce the dawn of a new sentient AI with average human intelligence."
"OMG, what's it doing? Curing cancer? Developing new materials? Extending our vision of the universe?"
"Well, no. Currently it's drunk, watching TikTok, looking up pictures of computers with their cases open, and bingeing The Bachelor. But we're sure we can get it motivated to do all that tomorrow. Got to go, the AI just discovered QA
OK, Let's see it trisect an angle (Score:2)
Why not tie a logic checker to the head of the AI? (Score:2)
Solving mathematics problems requires logical reasoning, something that most current AI models aren't great at.
Why not ask the AI to output the reasoning in formal logic terms, feed this output to some standard deterministic logic-software to check for correct syntax and validity, and reject or perhaps highlight the faulty LLM-generated "logic" before the AI output is finally returned?
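The commenter's proposal can be sketched with a minimal deterministic checker. This is an illustrative example, not any real product's pipeline: the model would emit each reasoning step as a propositional formula, and a brute-force truth-table check accepts tautologies and rejects fallacies before the answer is returned.

```python
# Minimal sketch of a deterministic validity check for model-emitted logic:
# enumerate every truth assignment and verify the formula always holds.
from itertools import product

def implies(p, q):
    return (not p) or q

def is_valid(formula, num_vars):
    """True iff the formula holds under every assignment (a tautology)."""
    return all(formula(*assignment)
               for assignment in product([False, True], repeat=num_vars))

# A sound step the model might emit: modus ponens, ((p -> q) and p) -> q.
modus_ponens = lambda p, q: implies(implies(p, q) and p, q)

# A classic fallacy: affirming the consequent, ((p -> q) and q) -> p.
affirming_consequent = lambda p, q: implies(implies(p, q) and q, p)

print(is_valid(modus_ponens, 2))          # True: keep this step
print(is_valid(affirming_consequent, 2))  # False: reject it and retry
```

Real systems that take this route use full proof assistants or SMT solvers rather than truth tables, since geometry statements are not propositional, but the accept/reject gate works the same way.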
The correct AI answer would be: "Sorry, I did my best. Here is my idea, and here is where the argument fails. I did try 14,000,605 times, and this is my least wrong attempt."