Machine Figures Out Rubik's Cube Without Human Assistance (technologyreview.com) 86
An anonymous reader quotes a report from MIT Technology Review: [Stephen McAleer and colleagues from the University of California, Irvine] have pioneered a new kind of deep-learning technique, called "autodidactic iteration," that can teach itself to solve a Rubik's Cube with no human assistance. The trick that McAleer and co have mastered is to find a way for the machine to create its own system of rewards. Here's how it works. Given an unsolved cube, the machine must decide whether a specific move is an improvement on the existing configuration. To do this, it must be able to evaluate the move. Autodidactic iteration does this by starting with the finished cube and working backwards to find a configuration that is similar to the proposed move. This process is not perfect, but deep learning helps the system figure out which moves are generally better than others. Having been trained, the network then uses a standard search tree to hunt for suggested moves for each configuration.
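The backwards-from-solved training loop the article describes can be sketched in a few lines. To be clear, this is a toy stand-in and not the paper's method: a ring of 20 positions replaces the cube, and a lookup table of scramble depths replaces the deep network. The reward-free trick is the same, though: scramble the solved state and use the scramble depth as a noisy distance-to-solved label, then search greedily against that learned estimate.

```python
import random

# Toy stand-in for autodidactic iteration: states are positions on a
# ring of size N, "solved" is 0, and the two moves step left or right.
# Training data comes from scrambling the *solved* state backwards;
# the scramble depth serves as a noisy label for distance-to-solved.
N = 20
MOVES = (-1, +1)

def apply(state, move):
    return (state + move) % N

def generate_training_data(num_scrambles=2000, max_depth=12):
    data = {}
    for _ in range(num_scrambles):
        state = 0  # always start from the solved state
        for depth in range(1, max_depth + 1):
            state = apply(state, random.choice(MOVES))
            # keep the smallest depth seen: a crude distance estimate
            data[state] = min(data.get(state, depth), depth)
    return data

def solve(state, value, max_steps=30):
    """Greedy search: repeatedly take the move whose successor the
    learned value table rates closest to solved."""
    path = []
    for _ in range(max_steps):
        if state == 0:
            return path
        move = min(MOVES, key=lambda m: value.get(apply(state, m), max_steps))
        state = apply(state, move)
        path.append(move)
    return None  # gave up

random.seed(0)
value = generate_training_data()
print(solve(7, value))  # a short move sequence reaching the solved state
```

The real system replaces the lookup table with a neural network (so it generalizes to unseen states) and the greedy loop with a tree search, but the shape of the training signal is the same.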
The result is an algorithm that performs remarkably well. "Our algorithm is able to solve 100% of randomly scrambled cubes while achieving a median solve length of 30 moves -- less than or equal to solvers that employ human domain knowledge," say McAleer and co. That's interesting because it has implications for a variety of other tasks that deep learning has struggled with, including puzzles like Sokoban, games like Montezuma's Revenge, and problems like prime number factorization. The paper on the algorithm -- called DeepCube -- is available on arXiv.
Re:Yawn (Score:5, Insightful)
it has implications for a variety of other tasks that deep learning has struggled with, including... problems like prime number factorization
If it could help with finding the prime factorization of large semiprimes (numbers that are the product of exactly two primes), then that would be quite useful.
*cough* cryptography
Re:Yawn (Score:4, Funny)
Actually that's a great idea: knapsack problem (Score:3)
While Ikea furniture is designed with assembly in mind, other things are not. Say, for example, an airplane. So the assembly process might not be optimal, and letting the computer look for a better process might be useful.
Or, more practically, packing items into a shipping box: the famous knapsack problem.
I hate these Slashdot summaries of algorithms. You end up thinking, gosh, that's stupid, when it's not; just the description is stupid, like a car analogy.
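For reference, the 0/1 knapsack version of that packing problem has a classic dynamic-programming solution. The weights and values below are made up for illustration:

```python
def knapsack(items, capacity):
    """Classic 0/1 knapsack via dynamic programming: items is a list of
    (weight, value) pairs; returns the best total value that fits."""
    best = [0] * (capacity + 1)
    for weight, value in items:
        # iterate capacities downwards so each item is used at most once
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]

# Hypothetical shipment: (weight, value) pairs and a box of capacity 10.
print(knapsack([(5, 10), (4, 40), (6, 30), (3, 50)], 10))  # 90
```

This exact DP runs in O(items x capacity) time, which is pseudo-polynomial; the general knapsack problem is still NP-hard, which is why learned heuristics are interesting for it.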
Re: (Score:2)
While Ikea furniture is designed with assembly in mind other things are not. Say for example, an airplane.
I would assume that an airplane designer considers assembly (and maintenance, which is even harder) in every part of the design.
Re: (Score:2)
If you can make the plane 5% lighter with twice the assembly time, they go for it.
In order to make the optimal choice, you have to know in advance how much assembly time is required for each of the various options.
Re: (Score:2)
Interesting name for a game (Score:1)
:D :D :D :D
Traveler's diarrhea - Wikipedia
https://en.wikipedia.org/wiki/Traveler%27s_diarrhea
"Montezuma's revenge (var. Moctezuma's revenge) is a colloquial term for traveler's diarrhea contracted in Mexico."
Wow amazing! (Score:2, Funny)
Re:Wow amazing! (Score:5, Insightful)
Games are easy for "AI" because games have strict rules that a modeler can account for/predict.
Re:Wow amazing! (Score:4, Insightful)
Games are easy for "AI" because games have strict rules
Just because the rules are strict (or even simple) does not mean that the game is easy. You can achieve arbitrary complexity by iterating the rules a large number of times. For example, the rules of Go are strict, but the question of whether a given board position is winning for White is hard. The rules of a programming language are strict. Writing a Linux kernel is hard. The rules of math are strict. Providing a proof of Fermat's Last Theorem is hard. The rules of physics and soccer are strict. Making a robot that can beat a human at the game is hard.
Re: (Score:2)
It's still far from AI, though.
Making a robot that runs around after a ball and follows the rules of soccer is indeed hard.
Also, in this case, the "AI" was shown the desired end result. It's all very neat, except the headline says it solved the cube on its own, which would have been a feat if it had figured out semi-randomly, on its own, that this is the configuration it's supposed to reach. After that it's just a trial-and-error loop, so remind me again: how is this AI?
if you had _Actual_ friggin
Re: (Score:2)
after that it's just trial and error loop so remind me again how is this ai?
It's not just trial and error. The Rubik's Cube has about 4.3 x 10^19 combinations, and most of them look like fully scrambled cubes. You cannot randomly try things until you stumble on the solution. The AI part is where it learns the patterns that tell it that it's making progress. Most humans who come up with a solution to the Rubik's Cube start by first solving one side, then the second layer, and then the top. This AI system has done something similar, except it doesn't work in layers, but
Re: (Score:2)
Exactly.
Re: (Score:2)
Dominance. Control. Accumulation of wealth. Fighting over resources. etc ....
Heuristics optimization. It has worked well for us so far.
Humanity would be best served by AI overlords.
And yet in practically every book or movie involving the 'benign' transfer of social control over to benevolent overlords, the humans end up unhappy. Even if not explicitly stated, such control is generally perceived as 'evil'.
Re: (Score:2)
Hehehehe, nice. No, not actually AI, just a planning algorithm of the kind that has been used and researched for more than 50 years now. This is another instance of machines getting faster, not of them getting any smarter. On the face of it, the Rubik's Cube is a very simple problem with a very simple description and a low number of states. Sure, the number of states is actually pretty large in absolute terms, but for a planning problem it is not that large, and, in particular, the score for a state is downright simplistic: the number of moves until it is solved.
Re: (Score:2)
Hehehehe, nice. No, not actually AI, just a planning algorithm of the kind that has been used and researched for more than 50 years now. This is another instance of machines getting faster, not of them getting any smarter. On the face of it, the Rubik's Cube is a very simple problem with a very simple description and a low number of states. Sure, the number of states is actually pretty large in absolute terms, but for a planning problem it is not that large, and, in particular, the score for a state is downright simplistic: the number of moves until it is solved.
Real-world planning problems are nowhere near that simple. Hence while we will continue to see these stunts, we will not see any real-world problems solved this way for a long, long time, if ever. Also take into account that single-core CPU speed scaling is dead and that most planning algorithms are, in the end, strongly constrained by single-core speeds.
That's not all that off base, but it's not quite correct either. You are just missing some context. This appears to be using a part of AI called Reinforcement Learning (I did my thesis on RL at CMU 20 years ago). In RL, there are a bunch of techniques for taking a pre-defined domain with a reward function that shapes the behavior learned by the system. There are plenty of algorithms (Value Iteration, Policy Iteration, Q-Learning, etc.) for solving a problem with an existing reward function and a discount factor.
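For the curious, here is what Value Iteration looks like in miniature. The domain below is a made-up five-state corridor, not the DeepCube setup: the agent pays -1 per move, so the converged value of each state is minus its distance from the goal.

```python
# Minimal Value Iteration on a toy deterministic "puzzle": states 0..4,
# moves step left/right (clamped at the ends), state 0 is the goal.
# Reward is -1 per move, so the value of a state converges to minus
# its distance from the goal.
N = 5
MOVES = (-1, +1)
GAMMA = 1.0  # no discounting needed for this tiny episodic example

def step(state, move):
    return max(0, min(N - 1, state + move))

values = [0.0] * N
for _ in range(50):  # sweep until convergence
    for s in range(1, N):  # the goal state keeps value 0
        values[s] = max(-1 + GAMMA * values[step(s, m)] for m in MOVES)

print(values)  # [0.0, -1.0, -2.0, -3.0, -4.0]
```

With a reward function and discount factor given up front, as here, the algorithm is decades old; the interesting part of the new work is generating the training signal without a hand-designed reward.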
With one exception; the goal state (Score:2, Insightful)
Someone had to tell it what is a solution. If you give it a solved cube, that's assistance. Is it really that hard not to inflate headlines?
Re:With one exception; the goal state (Score:5, Funny)
If you give it a solved cube
And you give it a scrambled cube. The AI shouts, "Hey look! Halley's Comet!" And while you are looking up, it switches them.
Turing test: Passed.
Re: (Score:2)
I did wonder how it decided for itself that the completed cube was somehow "better".
Re: (Score:2)
This is what I was thinking of as well (putting my 37 year old copy of The Handbook of Artificial Intelligence back on the shelf).
Re: (Score:2)
This is what I was thinking of as well (putting my 37 year old copy of The Handbook of Artificial Intelligence back on the shelf).
I think you mean Russell and Norvig... here is a free PDF: Artificial Intelligence: A Modern Approach [lagout.org]
Re: (Score:2)
You couldn't find one older?
Odd definition of "without human help" (Score:5, Insightful)
This algorithm was able to figure out how to solve Rubik's Cube with no help from humans other than humans providing the (simulated) cubes, describing what the solution looks like, and designing an algorithm specific to solving Rubik's Cube?
Color me less than impressed.
Re: (Score:2)
What do you mean? White, red, blue, yellow, orange or green?
Re: (Score:3)
Oh, it still is a nice result. But you are describing exactly the core problem with it: Everything was clear and described in simple, clear statements from the start. That is not how a real-world problem presents itself.
Re: (Score:2)
Everything was clear and described in simple, clear statements from the start. That is not how a real-world problem presents itself.
That's exactly how the problem is described on the box when you buy a Rubik's cube.
Re: (Score:2)
So?
Re: (Score:2)
If you are talking about an algorithm to get from the start to the end, it could be interesting. What I don't see as useful, however, is that solving a Rubik's Cube is simply getting from one pattern to another. Thus, you need to run the algorithm only once to create links from each pattern to the solution state. Even though there are 6 faces, it is still possible to store them all. Why? Because that's how those Rubik's geniuses do it: memorize certain patterns from one state to another. A computer should be able to do it
Re: (Score:3)
The same way rich people "learn" how to become rich.
GOAP? (Score:1)
Sounds like GOAP: goal-oriented action planning. You start at the end state, then perform actions (in reverse) until you get to the current state. I read the article, which doesn't tell you much more about how they did it. It sounds like they brute-forced a bunch of moves to build up a tree, then used A* on the tree. Then they trained a neural net on the brute-forced solutions. They talk about evaluating how close a cube state is to the goal state, but they don't explain how the AI determined that. It
Re: (Score:2)
Sounds like GOAP: goal-oriented action planning. You start at the end state, then perform actions (in reverse) until you get to the current state. I read the article, which doesn't tell you much more about how they did it. It sounds like they brute-forced a bunch of moves to build up a tree, then used A* on the tree. Then they trained a neural net on the brute-forced solutions. They talk about evaluating how close a cube state is to the goal state, but they don't explain how the AI determined that. It sounds like they hard-coded what closeness means, so they're lying when they say the AI worked without human assistance. Without human assistance, they would have needed to use a GA, self-play, or some other learning method to determine what closeness means. The article's "without human assistance" refers to them not hard-coding any move sequences. I consider that a very far stretch of the phrase. I'd call hard-coding moves cheating. If you give it everything it needs, it turns an AI into an algorithm, in my mind.
Here's a link to the paper from the article. I don't have the time to read it right now: https://arxiv.org/pdf/1805.074... [arxiv.org]
Search is the basic AI problem and thus part of AI. And no, they are not using GOAP here (which is just a type of search). It looks a lot more like Value Iteration which is a type of RL (Reinforcement Learning). The point of this research is that it learns to not blindly try random things but learns to search in a more intelligent fashion. But unfortunately for them, their technique was invented decades ago and so just slapping a new name on it doesn't really impress anyone.
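For comparison, here is what plain A* looks like. This is a toy swap-sorting puzzle with a misplaced-count heuristic, both illustrative stand-ins and not anything from the paper; the point is just that A* needs a hand-supplied heuristic, whereas the learned approach tries to produce that guidance itself.

```python
import heapq

# A minimal A* sketch: states are tuples, the goal is sorted order,
# moves swap adjacent elements, and the heuristic counts misplaced
# elements (halved: each swap can fix at most two, so it's admissible).
GOAL = (0, 1, 2, 3)

def neighbors(state):
    for i in range(len(state) - 1):
        s = list(state)
        s[i], s[i + 1] = s[i + 1], s[i]
        yield tuple(s)

def h(state):
    return sum(a != b for a, b in zip(state, GOAL)) // 2

def astar(start):
    frontier = [(h(start), 0, start, [])]  # (priority, cost, state, path)
    seen = set()
    while frontier:
        _, cost, state, path = heapq.heappop(frontier)
        if state == GOAL:
            return path
        if state in seen:
            continue
        seen.add(state)
        for nxt in neighbors(state):
            heapq.heappush(frontier, (cost + 1 + h(nxt), cost + 1, nxt, path + [nxt]))

print(len(astar((3, 2, 1, 0))))  # 6 swaps: the inversion count of the reversal
```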
It just created a look-up table. (Score:2)
Re: (Score:2)
The number of possible cube configurations is too large to make a lookup table practical.
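Indeed. The standard counting argument, as a quick sanity check:

```python
from math import factorial

# Reachable 3x3x3 Rubik's Cube configurations: 8 corners (permutation,
# 3 orientations each with the last determined by the others), 12 edges
# (permutation, 2 orientations each with the last determined), and an
# overall parity constraint linking the two permutations (the // 2).
corners = factorial(8) * 3**7
edges = factorial(12) * 2**11
states = corners * edges // 2
print(states)  # 43252003274489856000, about 4.3 * 10**19
```

At one byte per entry that table would need tens of exabytes, so no, you cannot just store them all.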
Long lost (Score:2)
Brute force method... (Score:1)
Re: (Score:2)
Overall, when starting from nothing, this _is_ the fastest way. I figured this out as a teenager. But it also is a way these "AI" things cannot come up with, because they do not actually have any general intelligence. They cannot think "outside the box" at all. There are still lots of applications for this, but replacing a smart person is not among them.
Re: (Score:2)
I am going to assume you really are APK.
I very, very rarely post as AC and I never sign it when I do. I also never do it to rile anybody up. I most definitely did not write that posting.
There are some deficients here that cannot stand that I can understand things they are incapable of understanding and that I dare to tell them they are stupid when they have written (again) something extremely stupid. Cowardly, dishonorable and utterly pathetic trolling.
Re: (Score:2)
I have had them from time to time, but apparently I am not interesting enough to rate permanent stalkers. Now I will make damned sure never to do any AC postings that resemble this crap; you have my word. Not that I ever intended to do anything like this. It is just completely dishonorable, and to me that counts for something. People sniping from the dark are destroyers of communities and have no place in civilized society.
So if it is AC and claims to be from me, it is not.
Games??!! (Score:2)
If ever you've traveled, you know that Montezuma's Revenge is no game!
Something's not adding up (Score:3)
Either the article writer didn't understand the whitepaper, or the researchers haven't actually done anything novel.
This works because the beginning state and end state of a Rubik's Cube are effectively identical. It's the same number of tiles, in a specific arrangement. As humans, we've defined the "solved" state to be all the tiles color-matched to a side. But the "solved" state could just as arbitrarily be any pattern or arrangement of colors across the cube.
Reversing the simulation to work backwards from the solved state to some specific scrambled state is exactly the same problem as starting from that scrambled state and trying to reach the solved one.
Re: (Score:2)
The summary and article are quite confusing. They started with solved cubes, and scrambled them in different amounts to generate training data so they could train a neural net to give the most likely candidate turns that would solve the cube.
After the network is trained, you can give it a random scrambled cube, and it will do a Monte-Carlo tree search based on the guidance of the neural net.
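That two-phase picture can be caricatured in a few lines. This is a loose illustration only, not the paper's method: a number line stands in for the cube, and a hand-written distance score stands in for the trained network that would guide the real Monte Carlo tree search.

```python
import random

# Sketch of the solve phase: from a scrambled state, run many short
# rollouts, biasing each move choice by a value estimate, and keep the
# rollouts that reach the solved state. The real system uses a trained
# network inside a tree search; here the "network" is just negative
# distance-to-goal on a toy number line.
SOLVED = 0
MOVES = (-1, +1)

def value(state):  # stand-in for the trained value network
    return -abs(state - SOLVED)

def rollout(state, max_len=40, greed=0.8):
    path = []
    for _ in range(max_len):
        if state == SOLVED:
            return path
        if random.random() < greed:  # usually follow the value estimate
            move = max(MOVES, key=lambda m: value(state + m))
        else:                        # sometimes explore
            move = random.choice(MOVES)
        state += move
        path.append(move)
    return None  # rollout failed to reach the goal

random.seed(1)
paths = [p for p in (rollout(9) for _ in range(50)) if p is not None]
print(min(len(p) for p in paths))  # best rollout finds the 9-move optimum
```

The tree search in the actual system is cleverer than independent rollouts (it shares statistics between visits to the same state), but the division of labor is the same: the network ranks moves, the search stitches them into a solution.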
But we already know a shortcut to solve it (Score:1)
HS Trig proofs (Score:1)