Hilarious (and Terrifying?) Ways Algorithms Have Outsmarted Their Creators (popularmechanics.com)

"Robot brains will challenge the fundamental assumptions of how we humans do things," argues Popular Mechanics, noting that age-old truism "that computers will always do literally, exactly what you tell them to." A paper recently published to ArXiv highlights just a handful of incredible and slightly terrifying ways that algorithms think... An AI project which pit programs against each other in games of five-in-a-row Tic-Tac-Toe on an infinitely expansive board surfaced the extremely successful method of requesting moves involving extremely long memory addresses which would crash the opponent's computer and award a win by default...

These amusing stories also reflect the potential for evolutionary algorithms or neural networks to stumble upon solutions to problems that are outside-the-box in dangerous ways. They're a funnier version of the classic AI nightmare where computers tasked with creating peace on Earth decide the most efficient solution is to exterminate the human race. The solution, the paper suggests, is not fear but careful experimentation.

The paper (available as a free download) contains 27 anecdotes, which its authors describe as a "crowd-sourced product of researchers in the fields of artificial life and evolutionary computation." Popular Science adds that "the most amusing examples are clearly ones where algorithms abused bugs in their simulations -- essentially glitches in the Matrix that gave them superpowers."
  • Stupid local minima (Score:5, Interesting)

    by locater16 ( 2326718 ) on Sunday March 25, 2018 @03:07AM (#56321939)
These aren't really that terrifying. We just don't have the GPU power for reinforcement learning like this to search for really out-there solutions to problems at the moment. But they can produce really funny stories like this.

My favorite story is of a bot given the task of moving itself through a maze or somesuch (important part incoming). The programmer decided that the longer the bot spent away from the center of the maze, the more points it would lose (it's trying to optimize for points here). But instead of heading towards the center as fast as possible to maximize points, it just couldn't figure out how to get through. So it sent itself off the virtual edge of the simulation area, ending the run and minimizing its negative score as best it could. By accident, someone created a suicidal bot, yay!

And that is really the extent of "Deep Reinforcement Learning", aka AI that teaches itself to do things today. Sometimes, like with AlphaGo, it works. But a lot of the time it does something stupid.
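A minimal sketch of that failure mode (the toy environment, penalty values, and policies here are all made up for illustration):

# Toy "suicidal bot": with a penalty charged for every step spent away
# from the goal, ending the episode early can score better than playing on.

STEP_PENALTY = -1        # charged each step the agent survives away from the goal
EDGE = 10                # walking past |position| >= EDGE ends the episode

def run_episode(policy, max_steps=100):
    """Total reward for a policy mapping position -> move (+1 or -1)."""
    pos, total = 0, 0
    for _ in range(max_steps):
        pos += policy(pos)
        if abs(pos) >= EDGE:    # fell off the simulation edge: run over
            return total
        total += STEP_PENALTY   # still wandering, still being penalized
    return total

# An agent that dithers and never escapes pays the penalty forever...
print(run_episode(lambda pos: 1 if pos % 2 == 0 else -1))  # -100
# ...while sprinting straight off the edge "wins" with only -9.
print(run_episode(lambda pos: 1))                          # -9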
    • by uvatbc ( 748669 ) on Sunday March 25, 2018 @04:05AM (#56322041)

      But a lot of the time it does something stupid.

      Much like evolution: The algorithms that survive are useful.

      • by Anonymous Coward

        Much like evolution: The algorithms that survive are useful.

No, much like evolution, the algorithms that survive do so by fitting a specific niche; presented with a different environment, they would probably suffer a catastrophic die-off. "Useful" isn't a part of evolution unless you think mere survival is useful.

    • by NoZart ( 961808 ) on Sunday March 25, 2018 @05:05AM (#56322133)

My favorite is the Tetris bot that just presses pause before it loses.

    • by Pembers ( 250842 )

      I wonder if the bot had learned to exploit integer overflow? On many CPUs, if you subtract one from the most negative integer that the machine can store (the one furthest from zero), you get the most positive integer (the one furthest from zero in the other direction). If the bot ran away from the centre of the maze, its score would decrease rapidly, but then would flip from a big negative number to a big positive one.
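A quick sketch of that wraparound, simulated by hand (Python integers don't overflow; in C, signed underflow like this is technically undefined behaviour, but two's-complement hardware wraps this way):

def wrap32(x):
    """Reduce x to a signed 32-bit two's-complement value."""
    return (x + 2**31) % 2**32 - 2**31

INT32_MIN = -2**31
# Most negative minus one flips to most positive: 2147483647
print(wrap32(INT32_MIN - 1))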

      • by mikael ( 484 )

        I know there were some early PC flight simulators that stored the rotation of the aircraft in 16-bit fixed point. Spin fast enough clockwise and your aircraft would end up spinning anti-clockwise at the fastest rate.
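The same arithmetic at 16 bits shows the sign flip (illustrative numbers only, not from any particular simulator):

def wrap16(x):
    """Reduce x to a signed 16-bit two's-complement value."""
    return (x + 2**15) % 2**16 - 2**15

spin_rate = 32000                     # near the 16-bit ceiling of 32767
spin_rate = wrap16(spin_rate + 1000)  # spin a little faster clockwise...
print(spin_rate)                      # -32536: now spinning anti-clockwise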

      • The paper linked is actually pretty interesting and gives some cool examples, including a floating point overflow that managed to zero out any penalties the AI would receive.

Most interesting is the Tic-Tac-Toe AI that won by causing out-of-memory errors in its opponent. All opponents held dynamically expanding boards in memory, so the AI, which encoded moves rather than boards, would just fire off a move with X,Y coordinates in the billions and crash its opponent.

I'm a huge fan of evolutionary algorithms...
        • by Pembers ( 250842 )

I did read the paper (shocking, I know ;-) ). I agree it's well worth a read. I like the way that an evolutionary algorithm can expose requirements and domain knowledge that a human expert would consider too obvious to need to be stated. For instance, the one that was looking for novel ways to arrange carbon atoms into buckyballs and wanted to put all the atoms in the same place. Or the one that was trying to evolve a sorting algorithm and settled on a function that always output an empty list, because nobody had thought to require that the output contain the same elements as the input -- and an empty list is trivially sorted.

      • On at least some versions of Sid Meier's Civilization, Gandhi was given an aggression value of 1. Becoming a Republic drops aggression by 1, becoming a Democracy drops it by 2. Guess what India was like as a democracy.

    • by Kjella ( 173770 ) on Sunday March 25, 2018 @09:26AM (#56322733) Homepage

So it sent itself off the virtual edge of the simulation area, ending the run and minimizing its negative score as best it could. By accident, someone created a suicidal bot, yay! (...) But a lot of the time it does something stupid.

Who did something "stupid"? The bot achieved its goal; it's the programmed goal that completely failed to capture the intended goal. This is basically "the code did what I said, not what I meant" taken to a new level. The problem is that you can't easily inspect a neural network's logic in human terms the way you can trace through code; it's more like another person. I think this is a cat, you think this is a cat, the AI thinks this is a cat, but none of us can quantify exactly what makes it a cat or not a cat. That means the model can break down unexpectedly in ways you can't possibly predict, like you show it a one-eyed cat and suddenly the AI thinks it's a cyclops. And that's going to be a problem as we start relying on AI: this self-driving car thinks you're a pedestrian, until one day, for some inexplicable reason, you don't qualify.

      • When I was working on a learning algorithm to perform a certain task, I had it ground into me that the evaluation function and restrictions on valid solutions are very, very important.

An economist could have told you this would happen. Economists understand incentives. If you incentivize the wrong thing, you will get a bad outcome.
  • by petes_PoV ( 912422 ) on Sunday March 25, 2018 @03:09AM (#56321943)
    ... is already half-answered

And most of the situations described in the referenced article describe poorly framed problems. I understand that it is supposed to be a jokey, light, non-serious read. However, it illustrates the problem of people asking the wrong question, or making incorrect assumptions.

Many years ago, the multi-billion $$$$ utility company I was working for had a team in from [ name removed to protect the stupid ], a well-known consultancy outfit. One of their conclusions was that some of our servers were running with too much idle time - under-utilised, in their opinion. All they had done was collect %idle data from sar (Unix systems from Sun, IBM and HP), and their junior idiot looked at that and decided it was a "problem".

When I was asked about this by the CIO and the "consultants", my response was that I could easily increase the utilisation figure to whatever the CIO desired, or whatever the consultants recommended - how high would he like it to be? Since he knew me, and saw the smile, he saw the trap. I explained that "idle" time and user response time were tightly linked: that reducing one would increase the other. This was news to the "consultants" once I explained the maths and queuing theory behind it.

    • by tlhIngan ( 30335 ) <slashdot&worf,net> on Sunday March 25, 2018 @03:37AM (#56322009)

I explained that "idle" time and user response time were tightly linked: that reducing one would increase the other.

Or, more likely, the AI simply did the very human thing of figuring out the weakness in the measurement system in use and exploiting it - cheating the system, just as a person would eventually figure out how to do. All the examples in there are basically the AI finding a way of cheating the calculations, something humans would figure out as well.

And the reason we have to cheat is that often the thing we want to measure cannot be measured directly. One popular goal-setting framework is "SMART" (specific, measurable, achievable, realistic, time-bound), but there are a lot of things that don't translate into that easily. For example, productivity. Since time immemorial, people have wanted a way to measure programmer productivity, and the most obvious measurement was, well, lines of code. Which did nothing but bloat codebases with needless lines of code. Then people tried bug counts ("I'm going to write myself a new Ferrari", from Dilbert). In the end, there's no way to measure "productivity" except by a proxy measure (something we can measure that hopefully relates to the quantity we wish we could measure directly), so we implement those measurements. But then people find shortcuts - ways to increase the thing the proxy measures without increasing actual expended effort.

      Take another example - say my goal is to make my blog more popular. Well, how do I measure popularity? Visitors per month? Comments per month? A little sensational click-bait bit of fake news will boost both numbers easily enough. But did I accomplish the goal, or did I simply game the system?

All AI has done is expose these limitations in our proxy measurements and exploit them.

      • by ceoyoyo ( 59147 )

        There's a joke in Eve Online that however careful the developers are, the players will very quickly figure out how to break any new game mechanic or balancing.

        When you start asking algorithms to learn their own solutions, you frequently get solutions that exploit bugs in your simulation. Just like if you give a flawed game to a bunch of people.

        • by mikael ( 484 )

That was like playing Archon. Using the right moves, you could win the game within five or six moves rather than slogging it out: grab the middle power squares at the edge, wait one cycle, and then grab the square at the far end.

Playing Command and Conquer, there was one level where the enemy had airplane factories on top of cliffs at the top of the screen. In between there were whole battalions of tanks and rocket launchers. The quick way to win was to get some construction droids, build a wall, then...

    • Is this the "they're running 95% idle 95% of the time so that the other 5% of the time they're able to run at all" principle or am I missing something more subtle?

The overriding design requirement was to meet user response time targets. This was for thousands of users, with well-defined peak periods, both daily and weekly. Performance was measured in terms of user-perceived "happiness" - responsiveness.

From an I.T. perspective, the response time was the sum of all the wait time (network latency, queuing time) and the processing time. This approximated an M/M/1 model.

        HTH
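For anyone curious, here's a back-of-the-envelope M/M/1 check (the service rate is arbitrary; the shape of the curve is the point) showing why squeezing out idle time destroys response time:

mu = 100.0                             # service rate: requests/sec the server handles
for utilization in (0.50, 0.80, 0.90, 0.95, 0.99):
    lam = utilization * mu             # arrival rate
    response_ms = 1000.0 / (mu - lam)  # M/M/1 mean response time = 1/(mu - lambda)
    print(f"{utilization:.0%} busy -> {response_ms:6.1f} ms mean response")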

It is true that the first section of the paper describes poorly framed problems, but that's not all that is there. The "Exceeding Expectations" section has a whole list of cases where systems found solutions better than the known real-world ones, by exploiting unexpected aspects of actual physics or other parts of the setup.
    • a team from [ name removed to protect the stupid ] a well-known consultancy outfit

If the consultancy was really, really expensive, the name must have started with an A...

    • And most of the situations described in the reference article describe poorly framed problems. I understand that it is supposed to be a jokey, light, non-serious, read. However it illustrates the problem with people asking the wrong question, or making incorrect assumptions.

I don't know if that's an entirely fair assessment. Part of the issue here is that it's not easy (perhaps impossible) to frame problems so well that there's no possibility of finding an unintended solution. For any of the human problems that we want solutions to, there may be some mathematically viable solution that violates other practical considerations. It may "solve the problem" without actually solving the problem, and in fact make things worse.

And I don't think that idea is unimportant. It...

    • by Pembers ( 250842 ) on Sunday March 25, 2018 @09:40AM (#56322771) Homepage

      That reminds me of an anecdote that one of my university lecturers told, about one of the first computers with programmable microcode. Someone ran a profiler on it and noticed that it was spending a lot of time executing a particular sequence of four machine language instructions. They decided to create a new instruction that would do the same thing as this sequence, but would be faster and need less memory.

      So they did this, and modified the compiler so that it knew about the new instruction, and recompiled all the software that ran on the machine... and it was no faster than before.

      That four-instruction sequence? It was the operating system's idle loop.

    • by tomhath ( 637240 ) on Sunday March 25, 2018 @09:56AM (#56322833)
A friend of a friend got a part-time job loading coin-op candy machines. Rather than being paid by the hour, he was paid by the number of machines on his route; working fast or slow didn't matter. It didn't take him long to realize that the popular candy bars were the first to go and took the most time to restock. But one brand, the "Zero Bar", was distinctly unpopular. Before long, he had filled all the machines with Zero Bars and was able to keep the machines full with virtually no effort.
      • by mikael ( 484 )

        There was one high school teacher who would reward the student with the highest improvement in grades with a candy bar. One kid figured out that he just needed to fail the weekly test one week and get a high grade the next week.

  • by mentil ( 1748130 ) on Sunday March 25, 2018 @03:12AM (#56321949)

If an evolutionary algorithm is pitted against real life and 'outsmarts' it, that's one measure of evolutionary progress. The real issue is the same as in 'teaching to the test', or even the 'Kobayashi Maru solution': the metrics are gamed once the one being tested realizes what they are, and then the metrics no longer hold meaning [wikipedia.org].
Replace 'metrics' with 'simulation parameters' and it's the same thing. The simulation has to be as intelligent as the uncontrolled agents operating inside of it, or else these types of things will happen. Self-modifying simulations, perhaps?

  • "The solution, the paper suggests, is not fear but careful experimentation."

Do you know what's complete anathema to that ethos? The drive of money. If we don't want to go down that road, then rampant greed needs to be dealt with first; otherwise it's going to be a race to see who's first to get something exciting to market.

  • by Anonymous Coward

    We just need to know how to ask them to do what we really want.
    If the simulations are inaccurate representations of the problems we want to solve, the answers given by the AI will be inaccurate.
    Hitchhiker's Guide to the Galaxy already touched on this problem.
    If you don't understand the question, the answer will be meaningless.

It's often less that the simulations are inaccurate representations of reality than that the few metrics being optimized for are not representative of reality. The deep learning algorithms have no idea what is going on; they simply make numerous small changes, rate which variant scores better, keep the better-performing version, and repeat the process. Though as you said, the computer is still doing exactly what you ask - it's just the how that is surprising.
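A bare-bones sketch of that loop (the target function and mutation scale are made up; note the algorithm only ever sees the score, never "what's going on"):

import random

def score(params):
    # Proxy metric: whatever this rewards is what the loop will produce,
    # intended or not. Here the (made-up) optimum is all parameters at 3.0.
    return -sum((p - 3.0) ** 2 for p in params)

best = [random.uniform(-10, 10) for _ in range(4)]
for _ in range(5000):
    candidate = [p + random.gauss(0, 0.1) for p in best]  # small random change
    if score(candidate) > score(best):                    # keep whichever rates better
        best = candidate
print([round(p, 2) for p in best])  # drifts towards [3.0, 3.0, 3.0, 3.0]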
    • by mikael ( 484 )

It's like that researcher who tried to design an electronic circuit to differentiate between two different frequencies. The idea was that there would be two oscillator circuits that would resonate when the input matched their frequency. Using genetic algorithms, different electronic circuits would evolve. They reached a point where some wires were missing but the circuit still worked: it had taken advantage of the electromagnetic interference simulation component and avoided the need for the wires.

  • "the only winning move is not to play."

    Fiction, I know, but not an unreasonable conclusion in that 'game.'

  • by Ed Tice ( 3732157 ) on Sunday March 25, 2018 @07:28AM (#56322359)
    A human player, if presented with this, would ask "what if it doesn't work?" If I try a trick and it fails (other system doesn't crash), now I'm in a much worse place than if I had just made a reasonable move. Unless the situation is desperately hopeless, the intelligent player wouldn't even try. This is a basic problem with any "hill climbing" algorithm.
  • in games of five-in-a-row Tic-Tac-Toe on an infinitely expansive board surfaced the extremely successful method of requesting moves involving extremely long memory addresses which would crash the opponent's computer and award a win by default

    Finally, we can automate politicians!

    • by mikael ( 484 )

They just kick possible but unpopular solutions as far into the future as they can. By then either the other party is in power and has to deal with the problem, or the problem has resolved itself.

The first 2 authors are from "Uber AI Labs"...
  • The very word "outsmarted" has a cliche factor of 98.9%

    Pretty much the whole shit-show is better covered by the old, not-the-least-bit-tired adage "everything that can go wrong, will go wrong".

    The other 1.1% of "outsmarted" is brilliantly covered by Foghorn Leghorn Teaching the Art of Baseball [youtube.com].

  • A few years ago there was a contest to see which robot could shoot the most balls into a target in a given amount of time.

The winning robot simply covered up its opponent's target, then leisurely put one shot into its own.

    They were forced to give the prize to the "cheating" robot since it had fulfilled the letter of the victory condition.

    I did a quick search and couldn't find a reference to the article, sorry!
