Follow Slashdot blog updates by subscribing to our blog RSS feed

 



Forgot your password?
typodupeerror
×
Math

Computer Scientists Break Traveling Salesperson Record (quantamagazine.org) 72

After 44 years, there's finally a better way to find approximate solutions to the notoriously difficult traveling salesperson problem. From a report: When Nathan Klein started graduate school two years ago, his advisers proposed a modest plan: to work together on one of the most famous, long-standing problems in theoretical computer science. Even if they didn't manage to solve it, they figured, Klein would learn a lot in the process. He went along with the idea. "I didn't know to be intimidated," he said. "I was just a first-year grad student -- I don't know what's going on." Now, in a paper posted online in July, Klein and his advisers at the University of Washington, Anna Karlin and Shayan Oveis Gharan, have finally achieved a goal computer scientists have pursued for nearly half a century: a better way to find approximate solutions to the traveling salesperson problem. This optimization problem, which seeks the shortest (or least expensive) round trip through a collection of cities, has applications ranging from DNA sequencing to ride-sharing logistics. Over the decades, it has inspired many of the most fundamental advances in computer science, helping to illuminate the power of techniques such as linear programming. But researchers have yet to fully explore its possibilities -- and not for want of trying. The traveling salesperson problem "isn't a problem, it's an addiction," as Christos Papadimitriou, a leading expert in computational complexity, is fond of saying.

Most computer scientists believe that there is no algorithm that can efficiently find the best solutions for all possible combinations of cities. But in 1976, Nicos Christofides came up with an algorithm that efficiently finds approximate solutions -- round trips that are at most 50% longer than the best round trip. At the time, computer scientists expected that someone would soon improve on Christofides' simple algorithm and come closer to the true solution. But the anticipated progress did not arrive. "A lot of people spent countless hours trying to improve this result," said Amin Saberi of Stanford University. Now Karlin, Klein and Oveis Gharan have proved that an algorithm devised a decade ago beats Christofides' 50% factor, though they were only able to subtract 0.2 billionth of a trillionth of a trillionth of a percent. Yet this minuscule improvement breaks through both a theoretical logjam and a psychological one. Researchers hope that it will open the floodgates to further improvements.

This discussion has been archived. No new comments can be posted.

Computer Scientists Break Traveling Salesperson Record

Comments Filter:
  • by fermion ( 181285 ) on Thursday October 08, 2020 @04:32PM (#60586466) Homepage Journal
    This is a generalization of the seven bridges problem. It is a crucial part of graph theory and topology. So the problem is one of the bases of mathematics and everything. Like Google search algorithm. It has been around since at least the mid 18th century when Euler tackled it.
    • I don't understand with the power of modern computers today (and quantum computers in the near future), why not brute-force the solution by trying each possible route, and computing the distance (cost)? Seems like a job specifically engineered for modern CPUs.
      • Re:Seven bridges (Score:5, Informative)

        by Entrope ( 68843 ) on Thursday October 08, 2020 @04:54PM (#60586528) Homepage

        The brute-force search space grows geometrically as you add points. It is easy to solve three- or four-point TSP instances by hand. A computer can easily solve 20-point TSP problems. However, 100-point TSP problems involve considering almost 1e158 sequences of the 100 points. That is impractical to brute-force with today's computers.

        • Wouldn't it be possible to cache the distance/time for combinations of route segments, speeding future calculations that use that same combination of segments?

          • by Entrope ( 68843 )

            You can save some recomputation by doing that, although storage space grows very quickly (ten-point segments would need 100*99*98*...*91 storage cells). Depending on cache effects, it may be better to recalculate the distances from native coordinates or a 100-by-99 table of single-hop distances.

            It's also hard to prune the search space much that way; naively, you would still have to check all 10-to-the-158th paths, but each path could be some constant factor faster. At best, you can exclude some prefixes b

      • Re:Seven bridges (Score:5, Informative)

        by ShanghaiBill ( 739463 ) on Thursday October 08, 2020 @04:57PM (#60586540)

        why not brute-force the solution by trying each possible route

        The number of possible routes is the factorial of the number of cities.

        So if your brute-force algorithm can solve TSP for 20 cities, you need a computer 21 times as powerful to solve for 21 cities, and a computer 462 times as powerful to solve for 22 cities.

        There are heuristics for pruning the search tree (for instance, no path should ever cross another). But you are still facing exponential time complexity.

        • Important with, say, DNA research where you're trying to reassemble or search a jigsaw with 30 million pieces, especially for ancient DNA where some of the pieces are missing and you still have to get the pieces in the right place.

          I'm sure that's not even remotely the worst problem.

      • by jrumney ( 197329 )

        Brute forcing is not a scalable algorithm.You want an algorithm that for an arbitrary number of "locations" tells you an order to visit them that will save more time than it takes to work it out. For a typical salesman, their weekly or even monthly schedule can probably be brute forced reasonably quickly, but the DNA sequencing issue (I'm not familiar with it, but it sounds like an interesting application) is probably beyond that.

      • by raymorris ( 2726007 ) on Thursday October 08, 2020 @05:04PM (#60586558) Journal

        If checking for 10 nodes takes 1 second, checking for 11 nodes will take 10 seconds, 12 nodes 110 seconds, 13 nodes 1320 seconds, 14 nodes 17,160 seconds. Each time you ADD 1 node, you MULTIPLY the time by the number of existing nodes.

        So the time to check 100 nodes is a 156-digit number.

      • by andot ( 714926 )
        Short answer. No. You can't. https://en.wikipedia.org/wiki/... [wikipedia.org]
      • Re:Seven bridges (Score:5, Informative)

        by toutankh ( 1544253 ) on Friday October 09, 2020 @06:55AM (#60587834)

        I don't understand with the power of modern computers today (and quantum computers in the near future), why not brute-force the solution by trying each possible route, and computing the distance (cost)? Seems like a job specifically engineered for modern CPUs.

        Hi, I'm a researcher working on related problems. As it has been mentioned in other comments, the number of solutions for n cities is in the order of n!. If you fix one city as starting point (which you can do, it is called symmetry-breaking as it allows you to discard many equivalent solutions) then the number of solutions is (n-1)! and if distances are symmetric then you can divide this number by two.

        A few years ago I wrote such an enumeration program in Python to show my students that enumeration is a bad idea. This program could enumerate about 2 million solutions per seconds. For 100 cities, we need to enumerate 100! solutions, which would take about 1.48e142 years, which is 132 orders of magnitude larger than the age of the universe. Of course you could write this in C++ instead, it would likely be 100 times faster, so it would only take 1.48e140 years. You could also write a better approximation algorithm that goes 100 times faster, and take a computer that is a million times faster, and parallelise this to a million such cores, and then it would only take 1.48e126 years :)

        So enumeration is not the way to go. There are smarter ideas to solve the TSP exactly, the most efficient TSP solver that is publicly know is based on linear programming techniques: https://www.math.uwaterloo.ca/... [uwaterloo.ca]
        It has been used to solve instances up to 85,900 cities. It does use parallelism but that is just one thing it does and not the one with the largest impact.

        There are a bunch of techniques for solving highly combinatorial problems exactly, for instance linear programming, constraint programming, dynamic programming. To the best of my knowledge all of them basically consist in doing improved enumeration one way or another, i.e. proving that a whole bunch of solutions can be ignored. Those that can't be proved safe to ignore have to be enumerated.

        Then there are also approximation algorithms, they produce a solution which cannot be proved to be optimal. For the TSP, one very good approximation algorithm is the Lin-Kernighan algorithm, especially this implementation: http://webhotel4.ruc.dk/~keld/... [webhotel4.ruc.dk]

        Among approximation algorithms, some of them have a performance guarantee, i.e. "this algorithm will always produce a solution that is at most k% worse than the optimum". The new research result mentioned here is about such an algorithm. Previously the best known k% for the TSP was 50%:

        "Now Karlin, Klein and Oveis Gharan have proved that an algorithm devised a decade ago beats Christofides’ 50% factor, though they were only able to subtract 0.2 billionth of a trillionth of a trillionth of a percent. ".

        So a very small improvement. However some people previously believed that there was a theoretical limitation to anything better than 50% and now we know that is not true.

    • by pjt33 ( 739471 )

      Not really. The seven bridges problem asks about the existence of an Eulerian path, not a Hamiltonian path.

      • Thank you. In the Königsberg seven bridges problem you can visit points/vertices more than once but you can't use a path multiple times. In the travelling salesman problem, you're supposed to visit every vertex only once and not repeat paths. Also, the seven bridges problem is not looking for an optimal answer but whether it is possible.
  • Use a biocomputer. It's not fast, but it's efficient and gives a highly optimal result:

    https://thenewstack.io/amoeba-... [thenewstack.io]

    • it's like the O(n) method of sorting, which I first heard in Scientific American ages ago. You git a strand of spaghetti for each number, each cut to the length of the number. Gather the strands into a cylindrical clump, then let gravity sort them, maybe with a couple light taps on the table. Then you pull out the longest strand, then the next longest, and so forth. It works, but... the constants involved are huged. So, O(n) to cut the strands, but the amount of time to do this for each strand is very

      • by Anonymous Coward

        it's like the O(n) method of sorting, which I first heard in Scientific American ages ago. You git a strand of spaghetti for each number, each cut to the length of the number. Gather the strands into a cylindrical clump, then let gravity sort them, maybe with a couple light taps on the table. Then you pull out the longest strand, then the next longest, and so forth. It works, but... the constants involved are huged. So, O(n) to cut the strands, but the amount of time to do this for each strand is very large, and the amount of time to measure each strand as you pull it out is very long. So it's much faster to just let a computer do a dumb-as-rocks bubble-sort.

        Time is not the only constraint. Some algorithms scale on how much memory they need. Spaghetti sort becomes impossible pretty quickly for very small values of n.

  • CS Contest (Score:4, Interesting)

    by Dan East ( 318230 ) on Thursday October 08, 2020 @04:39PM (#60586488) Journal

    A couple decades ago while in school I participated in one of the ACM programming contests. They had a dozen or so problems that you would write software to solve and submit within a few hour time period. As with many such things, it was impossible to actually complete every problem. One of them was... the traveling salesman problem. It wasn't stated quite as obviously, but that's exactly what it was. I think the real challenge with that one was for the student to recognize it was an unsolved problem in CS and NOT to attempt it in a two hour programming contest.

    Or maybe they were hoping some genius would accidentally solve it lol.

    • by DavenH ( 1065780 )
      TSP is quite solvable with the right constraints -- any idea what they were for the contest? It's intractible in the general case when n is high (say > 20), but depending on the problem you can use heuristics to eliminate many possibilities. For example, if there's spatial clustering, then it's clear the optimal solution will not oscillate between clusters rather than solve a local TSP within the cluster.
      • If the goal was simply to solve the problem, I would just do a random walk until all the cities had been visited. Sometimes "optimal" is not needed.

        • by Wolfrider ( 856 )

          --The best solution to the TSP problem may be to parallelize and send multiple salesmen out instead of just one. ;-) One salesman visits the 2 closest cities while another salesman gets another 2 at the same time...

    • by pz ( 113803 )

      Or maybe they were hoping some genius would accidentally solve it lol.

      Prof. Patrick Winston routinely asked, at the start of each term he taught the undergraduate Intro to AI course at MIT (6.034), "can anyone define intelligence?," with the hope that some genius-level student (this was MIT, after all) would actually be able to do it. To the best of my knowledge, none ever did, but he kept asking.

    • Solving it isn't that tough given a reasonably small number of vertices. The open problem isn't to solve it, but to do so efficiently. Asking students to write an algorithm to solve the problem, where one isn't expecting some amazingly good big-Oh constraint on time is completely reasonable.
    • No, the problem is already solved. What is not solved is how to do this efficiently. Doing it brute force to find ANY route is doable even if it's the most inefficient route. Search all neighbors recursively, backtrack when there are no unvisited neighbors. The answer might mean flying from NYC to LA back to Newark then to Chicago, etc.

    • by tlhIngan ( 30335 )

      A couple decades ago while in school I participated in one of the ACM programming contests. They had a dozen or so problems that you would write software to solve and submit within a few hour time period. As with many such things, it was impossible to actually complete every problem. One of them was... the traveling salesman problem. It wasn't stated quite as obviously, but that's exactly what it was. I think the real challenge with that one was for the student to recognize it was an unsolved problem in CS

  • Comment removed based on user account deletion
    • Yeah, I'm pretty stoked that computer scientists can now work out an effective route for my sales trip around the country that will shave 2.3 mm off the entire journey.

  • "they were only able to subtract 0.2 billionth of a trillionth of a trillionth of a percent"

    So why exactly is such a miniscule improvement actually interesting? They handwave over the very small improvement by saying "computer scientists hope this breakthrough will inspire rapid further progress" and "Yet this minuscule improvement breaks through both a theoretical logjam and a psychological one. Researchers hope that it will open the floodgates to further improvements."

    But why do they hope that? It's cool

    • by Ungrounded Lightning ( 62228 ) on Thursday October 08, 2020 @07:20PM (#60586896) Journal

      So why exactly is such a miniscule improvement actually interesting?

      Because when researchers think a part of the solution space has a hard limit they spend most of their time looking elsewhere. Any break past the limit, no matter how small, tells them that the limit was to some extent an illusion, that there ARE ways around or through it. Maybe some of them are substantial, rather than a token scratch. So mow that they know there might be a breakthrough here, researchers will then spend time looking for it.

    • by godrik ( 1287354 ) on Thursday October 08, 2020 @10:06PM (#60587152)

      Yes it is actually interesting. Or at least, it is potentially important.

      For lots of problem we think that there is no way to do better than some threshold. And once that threshold is broken it usually means we understand something we did not understand before and that can lead to a new wave of successive improvement on the problem

      3D packing was such a problem about 20 years ago. We did not believe we could find a constant factor approximation algorithm. (An algorithm is a X-approximation algorithm if it is guaranteed to return a solution of quality always smaller than X*optimal.) For 3D packing we eventually found an absurd 1000 approximation algorithm. The result was absurd practically but that enabled people to think about the problem using new methods, eventually we developed a 12-approximation, then 11, then 9, then 6+epsilon. All within the span of a few years.

      So the hope here is probably the same. For 30 years, we thought we could not do better than christofides' algorithm. The fact that we can do better means we have understood something we did not understand before. I do more scheduling and packing algorithm work than I do routing problems; so I haven't read the paper yet. Probably will read it this year though.

      It is also possible the technique employed on TSP could port to other path optimization problem like vehicle routing, or some restricted version of TSP such as TSP on planar graphs that are still quite applicable in practice. Or the technique could be generic enough to apply to a ton of problems. When Shmoys figured out dual approximation, the first result (Shmoys and Lawler) was nice but not super impressive. But Shmoys and Hochbaum refiend the technique and ported to so many problems packing, partitioning, many scheduling problems, many multi objective scheduling problems, (actually about half of multi objective approximation results trace back to Shmoys)

      So will the insight in this paper turn out to cascade through? I don't know. But it is the kind of result we could look at 20 years from now and say "that's when the magic happened". Or it will be a footnote in a textbook.

      • It doesn't sound like that in this case. Often times, a new approach is posted, with a very large number, and proficient mathematicians do as you say, optimize the most and start proving for smaller numbers. In this case, I bet $1 this is nothing more than a suggestion by the author's shrink or the need to justify wasting so much time and money on achieving nothing at all.

    • by samkass ( 174571 )

      What if some physicists had observed something going just a billionth of a billionth of a percent above the speed of light in a vacuum? This isn't exactly the same, as it's not a fundamental limit of the universe, but the point is that sometimes just crossing a threshold, no matter how little, implies something unique has occurred and is worthy of additional study.

  • This is certainly good news for salesmen.
  • 42...women weren’t recruited for sales forces because someone had to open the door for who’s perpetually on its wrong side...
  • by Daetrin ( 576516 ) on Thursday October 08, 2020 @06:35PM (#60586790)
    I first heard this song [youtube.com] in college a couple decades ago in Algorithms class.

    Mathematically the "longest path" problem is functionally equivalent to the "traveling salesman" problem. (Technically i believe that one is a special case of the other?) The phrase "longest path" was presumably just easier to use for a parody.
  • I get a bigger runtime speedup by removing a comment line before compilation

  • I worked on it when I heard about it, and I think about it sometimes. The infuriating thing is the retarded redundancy of the factorial solution, the same things computed again and again, but no clear way to cache it.

    I found a pretty good approximate: make random path, for each subpath aBc where a, c are cities and B is a subpath of two or more cities, reverse B if its shorter between a and c. I started with two cities for B and once no changes, went to 3 and so on. Super fast stupid simple.

  • So "saying man == sexist hate"? Or what?

    Had it been saleswoman, I'd also defended that, just as much. Duh. But thanks for implying I would not. --.--

    There definitely is nothing as prejudiced and hateful on this planet, as p.c. SJWs.
    They just managed to leverage the victim role, sheepskin and crocodile tears, to maximize their abusiveness. While having the dumb and blind masses side with *them*.

  • Door to door sales is a bad career move these days. Door to door computer science is the way of the future.
  • To get a more comprehensible representation of that percentage, my son calculated (I didn't verify) that that is equivalent to just under a savings of 1 nanometer over a light year -- or 2 mm over the trip to the Andromeda galaxy.

  • As a practicing computer scientist, I was always impressed with colleagues who worked hard on problems like this and like data compression, often obtaining improvements from a few percent to fractions of a percent. I was much more interested in finding widely used implementations that were so bad that I could get orders of magnitude. My best result was in a reverse symbol table lookup, turning addresses into names and offsets. In a typical 8000 entry (large for the day) symbol table, a linear search would

UNIX was half a billion (500000000) seconds old on Tue Nov 5 00:53:20 1985 GMT (measuring since the time(2) epoch). -- Andy Tannenbaum

Working...