New Algorithm Provides Huge Speedups For Optimization Problems (mit.edu) 129
An anonymous reader writes: MIT graduate students have developed a new "cutting-plane" algorithm, a general-purpose method for solving optimization problems. They've also developed a new way to apply their algorithm to specific problems, yielding orders-of-magnitude efficiency gains. Optimization problems aim to find the best set of values for a group of disparate parameters. For example, the cost function for designing a new smartphone would reward battery life, speed, and durability while penalizing thickness, cost, and overheating. Finding the optimal arrangement of values is a difficult problem, but the new algorithm shaves a significant number of operations (PDF) off those calculations. Satoru Iwata, professor of mathematical informatics at the University of Tokyo, said, "This is indeed an astonishing paper. For this problem, the running time bounds derived with the aid of discrete geometry and combinatorial techniques are by far better than what I could imagine."
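As a rough illustration of the kind of cost function the summary describes (the attribute names and weights below are invented for this sketch, not taken from the article):

```python
# Illustrative only: a weighted cost function over disparate smartphone parameters.
# The attributes and weights are made-up assumptions, not from the paper.
def phone_cost(battery_hours, speed_ghz, durability, thickness_cm, price_usd, peak_temp_c):
    reward  = 3.0 * battery_hours + 50.0 * speed_ghz + 10.0 * durability
    penalty = 200.0 * thickness_cm + 0.1 * price_usd + 2.0 * peak_temp_c
    return penalty - reward   # an optimizer searches for parameter values minimizing this

print(phone_cost(battery_hours=20, speed_ghz=2.4, durability=8,
                 thickness_cm=0.7, price_usd=699, peak_temp_c=38))
```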
Whoah.. (Score:3)
Satoru Iwata, professor of mathematical informatics at the University of Tokyo, said, "This is indeed an astonishing paper.
I'm assuming this isn't the same Satoru Iwata of Nintendo fame... who sadly passed away earlier this year. :/
Re:Whoah.. (Score:5, Funny)
You never know; I'm sure he had plenty of 1UPs.
Oh... (Score:3)
And as it turns out, P=NP. Stay tuned for more exciting developments!
Re: (Score:1)
Math is hard.
Re:Oh... (Score:5, Funny)
Math is hard.
Or, more accurately, math is believed to be hard.
Re: (Score:3)
Re:Oh... (Score:4, Interesting)
Why, I oughtta!
Anyhow, I'm checking out the paper. Well, I will be after snuggle-time. Yay! A use-case for a tablet. I've done a bit of work in this area, though; specifically, modeling traffic - traffic optimization *is* hard. Throughput, where humans are concerned, comes close to an attempt at modeling chaos (and for those who think perfection is possible, I welcome their insights).
I've yet to read it, but I suspect it's not a huge advance so much as it builds on prior modeling work. (Not all work gets published, for better or worse, and some remains a trade secret.) My guess (and this is just from a quick look, absolutely not to be taken as an authoritative statement) is that this is further optimization and refinement, and it appears like it may be a good step forward.
I'm not sure that I agree with the phone analogy but, well, okay... We're not children.
It looks like, if you've got highway traffic going to A, B, and C and you have exit/merge nodes 1, 2, and 3, then you can take the current values and include the street traffic - as well as time-of-day data, up to and including the myriad traffic uses - and then more accurately predict whether a new merge at 1.5 will increase traffic going from A to B while decreasing overall traffic (or increasing it) from B to C, and whether it will actually reduce surface-street traffic. It also looks like you can use this to predict whether an exit at 1.5 will decrease overall time for, say, delivery vehicles that need to reach surface streets, or whether they're better shoveled off to a new exit at 2.5 or left to exit at 3. You can also make predictions (never certainties) for throughput from A to C and A to B, and for which exits or merges are more likely to congest or reduce traffic on the highway, as well as optimize for certain functions such as the aforementioned delivery trucks.
Basically, in programmer terms, the more subroutines you can have the more accurately you can refine your answers and the better the results *can* be. Of course this adds complexity and computational difficulty. This algorithm optimizes existing work by allowing you to use more subroutines more efficiently - perhaps think of it like calling a library or OOP (I guess?). As you were a teacher, it's akin to being able to (more easily) tailor an individualized lesson plan for each student who may, or may not, excel and then, at the end, they'd all be able to pass their test which may, or may not, be also tailored for them. Basically, you'd be able to blab a whole bunch of data at them and they'd then get the appropriate information, tailored to their needs, and may output the correct answers on the test.
This doesn't do anything (from what I see so far) to actually ensure you're inserting good data, of course. I am obviously stretching the analogies quite a bit, but math is hard. Well, no... It's not so much that it's hard; rather, most people learned it by rote and not conceptually. Some idiot decided to make word problems; not a bad idea. Some idiot came along after them and told them to take the word problems apart and to ignore the words. Nobody actually showed them the concepts of the maths themselves and why the answers are the way they are. I could type a novella on the subject... Complete with examples! Nobody would listen/read it and I don't blame them.
In short, this is "just" improving on existing models and appears to be doing so by making various inputs more accessible or, more accurately, more streamlined. Again, a very cursory scan was done and I could be way off. Another quick look makes me think I'm correct. I really need to read it to be certain.
'Snot hard. *nods* Looks like good work from a quick glimpse. It'd be a shame to see it wasted on optimizing phones. ;-) Another quick peek does, indeed, note that they're claiming improvements on current models. Note: This was written in a few sporadic posts and the paragraphs are probably horrifically out of order. If they don't make sense then swap 'em around a bit until they
Re: (Score:2, Interesting)
Why, I oughtta!
Anyhow, I'm checking out the paper. Well, I will be after snuggle-time. Yay! A use-case for a tablet. I've done a bit of work in this area, though; specifically, modeling traffic - traffic optimization *is* hard. Throughput, where humans are concerned, comes close to an attempt at modeling chaos (and for those who think perfection is possible, I welcome their insights).
For controlled intersections, ACTUALLY control them.
Stop signs, traffic signals, etc. are all suggestions in the eyes of your typical driver.
If a yellow light meant "we're raising the fucking spiked pylon barrier" there would be a lot fewer issues.
It won't be perfect (we still have people who try to beat trains and drive round/through the arms as they're lowering) but it would be a lot better, especially after weeding out the retards.
Once you control behavior, you can then look at optimizing flows of traffic.
Re:Oh... (Score:4, Informative)
Great, in theory. Then you end up with some drunk guy, driving backwards, on a one way street. As someone who made a career of modeling traffic, well, if we had intelligent drivers then I'd not have been able to retire at 50 after selling my company. Some dumb ass will ram the pylons and actually increase the overall time needed because emergency crews will be on-scene and the vehicle will be disabled. YOU might obey the rules but, well, you know how far that gets you.
Also, in an ideal world? We'd not need one single signal light nor stop sign. Yup. It could be done with ease. The only flaw in my world-overthrowing-plan is that humans are inherently stupid and more so when they're behind the wheel of an automobile. Trust me on this - I dare say that this is the one subject where I can speak authoritatively. Coming up with, and believing in, an ideal system that relies on the intelligence of the operators is as doomed as any political ideology that follows the same principles - meaning you're gonna need to be really draconian and authoritarian.
In short, we could just take the cars away. That'd make about as much sense as any other solution. Which, by the way, is me agreeing with you. It won't be perfect. No, it won't. However, your idea may actually mean that the overall throughput slows down and doesn't return to normal or result in an increased rate. It takes, on average, as long as THREE YEARS for them to acclimate to a new traffic pattern. Some, like the 'Magic Roundabouts' in the UK are just skipped by the locals who simply avoid them rather than adapt.
For every model of improvements you make, we'll invent a newer and dumber human. So, no... It won't be perfect. The pylons might be a stretch too far. Some places have actually opted to remove the tire puncture strips for those who go the wrong way on a one way route. Why? Idiots ended up holding up traffic. Sure, it had the desired result but there are more idiots. See the UK where they put in the pylons that rise up from the street, for example, and then lower only for certain vehicles with an attached sensor. People do exactly what you think they'll do - on a *very* regular basis, some of them even picking up speed to try to beat it. They don't. They do it again. And again... And again... And again...
Pylons probably aren't the solution - drivers education and mandatory re-licensing with strict adherence to testing standards might go somewhere but that's politically infeasible in my country.
Re:Eliminating traffic problems (Score:1)
Re: (Score:2)
You know what actually works well? Cloverleafs. Expensive, though.
Me I want to redesign *everything*.
I'd like to build a town where dwellings and stores are up on pylons like mushroom heads; that allows a huge amount of ground to be devoted to nature as compared to dwellings on the ground. I want them separated from neighboring dwellings by at least 50 feet of open space.
I want transport between/along the dwellings to be a dual purpose line where one half is a monorail with community cars, and one half is a
Re: (Score:1)
You have it upside down. If all the 'stuff' is elevated, the ground winds up being dead because it doesn't get enough sun.
Re: (Score:1)
Re: (Score:2)
This is one thing that engineering cannot, currently, completely resolve. It is my opinion that we need more educated drivers. With educated drivers, for instance, we'd never need a single stop sign.
Re: (Score:2)
Self-driving cars.
Not a viable solution at this time.
Enough said.
No, you could just as well have said "magic fairy dust" and been almost as accurate. They'll be here, eventually. It won't be like you're expecting nor will it be in the time frame you're expecting. If you disagree, find an escrow service and put up some numbers and I'll be willing to make a large bet with you. Let's play a game, shall we?
I'll give you 5:1 odds for 10 years, following today's date, that the percentage of fully autonomous private transport vehicles on the h
Re: (Score:2)
Hang on, what is the current vehicle replacement rate? If you start right now and every new vehicle is fully autonomous, what fraction of the total national vehicle fleet could you expect to have converted in only 10 years?
Re: Oh... (Score:1)
Re: (Score:2)
You sound like a person who might not have heard the notion that yellow means "clear the intersection" but even if you aren't, I'm posting this here on the off-chance it might help someone who encounters this later.
Okay, sorry to tell you, but that makes no sense. Anyone who would be able (or rather need) to "clear the intersection" shouldn't be able to see the yellow light (well, there may be some intersections with the lights on the exit side), and if you can see the yellow light, it shouldn't be used as an excuse to enter the intersection just so you can then clear it.
Re: (Score:1)
Re: (Score:2)
Traffic flow would be analogous to fluid flow, but fluid isn't stupid.
Sure, if you define "not stupid" as being satisfied with the right amount of cars arriving at a destination, instead of bringing specific cars from source to destination.
Re:Oh... (Score:4, Insightful)
Re: (Score:2)
You are, in fact, preaching to the choir, yes. Some days, I hope we do go extinct before we ruin it for the truly enlightened beings that follow us. I could go on, oh, I could... As tempting as it is, well, you're already aware of this. We could have nice things but humans are idiots. I'm glad I'm not a human.
Re: (Score:2)
Canadians can zipper just fine.
USA... biggest problem (besides irrational competition when it doesn't matter) is that the current "war on police" means no more traffic enforcement. Drivers have gone from thoughtful to senseless to hazardous.
Re: (Score:1)
Canadians can zipper just fine.
USA... biggest problem (besides irrational competition when it doesn't matter) is that the current "war on police" means no more traffic enforcement. Drivers have gone from thoughtful to senseless to hazardous.
Traffic enforcement in the USA means flashing blue and/or red lights. This causes "on-looker delays".
Re: (Score:2)
In Texas they're strobe lights too.
I was pretty much in tears after driving through Dallas. After I calmed down and thought it through, it wasn't the 800 miles I'd already driven that day or the pain of dealing with roadworks in Dallas, it was the sensory overload caused by strobe lights on police cars.
I could migrate to Texas - friends and jobs already there - but it's not worth the mental pain involved in driving at night.
Re: (Score:2)
the current "war on police" means no more traffic enforcement
Pulling over all of the black drivers does not count as "traffic enforcement".
Re: (Score:2)
I think it's because if two lanes go down to one and the one lane has a speed of 60, then the only two ways to get maximum flow are to either:
a. double the gap between the cars prior to merging so that the merge can happen at 60.
b. merge ahead of the single lane so that the merging speed is around 30 but there's time to accelerate again before the single lane.
a. feels smoothest and can be done with minimal speed changes but, once it goes wrong, causes a merge speed of zero at the pinch point until it sorts its
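A minimal sketch of the arithmetic behind both options, assuming flow per lane = speed / spacing and an arbitrary 150 ft safe spacing at 60 mph (the numbers are illustrative, not measured):

```python
# Toy merge arithmetic, not a real traffic model. Flow per lane = speed / spacing.
def flow(speed_mph, spacing_ft):
    return speed_mph * 5280 / spacing_ft        # vehicles per hour per lane

spacing = 150.0                                  # assumed safe spacing at 60 mph

two_lanes_at_60 = 2 * flow(60, spacing)          # upstream demand:             4224 veh/h
pinch_capacity  = flow(60, spacing)              # one lane at 60:              2112 veh/h
option_a        = 2 * flow(60, 2 * spacing)      # double the gaps, zip at 60:  2112 veh/h
option_b        = 2 * flow(30, spacing)          # merge early at 30, same gaps: 2112 veh/h

print(two_lanes_at_60, pinch_capacity, option_a, option_b)
# Both options match the pinch-point capacity; two full lanes at 60 right up to the
# merge would need twice that, which is why the speed collapses when nobody leaves a gap.
```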
Re: (Score:1)
... It's frustrating when you get to the pinch point at a low speed and then accelerate, because that pretty much proves a lack of planning and anticipation...
The act of accelerating should be what opens the gap behind you for the car in the next lane to merge into. And vice versa.
That means that the acceleration should be moderate and even, so the gaps open as needed.
Unfortunately, many local authorities -lower- the speed limit at the merge instead of increasing it, thinking that it's for "safety". What lowering the speed limit as the cars merge actually does is cause accidents and reduce safety!
In that case drivers have to reduce speed -before- the
Re: (Score:2)
The act of accelerating should be what opens the gap behind you for the car in the next lane to merge into.
Yes but it needs to happen so that by the time you've actually reached the pinch point you're doing the correct speed.
Two lanes of traffic doing 30mph. They merge as they accelerate to 60 which opens up the gap. If they don't start accelerating before they merge then they enter the pinch at 30 - which means that the two lines of traffic behind are doing 15 etc.
(Ignoring increasing gap size between car
Re: Oh... (Score:1)
Re: (Score:2)
traffic optimization *is* hard
Veering off on a tangent here - is there a good introduction to this subject suitable for an advanced newcomer? I've been curious about this ever since I started driving on California freeways regularly in the last few years, and realized how poorly optimized many of the interchanges are for actual humans. (Anyone who has ever driven past Emeryville will know what I'm talking about.) I always pictured it as some kind of particle dynamics problem, except the particles are ir
Re: (Score:2)
Yes, there are some now. I was at the cusp, which is why I'm where I am today. I am a mathematician - I modeled traffic 'on a computer.' Which, to be fair, meant dealing with TB-sized data sets in the late 90s. It's still a fairly young industry. I'd expected to remain in academia but was offered a no-bid contract for the State of Massachusetts by way of my advisor while still doing my thesis - well, preparing to defend it. Needless to say, it was *very* lucrative and expansion started almost as soon as I a
Re: (Score:2)
Re: (Score:2)
Math is hard.
Or, more accurately, math is believed to be hard.
Or, more completely, math is believed to be hard by humans, a species known to be shy on intelligence.
Re: (Score:2)
Sorry, but math *is* hard. Of course, it's also easy. It depends on exactly what problem you're looking at.
Consider Goldbach's Conjecture...something easy to understand, and probably true, but so far unproven. Or the distance between adjacent primes. Given all the primes up to some point, predict the next one. Sometimes you can, but most of the time there's no known way.
But the number of sides on a square is easy. So it can be either. And simple statements can be enormously complex to prove, but you o
Re: (Score:2)
At least the moderators got the joke.
Re: (Score:1)
Math is hard.
Or, more accurately, math is believed to be hard.
Or, more completely, math is believed to be hard by humans, a species known to be shy on intelligence.
The Universe is hard! Things only seem easy when we are accustomed to them, and have forgotten how hard it was when we started... 8-)
Re: (Score:1)
Re: (Score:2)
Oh Hi Barbie! Long time no see!
Re: (Score:2)
Well, it depends. If the proof isn't a constructive proof then it wouldn't weaken current cryptography any more than the possibility of quantum computers does. But it would imply that if you found the right approach it would be a lot weaker, without giving much clue as to what the right approach was. But quantum computers are already being built. (Possibly not useful ones, and perhaps useful ones are impossible, but...)
Re: (Score:2)
This gets mentioned a lot around here and I've never understood. What difference does that equation make either way?
If they are equal, then it means an entire class of problems can be solved more efficiently.*
*(In theory... in practice, since we don't even know what the solution is, it may be that the solution is "efficient" only for extremely large datasets, say, with a quintillion elements. Believe it or not, there are algorithms like this).
Re:Oh... (Score:5, Informative)
To give a brief and hopefully laymen-friendly explanation:
It isn't an equation per se; P and NP are classes of computational problems. Roughly speaking, P is the set of problems that can be solved deterministically in time polynomial in the size of the input, and NP is the set of problems that can be solved non-deterministically in polynomial time (hence the "N"). A more approachable definition of NP for laymen is that it comprises problems for which a solution can be verified (or rejected) in polynomial time. For instance, prime factorization of an integer is in NP because, given an input integer and a list of prime factors, you can verify that the list is actually the prime factorization of the integer in polynomial time (by checking each factor is prime and multiplying them together).
The significance, aside from pure theoretical interest, is that a lot of very interesting practical problems are known to be in NP, and knowing that those problems could be solved in polynomial time would be very useful.
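A minimal sketch of that verification step (my own toy code; the trial-division primality check is only suitable for small numbers, but polynomial-time primality tests such as AKS exist for the general case):

```python
# Toy illustration of "easy to verify": checking a claimed prime factorization of n.
# Trial division keeps the example short; it is not polynomial in the bit length of
# the input, but polynomial-time primality tests do exist.
def is_prime(p):
    if p < 2:
        return False
    d = 2
    while d * d <= p:
        if p % d == 0:
            return False
        d += 1
    return True

def verify_factorization(n, factors):
    product = 1
    for f in factors:
        if not is_prime(f):
            return False
        product *= f
    return product == n

print(verify_factorization(1155, [3, 5, 7, 11]))   # True:  3*5*7*11 == 1155
print(verify_factorization(1155, [3, 5, 77]))      # False: 77 is not prime
```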
Re: (Score:2)
Who are you who is so wise in the way of maths? That's a great description and you should publish it somewhere more meaningful than /. - maybe the maths StackExchange site.
For the moderators - that's a pretty good description. Albeit not as detailed as it could be, but it needn't be, so it seems like it suits the purpose fine. That's one of the best descriptions of nondeterministic polynomial-time problems that I've seen. People have issues with P vs. NP. Truth be told, it's not that easy to grasp at firs
Re: (Score:1)
a lot of very interesting practical problems are known to be in NP, and knowing that those problems could be solved in polynomial time would be very useful.
It gets even more interesting when one considers the already proven fact that any problem in NP can be converted or restated in terms of another problem in NP. So for example, the Subset Sum Problem [wikipedia.org] can be converted into the Boolean Satisfiability Problem [wikipedia.org] or vice versa. What this means is that if one NP-complete problem can be solved in polynomial time with respect to the input, they all can be, since any one can be converted into the one that's supposedly solvable in polynomial time. In layman's te
Re: (Score:2)
No. Every problem in NP cannot be converted to every other problem. (unless P=NP)
There are 'hardest' problems in NP that every problem in NP can be converted into.
But there is an infinite number of 'difficulty' levels and problems cannot be converted to an easier problem.
(you do mention NP complete in your second sentence so I suspect you 'miswrote' rather than misunderstand.)
Re: (Score:1)
Re: (Score:2)
I went hunting for a source that could tell me what polynomials and polynomial time are too.
All these 'great description' comments are from people that have the CS/maths education to understand them.
The rest of us have other skills.
Re:Oh... (Score:5, Informative)
Think of a huge jigsaw puzzle. You might have thousands of pieces that could each go in thousands of possible places, and while you may know tricks to speed up the process, solving it would surely take a lot of trial and error and time. When you did finally solve the puzzle, though, it would only take you a moment to check the picture against the box and prove your solution was correct. That makes it an NP problem.
Another jigsaw puzzle has every piece labeled with column and row, so you can put the whole thing together in one sweep—no trial and error needed. Both solving the puzzle and verifying the solution would be simple. That makes it a P problem.
If P=NP, it would indicate the first jigsaw puzzle has labels on its pieces too, just harder to see. If you can find them, solving the puzzle becomes trivial.
There are a great number of NP problems in the world; if P=NP we could find the solutions to many intractable math problems within our grasp.
Unfortunately, one of those is encryption. Many encryption methods use an NP math problem as a lock and its solution as a key. We assume it would take an impractical amount of time to solve the math problem, so the only way in is by knowing the solution. But if P = NP, breaking that lock (by solving the math) could become as fast as using the key. Oops!
Re: (Score:2)
This is another well worded description. I am impressed with /. tonight. Usually, where mathematics is concerned, I only chuckle at the replies. As a group, we do well with comp sci and physics and even chemistry. Not so much for maths. Not long after I retired, I was invited to and took up the chance to give instruction at the University of Maine at Farmington. While difficult, it's not impossible to give applicable descriptions for difficult mathematics concepts. They're much easier to grasp when put into
Re: Oh... (Score:4, Funny)
I am an inconsiderate clod you... Oh, wait...
In Soviet Russia inconsiderate clods you?
Re:Oh... (Score:4, Insightful)
Re: (Score:2)
Indeed, though large boards often use the same cut many times. Sudoku would have been a better comparison, but I went with the physicality of the jigsaw metaphor.
Thanks for the correction!
Re: (Score:1)
Re: (Score:1)
One of the best analogies for P=NP I've heard, thanks for sharing.
Re: (Score:2)
This gets mentioned a lot around here and I've never understood. What difference does that equation make either way?
Here's my crack at a simple explanation:
P is a class of easy problems that computers can solve quickly.
NP is a class including hard problems that computers can't seem to solve quickly.
People are searching for a fast way to solve the hard problems. It's like a holy grail of computer science.
If someone finds this holy grail, there would be huge consequences. We could quickly solve hard problems like protein folding, which would help us unlock the mysteries of life. Lots of shit would be turned upside down.
in P (Score:1)
If you read the press release from MIT, this discovery was about problems squarely inside P. Namely, they were (for a certain class of problems in P) able to reduce the complexity from N^5 or N^6 down to N^2 or N^3. So there would seem to be no implications for the P=NP problem.
Bad news for them (Score:4, Interesting)
I guess they never heard of the "No Free Lunch" theorem for optimization, which, believe it or not, is the name of a proven theorem by David Wolpert that says the following is rigorously true.
Averaged over all optimization problems, every possible algorithm (that does not repeat a previous move) takes exactly the same amount of time to find the global minimum.
The corollary for this is that: if you show your algorithm outperforms other algorithms on a test set, then you have just proven it performs slower on all other problems.
That is, the better it works on a finite class of problems, the worse it is on most problems.
It is even true that a hill-climbing algorithm blunders into the minimum in the same average time as a hill-descending algorithm. (Yes, that is true!)
Look it up. It's the bane of people who don't understand optimization.
The only things that distinguish one optimization algorithm from another are:
1) if your problems are restricted to a subspace with special properties that your algorithm takes advantage of
2) your performance metric is different from the average time it takes to find the global minimum, for example, something that limits the worst-case time to get within epsilon of the minimum.
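A minimal brute-force check of the averaged-over-all-problems claim, on a toy search space made up for illustration (four points, binary objective values): a fixed-order sweep and a value-dependent adaptive rule need the same average number of evaluations to first observe the global minimum.

```python
# Toy NFL check (illustration only): enumerate every objective function on a tiny
# domain and show that two different non-repeating black-box searches have the same
# average number of evaluations until the global minimum value is first observed.
from itertools import product

X = range(4)                                    # search space: four points

def run(strategy, f):
    """Queries until the global minimum value of f is first observed."""
    target, visited, history = min(f), [], []
    for step in range(1, len(f) + 1):
        x = strategy(visited, history)
        visited.append(x)
        history.append(f[x])
        if f[x] == target:
            return step
    raise AssertionError("a non-repeating search must see the minimum")

def ascending(visited, history):
    # Fixed sweep: always probe the smallest unvisited index.
    return min(x for x in X if x not in visited)

def adaptive(visited, history):
    # Value-dependent rule: start in the middle, then branch on the last observation.
    if not visited:
        return 2
    unvisited = [x for x in X if x not in visited]
    return min(unvisited) if history[-1] == 0 else max(unvisited)

for strategy in (ascending, adaptive):
    costs = [run(strategy, f) for f in product((0, 1), repeat=len(X))]
    print(strategy.__name__, sum(costs) / len(costs))   # identical averages
```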
Re:Bad news for them (Score:5, Interesting)
To be fair, the actual article doesn't claim to violate the no-free-lunch theorem because it only applies to "some" problems. It's the headline that violates the no-free-lunch theorem. But one aspect of the no-free-lunch theorem that usually proves to be insanely diabolical is that it turns out it's usually VERY hard to define which problems an algorithm will do best on in general. Often you can see a specific set for which it is obvious. But as you complicate things, the obviousness goes out the window. For example, if you know your surface is convex and has a very simple shape like a multidimensional parabola, then it's trivial to hill-descend to the bottom. But once you start putting in multiple minima and arbitrary discontinuities it becomes hard. Interestingly, in high-dimensional spaces knowing something is smooth isn't very useful, as smoothness only removes a low number of degrees of freedom - leaving behind a smaller but still high-dimensional space.
Thus if there is a breakthrough here it's not the algorithm, it would be the ability to specify what kinds of problems this will work well on!
Re: (Score:3)
Re: (Score:2)
Indeed. I've given it a bit of a scan at this point (posted up-thread) and it appears that, with first glance, the best way to describe it (to this crowd) is optimized sub-routines in a program. It looks like optimization done where the additional data is being inserted, calculated, and run. It's sort of like, I guess, object oriented programming (for lack of a more accurate description). I can see this coming in handy in traffic modeling and will give it a good, more thorough, read later to see what I can
Re: Bad news for them (Score:2)
The fact that people obviously versed in the art are commenting on Slashdot, rather than excitedly hunching over the algorithm, putting it in whatever they work on, is telltale sign that it's not as revolutionary as the gushing summary states.
Re: (Score:2)
This article is the real deal as they do address what types of optimization problems and give bounds on the speedups over previous techniques. In addition, cutting plane techniques have proven to be very successful at solving integer programming problems and have yielded good/optimal solutions to large NP-hard problems such as traveling salesman.
As for the NFL theorem, it's not very surprising. By looking over all optimization problems you are considering an enormous set of functions. The "average" funct
Citations of the No Free Lunch theorems (Score:2)
https://en.wikipedia.org/wiki/... [wikipedia.org]
Re:No it's not a special case (Score:4, Insightful)
Averaged over all problems, the time it takes to figure out which special case one has, plus the time to solve that special case, has to be the same as any other algorithm. The escape clause is if, when you say "cutting plane," you are intrinsically restricting the problem to a special case. In that case algorithmic speedups are plausible. Likewise, if cutting plane is a class of algorithm and you are comparing variations on that particular class of algorithm, then speedups are again plausible. But if all possible problems are addressed as cutting-plane problems, then no, it is not possible for there to be any average speedup.
But please realize that all is not lost. It is very likely that all problems of interest to humans are a tiny set of the space of all possible problems. It may well be that we are only interested in a very special subspace. I'd even bet on it. But defining what that subspace is will be hard.
Are you Jeremy Clarkson? (Score:1)
"Averaged over all problems, the time it takes to figure out which special case one has plus the time to solve that special case has to be the same as any other algorithm."
In practice you know what type of problem you're solving, and use the appropriate approach.
So for example a Neural Network 'fast solve' is currently derived as a back propagation gradient solver. The math is resolved like that and then coded like that. My stock predictor algo is a cutting plane solver because I can exclude large amounts o
Re: (Score:2)
The NFL theorem relies on all problems being equally likely. In practice, there are many known classes of problem where known algorithms do close-enough-to-optimal on problems which turn up in practice, and also can t
Re: (Score:2)
The proof of correctness of the Metropolis-Hastings algorithm (to pick one example) relies on the fact that the transition probability for its proposals is symmetric (that is, Q(x|y) = Q(y|x)).
That's the Metropolis-Hastings algorithm in chapter 1. The one in the rest of the book doesn't assume that. However, it does assume that the probability is non-zero.
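For anyone curious what a symmetric proposal buys you in practice, here's a minimal random-walk Metropolis sketch (my own toy, not from any particular book): with Q(x|y) = Q(y|x), the Hastings correction cancels and the acceptance ratio is just the ratio of target densities.

```python
# Minimal random-walk Metropolis sampler with a symmetric Gaussian proposal,
# targeting an unnormalized standard normal density. Illustration only.
import math, random

def target(x):
    return math.exp(-0.5 * x * x)               # unnormalized N(0, 1), always positive

def metropolis(n_samples, step=1.0, x=0.0):
    samples = []
    for _ in range(n_samples):
        proposal = x + random.gauss(0.0, step)  # symmetric: Q(x|y) = Q(y|x)
        # Accept with probability min(1, target(proposal)/target(x)).
        if random.random() < min(1.0, target(proposal) / target(x)):
            x = proposal
        samples.append(x)
    return samples

draws = metropolis(50_000)
print(sum(draws) / len(draws))                  # should land near 0, the target mean
```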
Re: (Score:3)
This "non-repeat" issue is a red herring. The proviso on "not repeat" is not meant to exclude reversable trajectories. It simply means that going back to an earlier state will be a free-pass in terms of counting moves. We won't add that to your count. So this has nothing to do with the applicability of the no-free-lunch theorem
Re: (Score:2)
That makes more sense. I figured it couldn't be that simple...
Re:Bad news for them (Score:4, Insightful)
I totally agree with you. I indeed left out the issue of incompressible optimization functions and their infeasibility in memory. More generally, I also believe that the NFL theorem does not imply despair but rather that the problems that interest humans are a tiny subspace of all possible problems. On this subspace better average performance is entirely plausible. The rub is that to make that claim you have to be able to specify what the subspace of interest to humans is. I think you might be able to define that in terms of algorithmic complexity. But that alone doesn't tell you if your algorithm must be better than another one on all problems of a specified maximum complexity. Thus any time someone says their algorithms outperform without telling us in what subspace they achieve this, there is no proof of superior performance. Since the paper actually does make this claim it's confusing to sort out how they are limiting the problem. Likely this is my own failing at understanding their mathematical definitions.
To be more complete: the NFL theorem ignores the computational cost of deciding the next move in the search as well, so really the NFL is counting the average number of guesses required, not the time.
Your statement that NFL relies on all problems being equally likely is very close to equivalent to my statement that the subspace of problems of interest to humans is smaller than the space of all problems. So I think we're in agreement.
The NFL does not imply optimization is hopeless. It just says a claim of improved speed is dubious until you tell us what subspace of problems you improved it on. This is what I stated to begin with. If an algorithm is faster on some problems, then it's slower on most other problems.
Re: (Score:2)
As you know, patterns in the real world have structure to them and have dips and valleys, characteristic shapes, and repeating fractal elements. I think what you don't appreciate enough is that there are classes of optimization algorithms (e.g: genetic algorithms or neural networks) which have a very GENERIC way of working with MANY kinds of search spaces, some of which one can't even imagine.
What you say about
Re:Bad news for them (Score:5, Insightful)
Right. I don't disagree. The subtlety of the NFL theorem is not in its crass conclusion that brute force is just as good as anything else. It's in the realization, first, that algorithms only work well on subspaces, and that the real trick is not in creating the algorithm but in being able to say what subspace it works well on.
To see that, consider the following thought experiment. Imagine we divide the space of all problems into many subspaces, and on each of these subspaces we have a very fast algorithm. Surely then this defeats the NFL theorem? No. The time it takes to decide which subspace solver to use, plus the time it takes to use it, would be the same as brute force.
Thus, any time one says an algorithm is better, if you can't say when it's better, one hasn't quite solved the issue.
The escape clause is that it's very likely the class of problems humans care about is a very small subspace of all problems. But defining what that means is hard.
Re: (Score:3)
It's naive to believe that such a thing is universal to an entire problem set, and is as solid as colloquially suggested.
But, more than that, optimisations like this often work by transforming one type of problem into a related, but different, equivalent problem. In doing that you are indeed shifting problems between problem spaces and, thus, between algorithms of different optimisation. Though there is some conversion involved, it is by no means guaranteed that such conversion is just as hard as solving
Re: (Score:2)
Yes I did explain in the follow up post to my original post. I said that the paper made no such claim. Instead it was the headlines that made the claim.
Re: (Score:2)
Marshal and Hinton did some work on this. It's a good read. I'll find it for you.
http://arxiv.org/pdf/0907.1597... [arxiv.org]
An interesting quote in 1.1 would be:
Subsequently it was shown that the No Free Lunch theorem only holds if the set of objective functions under consideration is closed under permutation (c.u.p.).
It's worth a read. IIRC, it was cited later by Hinkley and further work has been done since then but, well, I'm kind of lazy at the moment. Either way, it's worth looking at if you're curious and have been out of academia for a while.
Re: (Score:2)
Thanks. I've read that. I think if you look at some of my other responses above, you will see that I agree that c.u.p. problems are not of interest to humans. The real issue is not that. The issue is defining which algorithms are useful for which kinds of subsets of problems (each of which is non-c.u.p., but in a different way). That is, unless you can say what your algorithm is good for (clearly), then all you are doing is fracturing the subspace of all problems into subsets, but you still don't know w
Re: (Score:2)
Oh, I agree entirely. I just thought you might find it interesting if you'd never read it. I was kind of fascinated by their process but their outcome was as anticipated - they didn't conclude anything "new" or "revolutionary" really. Also, to be fair, I'm a human (sort of) and this is of interest to me, albeit reflectively. My company modeled traffic - vehicular, and then expanded to model pedestrian traffic as well. (To give a small hint - I sold in 2007 and, obviously, retired. If you know the market then you can
Re: (Score:2)
The coast of the United Kingdom is exactly 43 long, but I'm still working out what undiscovered units that is in. :-)
Re: (Score:2)
Look at the big brain on Brad.
Anyway, this has nothing to do with no free lunch... read the goddamn article if you want to know what the authors are actually saying. http://arxiv.org/abs/1508.0487... [arxiv.org]
It's an asymptotic speedup for certain classes of problems.
Re: (Score:2)
By the title of the paper -- A Faster Cutting Plane Method and its Implications for Combinatorial and Convex Optimization -- I would say that they are not trying to provide a truly general optimization algorithm, but one that is specific to combinatorial and convex optimization. Hence, the NFL theorem is not violated.
TFA headline gives the impression that they would be talking about a truly general algorithm, but this is actually a manifestation of the "No Competent Editor" theorem.
Re: (Score:2)
I guess they never heard of the "No Free Lunch" theorem for optimization
What makes you say that?
Averaged over all optimization problems every possible algorithm (that does not repeat a previous move) takes exactly the same amount of time to find the global minimum. The corollary for this is that: If you show your algorithm outperforms other algorithms on a test set, then you have just proven it performs slower on all other problems.
The only things that distinguishes one optimization algorithm from another i
Re: Bad news for them (Score:2)
The title of their paper says it's a combinatorial and convex optimization algorithm.
Re: (Score:1)
... 1) if your problems are restricted to a subspace with special properties that your algorithm takes advantage of ....
Any Engineer knows that -all- problems are of form 1. Solving things for all possible cases is a Mathematician's fantasy! 8-)
Smartphone thickness (Score:3)
Is not an attribute to penalize. Thicker phones are sturdier (less prone to bending, among other things), easier to grip (more surface area on the edges, where you grip it while talking), and feel more substantial. Phone cases offer protection for the device and mitigate all these shortcomings.
Looking back, Zoolander only got one dimension right when it comes to phone size. The joke now would be a phone that is the size of a sheet of paper (and the 13.9" screen has 6204ppi).
Re: (Score:2)
Re: (Score:2)
I don't understand. (Score:4, Funny)
I read the summary. I didn't understand it. Then I read the article. I didn't understand it either.
Re:I don't understand. (Score:4, Informative)
Consider minimizing x^2 when x can take values from -10 to 10 (we know the answer is 0, since we only consider real-valued numbers). If we wanted to solve this problem, there are several approaches; some example approaches are: randomly try a lot of different values, set the derivative to zero, or try a cutting-plane algorithm. In general, we might not know the analytical expression for the function we are trying to minimize (or it might be too complex) so we can't really find the derivative efficiently. Derivatives can also be computationally expensive to compute, so let's ignore that approach.
What we can do is to say let me find a solution for which the function is less than some threshold t, and then keep reducing t till I can't find any more solutions; this is what the article meant by finding a smaller circle inside a larger one (for each value of t, I try to find solutions that are smaller than t).
What cutting planes do is chop up the original function into pieces - in some pieces, I know the value will be much larger than my threshold (so I don't have to search in those pieces), and in others it might be smaller - I focus on searching those pieces to find a value that is smaller than my threshold (after which I can reduce the threshold and try again). This is what (in a simplistic sense) cutting-plane algorithms do; they chop up my search space.
How we select the points for chopping is crucial - with bad choices (say, choosing one point after another at random, or trying points very close to each other), I spend a lot of time chopping unnecessarily (or not benefiting much from chopping). We also want to make sure our cuts really do divide the problem into pieces I can discard searching, and those discarded pieces should (ideally) be quite large. Until this work, the time taken to decide where to chop was n^3.373; they brought that down to n^3 (where n is the number of variables that the function I am trying to minimize takes as inputs).
They also said that for special classes of functions, they can really improve the total computation time significantly (from n^6 to n^2).
I'm glossing over (and am certain I've got some details wrong) many issues to give a taste of the big-picture idea of cutting-plane approaches in general; there have been decades of work on these problems that you can read (I recommend anything written by Prof. Stephen Boyd as an introduction to some of this research).
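A minimal 1-D sketch of the chopping idea described above (my own toy, not the method from the paper): for a convex function on an interval, the slope at the midpoint tells you which half cannot contain the minimizer, so you can discard it.

```python
# Toy 1-D "cutting" on a convex function: query the midpoint, use the slope there
# to discard the half-interval that cannot contain the minimizer, and repeat.
def cut_minimize(grad, lo, hi, tol=1e-6):
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if grad(mid) > 0:       # increasing here, so the minimizer lies to the left
            hi = mid            # cut away the right piece
        else:                   # decreasing (or flat), so the minimizer lies to the right
            lo = mid            # cut away the left piece
    return 0.5 * (lo + hi)

# The example from the comment: minimize x^2 on [-10, 10].
print(cut_minimize(lambda x: 2 * x, -10.0, 10.0))   # within ~1e-6 of the true minimizer, 0
```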
Re: (Score:2)
Thank you for trying to explain it to me. I guess I should say I don't understand how the theoretical meets the practical here. Then again, I'm just a web programmer.
The theoretical. Sure, I read the article about a circle and trying to find the better, smaller circle within it, and I can imagine slicing into the circle to find it. No problem there.
The practical. Also, I noticed how they started off with real-world problems like designing the thinnest, most durable smartphone with the longest battery life.
B
The two key contributions (Score:4, Interesting)
Some very interesting results, but the two key contributions are (almost verbatim from the article):
1) With the best general-purpose cutting-plane method, the time required to select each new point to test was proportional to the number of elements raised to the power 3.373. Sidford, Lee, and Wong get that down to 3.
2) And in many of those cases (submodular minimization, submodular flow, matroid intersection, and semidefinite programming), they report dramatic improvements in efficiency, from running times that scale with the fifth or sixth power of the number of variables down to the second or third power.
So this seems to be a better cutting plane approach that improves the cutting process by reducing the time to find the next test point (in general), and for certain structured problems (like SDPs) this approach reduces the computation time significantly.
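Back-of-envelope only (my own numbers, not benchmarks from the paper): here is what those exponent drops mean as the number of variables n grows.

```python
# Rough scaling comparison for the exponent reductions quoted above.
for n in (100, 1_000, 10_000):
    general = n ** 3.373 / n ** 3     # old vs. new cost of picking each test point
    special = n ** 6 / n ** 3         # fifth/sixth power vs. second/third power cases
    print(f"n={n}: general-case factor ~{general:.0f}x, structured-case factor ~{special:.0e}x")
```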
This does raise some interesting questions, such as: is there a limit to how fast you can (in a general problem, not using a specific problem's structure) find the next test point? Even if we don't know what algorithm gets us there, it would be useful to have a bound for future research.
Re: (Score:2)
From my reading of the English sections of their paper (someone had fun with LaTeX equations in that one), I see worst-case running times for problems where it is assumed an Oracle (a subroutine) can provide the answer to a certain question in constant time, or at least a low-complexity amount of time. The running times are given *with respect to an Oracle*, meaning that the running time of the Oracle is not considered in the worst-case running time.
It is known that for certain specific kinds of problem
Memory vs. speed? (Score:1)
Algorithms can be fast, or use a low amount of memory. Doesn't the bucketing used here require additional memory over other methods? In the end, is this just an optimization of speed while requiring more memory?
What use are the tables comparing different algorithms when only the computational complexity is given, but not the memory requirements?
Not backing up their claims with real runtimes!! (Score:1)
One of the tricks that people play when they claim to have parallelized something or improved computational complexity is that they don't back up their claims with any real runtimes. They provide a theoretical evaluation, but they haven't actually shown that anything is actually FASTER in a practical way. Maybe it is, maybe it isn't. They haven't made a proper case for it. Maybe I missed it. I'm also not sure they actually coded it up and tried it. As a researcher who puts time and effort into actuall
Wrong aim (Score:2)
All these algorithms are exponential for optimal solutions. (Well, unless P=NP, but that looks more and more unlikely all the time. And even if it were true, a high-exponent polynomial is not really better in practice anyway.) Hence _nobody_ uses these algorithms for optimal solutions in practice. What is used are approximation algorithms that provide a good solution, and there the quality measure is not "speed", but speed vs. result quality.
This may be nice theoretical research, but the practical impact is rathe
Re: (Score:2)