$50,000 To Solve the Most Complicated Puzzle Ever
An anonymous reader writes "A team from UC San Diego is using crowd-sourcing as a tool to solve the most complicated puzzle ever attempted, which involves piecing together roughly 10,000 pieces of different documents that have been shredded. (The challenge is designed to reveal new techniques for reconstructing destroyed documents, which are often confiscated by troops in war zones). The prize for solving this jigsaw puzzle is $50,000, which the UCSD team has decided to share among the people who participate. If they win, you would also receive cash for every person you recruit to the effort! The professor leading the team, Manuel Cebrian, won the challenge two years ago, so his odds of winning again are great"
only 50k for a problem that complex? (Score:5, Insightful)
only 50k for a problem that complex? If you could solve this problem, I say copyright and make millions off of the algorithm.
Fifty cents a person (Score:4, Insightful)
So, it's essentially worth less than a pack of gum.
Re: (Score:2)
Hey, it's a month's wage in some poor countries, start building a document rebuilding plant somewhere in backwater Africa.
Re:Fifty cents a person (Score:4, Informative)
Hey, it's a month's wage in some poor countries, start building a document rebuilding plant somewhere in backwater Africa.
Sorry to mix actual data in your First World prejudices, but the GDP per capita of the poorest country is over $300 [wikipedia.org], so monthly it would be around US 18$.
There are only 15 countries with a GDP per capita below $100 a month.
Re: (Score:3, Insightful)
Hey, it's a month's wage in some poor countries, start building a document rebuilding plant somewhere in backwater Africa.
Sorry to mix actual data in your First World prejudices, but the GDP per capita of the poorest country is over $300 [wikipedia.org], so monthly it would be around US 18$.
There are only 15 countries with a GDP per capita below $100 a month.
Right, because income is evenly distributed there, and there aren't dirt poor people living off almost nothing. Plus, you're using PPP GDP per capita, rather than GDP per capita at nominal exchange rates. If I pay someone in another country $1, they get to buy what $1 buys in their country, not what $1 buys in the USA.
Sorry to mix actual facts into your misrepresented data.
My girlfriend is a billionaire! (Score:2)
My girlfriend lives about 1.5 blocks from Warren Buffett, so if you take the per capita for the two blocks between Warren and her, she's a billionaire! :)
Unfortunately, she's having a hard time coming up with the money for a transmission for her Kia Rio. So statistics suck and are very misleading...
Re: (Score:2)
This is almost the same as the DOD $50,000 "challenge" to recover shredded docs remember?
So what did they do, spin it from the "Bad Dept of Defense" to "a college group"?
AC nailed it, tech that can do that is "worth" billions in lifetime revenue, so what's with this $50,000 a piece?
Re:only 50k for a problem that complex? (Score:5, Informative)
Re: (Score:2)
Copyright is automatic. What prevents them from taking the $50000 and then making millions off of it?
Re: (Score:3)
This solution is a bit of a hack - it's not what the $50,000 is actually meant for, they're looking for an everyday computer-based method. Fair play to them and all though.
As if relevancy has to do anything with it (Score:2)
Re: (Score:3)
The US Government does not get a free ride when it comes to patents. They may disregard a patent for national security purposes. For example, when the anthrax attacks were underway the maker of the patented first line drug did not have sufficient quantities of the drug and the USG basically said, "then make them or we will do it for you and not give you a licensing fee." They did not do this, but that is the type of situation where they can override a patent, not like, "hey, nice shiny thing... I'll just t
Re: (Score:2)
Better send the $50K to Iran, as they were able to do it in 1978...Oh, wait, that involved computors, not computers.
Doesn't IBM have some algorithmic tech that can help with this? I imagine it involves scanning each strip, and figuring out a way to do some sort of edge analysis of each strip, for each side. Do some sort of FFT or DCT for the edges, and then come up with a way to join similar strips' edges for each side of the strip together. Then, run the joined images of likely sets of strips through an OC
Re: (Score:3)
Re: (Score:2)
Better send the $50K to Iran, as they were able to do it in 1978...Oh, wait, that involved computors, not computers.
Actually, people were called computers before computers started meaning actual machines. They did the same job, and were usually women.
Re: (Score:2)
Ah, reminds me of a story by a certain Mr. Asimov
---This gentlemen, is Myron Aub---
Re: (Score:2)
I'm not sure copyright would give you the protection you'd want. Copyrighting an algorithm is almost impossible, depending on which national legal jurisdiction you're in. And patenting could be expensive, and that is not foolproof either (especially for a little guy like myself).
Personally, I'd offer this as a paid service online, and I'd let whichever government had jurisdiction over me -- buy me out (before they just confiscate it away from me without proper compensation).
Re: (Score:2)
On second thought, if I had the software for doing that, I'd offer it as a paid service online, but then I'd pretend I had a thousand low wage workers in India printing out each little strip of paper and reassembling it painstakingly by hand. This way I could count each worker as a separate contractor on my invoices and charge a corresponding large commission for each.
Re: (Score:3)
Copyrighting an algorithm is almost impossible, depending on which national legal jurisdiction you're in. And patenting could be expensive
Oh yes, we developed this lovely little algorithm, we want to patent it. Can someone spot us $50k to pay for patent bills?
Hey, maybe that's where you could use that $50k in prize money?
Re: (Score:2)
only 50k for a problem that complex? If you could solve this problem, I say copyright and make millions off of the algorithm.
Or it could be like a paper on pursuit curves that gets classified quickly.
In Falcon and Snowman, there was a scene of paper and water being put in a blender to shred the paper.
Huh... complex problem!? (Score:5, Interesting)
Re: (Score:2)
Except crowdsourcing isn't really an algorithm. You're just getting thousands of eyeballs helping to mix/match the pieces like a giant jigsaw puzzle. Not exactly something you can sell as a product.
Re: (Score:2)
Doesn't scale (Score:4, Insightful)
The rules should require that the same method that solved the initial puzzle be successfully applied to 10 more shredded documents, to weed out methods that don't scale.
Re:Doesn't scale (Score:5, Informative)
Why 10 and not 4?
(I ask, because the contest requires 4 progressively harder documents be solved, with a declaration attached that says this is explicitly to filter out any methods that won't scale).
Re: (Score:2)
actually 5 progressively harder puzzles. I count some 6k pieces in the last puzzle in 20 Tiff files.
An interesting problem and I am making progress.
Re: (Score:2)
Sorry, to be clear I was saying 4 have to be solved after the first one, in order to frame it the same way as the parent.
10,000 documents for $50,000 reward? (Score:4, Funny)
Re:10,000 documents for $50,000 reward? (Score:5, Interesting)
Re: (Score:2)
If there is an offline version of this, it involves a garbage bag full of shredded 5$ bills and some scotch tape.
Yeah, but the question is whether they shred each bag separately, or more likely shred the notes en masse, jumble them up, then fill each bag with 5lbs of arbitrary shredded bits. That'd mean that unless each note was lined up with the shredding edges identically you'd be unlikely to have all the bits required to complete the vast majority of notes.
(I know, I'm overthinking the reply to what was a joke!)
For those who didn't get the reference [moneyfactorystore.gov].
Do you reckon this is why they're charging a whopping $45 for what is otherwise (and still is, really)
Shredding vs. burning (Score:3, Interesting)
I never really understood the purpose of shredding documents. If your documents are that sensitive, why not just burn them, leaving no trace of legible text? It seems like it would be cheaper, easier and faster too. Just throw them in a barrel outside, put a little lighter fluid in, and drop a match. Why is this not common?
Re: (Score:3)
Re:Shredding vs. burning (Score:4, Interesting)
Indeed that is what became of classified material I have dealt with. Shredded using a military cross-cut shredder (output pieces smaller than 1x10mm), mixed thoroughly, and then incinerated using a purpose-built belt-fed, gas-fired machine.
Re: (Score:3)
Indeed that is what became of classified material I have dealt with. Shredded using a military cross-cut shredder (output pieces smaller than 1x10mm), mixed thoroughly, and then incinerated using a purpose-built belt-fed, gas-fired machine.
I bought a cheap home shredder about a year ago, and it crosscuts. Makes reassembly unimaginably more difficult. (I think mine produces more like 2mm wide, but still.)
And if you don't have an incinerator, just pour the crosscut confetti into a recycle bin where all your other documents go. If you think reassembling one document would be difficult, consider starting from a bucket where the scraps of dozens or hundreds of documents are mixed indiscriminately.
Re: (Score:3)
Indeed that is what became of classified material I have dealt with. Shredded using a military cross-cut shredder (output pieces smaller than 1x10mm), mixed thoroughly, and then incinerated using a purpose-built belt-fed, gas-fired machine.
Actually, a quick check of online regs [doe.gov] states that the maximum size must be 1mm x 5mm. When you use an approved shredder, the pieces are very small, producing thousands of bits per page. The magnitude of this challenge is huge.
In some cases the challenge will be to determine just which side is up. If the document was double sided, then the order of difficulty will increase greatly.
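As a rough check on the "thousands of bits per page" figure above, here is a back-of-the-envelope sketch. It assumes a US Letter page (8.5 x 11 in, i.e. 215.9 x 279.4 mm) and that every particle is exactly the 1mm x 5mm maximum quoted in the comment; neither assumption comes from the regs themselves, and real shredders lose some paper as dust.

```python
# Assumption: US Letter page, 8.5 x 11 inches expressed in millimetres.
PAGE_WIDTH_MM = 215.9
PAGE_HEIGHT_MM = 279.4

# Maximum particle size quoted in the comment above: 1mm x 5mm.
PIECE_WIDTH_MM = 1.0
PIECE_HEIGHT_MM = 5.0


def min_pieces_per_page(page_w, page_h, piece_w, piece_h):
    """Lower bound on particle count: page area divided by the largest allowed particle area."""
    return (page_w * page_h) / (piece_w * piece_h)


pieces = min_pieces_per_page(PAGE_WIDTH_MM, PAGE_HEIGHT_MM,
                             PIECE_WIDTH_MM, PIECE_HEIGHT_MM)
print(f"At least {pieces:,.0f} particles per single-sided page")
```

Under those assumptions a single page already yields on the order of ten thousand particles, and smaller particles only push the count higher.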
Re:Shredding vs. burning (Score:5, Informative)
I never really understood the purpose of shredding documents. If your documents are that sensitive, why not just burn them, leaving no trace of legible text? It seems like it would be cheaper, easier and faster too.
What happens is that the top and bottom pages and edges get scorched, but the middle part with the print remains largely intact.
Just throw them in a barrel outside, put a little lighter fluid in, and drop a match. Why is this not common?
Thus speaks someone who hasn't tried to burn more than a couple of sheets of paper.
It takes time to burn more than a few pages at a time. Or an extremely hot fire. Sorry, Mr Bradbury, 451 F won't do it, unless you can wait for weeks.
Re: (Score:2)
I have operated a purpose-built incinerator designed to burn documents. We collected the material in burn bags, and operators were encouraged to crumple up the paper. Each night, the bags were fed into the incinerator, which used diesel fuel to start the burn. A couple of times per hour, the ashes were mixed and more bags introduced. It took time, but I can assure you that all of the docs were destroyed each night.
Eventually, environmental issues shut down the incinerators, and we moved to shredding. It
Re: (Score:2)
What happens is that the top and bottom pages and edges get scorched, but the middle part with the print remains largely intact.
Problem solved [amazon.com].
Re:Shredding vs. burning (Score:5, Informative)
1. Burning is inconvenient for small volumes of paper.
2. Burning is essentially illegal for large volumes of paper (business scale; Clean Air Act permits).
3. Fireplaces are not as common as they used to be; outdoor burning is illegal in most cities.
4. People can be idiots [insweb.com] when using fire outside of a fireplace or permanent fire pit.
5. DIOXIN! [ny.gov]
Shredding is like a residential door lock -- good enough to discourage a casual person who is too curious for their own good. Secure commercial shredders rely upon sheer volume and decent mixing (300 "particles" per page x 3 tons of paper dumped at a recycler is a decent level of obscurity) or "hydro-pulping" for the demanding (shred then pulp at paper mill -- good luck reassembling the fibers even if you get to them before bleaching).
Re: (Score:2)
Re: (Score:2)
Why not roll a group of documents up as tight as possible into a cylinder and somehow automatically feed that against some 40 grit sandpaper?
Re: (Score:2)
Re: (Score:2)
Why are the documents shredded to begin with? (Score:5, Insightful)
Re: (Score:2)
Don't the warlords have access to fire? I'm pretty sure that brings about a thoroughly unrecoverable destruction of the documents...
Impractical: I am pretty sure that most offices where this would actually be used have rules against lighting fires indoors. Shredding provides a way to dispose of any document in any circumstance.
Re: (Score:3)
Impractical: I am pretty sure that most offices where this would actually be used have rules against lighting fires indoors.
And rules like that are so important to follow when the enemy is at the gates. Make sure you wipe your feet too, so they won't come to a dirty floor.
Re: (Score:2)
Don't the warlords have access to fire? I'm pretty sure that brings about a thoroughly unrecoverable destruction of the documents...
Impractical: I am pretty sure that most offices where this would actually be used have rules against lighting fires indoors. Shredding provides a way to dispose of any document in any circumstance.
If we're talking about warlords and other such types, who don't see a problem with using rape as a tool for war, I don't think they would be that worried about lighting a fire in a place where others might think it to be less-than-kosher. Hell if a warlord is trying to run away from something, they may well just set the entire building alight in hopes that the documents and everything else inside will go up in smoke.
SHHH!! (Score:5, Funny)
Everyone in the civilized world is worried about what will happen if terrorists gain access to this technology. That's why most nations have signed the Fire Non-Proliferation Treaty, and it's why the International Combustive Energy Agency is working round-the-clock to keep this technology from falling into the wrong hands (while somehow also promoting civilian use of combustive energy).
You've got to be a lot more careful about talking about such restricted technology and its possible uses.
Re:SHHH!! (Score:5, Funny)
See also United States v. Prometheus for more about the penalties for divulging such classified information.
Re: (Score:2)
Re: (Score:2)
(*) named Henery.
Re: (Score:3)
Eternal aquiline palinauxohepatectomy (eagle-based[aquiline] removal [ectomy] of a regrowing[palinauxo-] liver[hepat-]).
Re: (Score:2)
I think you'd have gotten more funny mods without the explanation.
Re: (Score:2)
Only when combined with agorahomopyronecrobestiality.
Re: (Score:2)
The major problem in the plans to keep fire out of terrorist hands is the standard practice of dropping incendiaries on their houses, which essentially gives them fire.
They are working on plans to combine blimp technology and water for a more childish approach. We will have to wait and see how that works out. It may spur innovation in the area of massive lift fans to lift the balloon and transport it over the hundreds of miles to target. Or the world's largest catapult. Either way new science and technology will
Re: (Score:2)
Only if you do it right. A sloppy burn job leads to entire pages of recoverable data. A confetti cut shredder will make the data damn near unrecoverable no matter how the paper is fed in.
Re: (Score:2)
Why not burn your shredded documents?
Time.
But tossing the shred out the window ought to do it. With even a small percentage of the shred unrecoverable, the puzzle becomes a lot harder.
Re: (Score:2)
Re:Why are the documents shredded to begin with? (Score:5, Informative)
Try burning shredded paper (Score:2)
Re: (Score:2)
You can shred the documents, make some briquettes, and put them on the bonfire come Bonfire Night. Or, if you have a wood burning stove or hearth, you get free fuel.
Re: (Score:3, Funny)
Re:Why are the documents shredded to begin with? (Score:4, Funny)
>>Don't worry, the next contest will involve a $75,000 prize to reverse entropy
I hear students from UCSD have already summoned a demon to solve this puzzle.
Name's Maxwell, something like that...
Confused? (Score:5, Insightful)
Is it just me, or does this make little to no sense?
You cannot scale putting together puzzle pieces because the same person needs to both see two pieces that go together and recognize that they match.
So yes more people help, but if there are 10 million pieces then the average person would have to look at over 1 million pieces before they have even seen two that go together.
And this seems like a very easy thing to computise.
You digitize the shredded documents.
You run a program that looks for similarities around the edges.
You stick likely candidates together and either ask for human confirmation or run a text recognition algorithm to see if the result makes sense.
Now this becomes harder if the direct edges of many of the shredded parts are blank, but it is still more than doable if you use spacing recognition (calculate how big a space is in this document and look for the corresponding amount of missing space on the other side), line up the text rows, and apply some basic word statistics (if you see "he ...", for example, you are likely looking for a "T" on the right side of another strip).
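The steps in the comment above (digitize, compare edges, stick likely candidates together) can be sketched roughly as follows. This is a toy illustration, not anything any contest team is known to have used: the strip representation, the `edge_cost`, `greedy_order` and `join` helpers, and the made-up 4x8 "page" are all hypothetical, and the code assumes clean vertical strips whose left-margin strip is already known.

```python
# Hypothetical sketch: each strip is a list of strings, one string per pixel row,
# where '#' is ink and '.' is blank paper. Real scans would need grayscale
# comparison and tolerance for noise; this toy version compares exact pixels.

def edge_cost(left_strip, right_strip):
    """Count rows where left_strip's right edge differs from right_strip's left edge."""
    return sum(1 for a, b in zip(left_strip, right_strip) if a[-1] != b[0])


def greedy_order(strips, start):
    """Chain strips left to right, always taking the unused strip with the lowest cost.

    `start` is the index of the strip assumed to be the left margin.
    """
    order = [start]
    unused = set(range(len(strips))) - {start}
    while unused:
        last = strips[order[-1]]
        best = min(sorted(unused), key=lambda i: edge_cost(last, strips[i]))
        order.append(best)
        unused.remove(best)
    return order


def join(strips, order):
    """Glue the strips together row by row in the given order."""
    height = len(strips[0])
    return ["".join(strips[i][row] for i in order) for row in range(height)]


# A made-up 4x8 "page" cut into four 2-pixel-wide strips, then shuffled.
page = [".##.....", "...##...", ".....##.", ".......#"]
strips = [[row[c:c + 2] for row in page] for c in range(0, 8, 2)]
shuffled = [strips[2], strips[0], strips[3], strips[1]]

order = greedy_order(shuffled, start=1)  # index 1 holds the left-margin strip
for line in join(shuffled, order):
    print(line)
```

Note that when edges are blank, many candidates tie at the same cost and the greedy choice becomes arbitrary, which is exactly where the spacing and word-statistics ideas (or a human check) would have to take over.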
Re: (Score:3)
If it were so easy to computise, why haven't you done it yet and taken the prize?
My guess it's not that easy. And that it also doesn't have to do with computing horsepower as such.
Then about the text recognition and analyses: don't forget that there are more languages than just English. As a matter of fact most people in this world use a language other than English in their daily life. I for one use four languages, of which three daily and the fourth at least weekly. And English is my second language. You
Re: (Score:2)
I think the idea is that you use the software to churn out tons of small patches of potential matches, which then get passed out to the humans for verification. If the humans score the patch highly, those used pieces are considered spent, and down-weighted from any further matches, while the software bumps up to the next level and starts weaving together the larger patches.
Since the software is only making small patches, the number of combinations stays within manageable levels, and the humans are better a
Re: (Score:3)
I'm fairly sure this problem is NP-complete, which makes it anything BUT trivial to compute. It might be easy to represent computationally, but to actually calculate the result is extremely hard. In fact, finding an efficient algorithm for it would make you incredibly rich and possibly dead.
Re: (Score:2)
It very well might be NP-complete (but probably more of a fuzzy NP-complete, as there are certain assumptions you can make about the content and it cannot be compressed down to a purely mathematical problem), but I think that if that were so, it would be NP-complete for humans as well, as there is no best guess and no good-enough solution. And no human could even hold enough of the puzzle in their head to attempt any kind of effective solution.
So yes it might be hard to solve in a reasonable time with a
Re: (Score:2)
Re: (Score:2)
I would like evidence for the claim that you can check the fit of two pieces in constant time. This would seem to be the primary difficulty involved in the contest, if it were straightforwardly clear that you could do it in constant time I think the contest would be over.
Re: (Score:3)
Re:Confused? (Score:4, Insightful)
Re: (Score:2)
For N items, there are N! ways to arrange them. That doesn't make sorting an N! problem.
Stop banging so much, I have a headache.
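To put numbers on the sorting analogy above: a minimal sketch that counts the comparisons a merge sort actually makes and prints them next to N!. The `merge_sort` helper and the choice of N = 10 are illustrative assumptions; this says nothing about how hard the shredding problem itself is, only that a huge space of arrangements does not by itself force a huge amount of work.

```python
import math


def merge_sort(items, counter):
    """Sort items with merge sort, adding each element comparison to counter[0]."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid], counter)
    right = merge_sort(items[mid:], counter)
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        counter[0] += 1  # one comparison between two elements
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


n = 10
data = list(range(n, 0, -1))  # worst-looking input: fully reversed
counter = [0]
result = merge_sort(data, counter)

print("possible arrangements (N!):", math.factorial(n))
print("comparisons actually used: ", counter[0])
```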
Re: (Score:2)
Re: (Score:2)
Would it be possible to get information from the scan beyond that of the text itself? Perhaps each cut has a unique edge, so you could line up the columns of likely candidates. Or maybe the grain in the paper could be revealed, adding another potential edge. Maybe thickness of the paper could be compared or opaqueness. Put all this extra information together and maybe a computer could then work on the construction of the actual text.
Re: (Score:2)
I agree some computer pre-sorting is needed to pare the problem down a bit first.
If this involves multiple pages, perhaps the computer can distinguish which pieces belong to which pages based on the angle of cut versus the font? Or top face versus bottom face. I doubt every piece goes through the shredder exactly the same angle. You'd need pieces large enough to determine the font angle with respect to the edges
Each cutting blade and cross-cut tooth isn't identical. It may be possible to distinguish what
Re: (Score:2)
Looking for similarities around the edges breaks down when most of the edges look very similar (each edge might be a good match for hundreds of other pieces). Asking for human confirmation on tens of thousands of samples requires a lot of patience, and with such small pieces, it may even be difficult for a human to judge.
The last puzzle looks really challenging. It's clear that there are bits missing (even sub-bits of pieces), and some curled or torn edges on some of the shreds.
Re: (Score:2)
In some way it might be analogous to chess in the sense that the immediate effect of individual moves are easy for a computer to consider but i
Re: Confused? (Score:2)
And this seems like a very easy thing to computise. You digitize the shredded documents. You run a program that looks for similarities around the edges. You stick likely candidates together and either ask for human confirmation or run a text recognition algorithm to see if the result makes sense.
This sort of approach has been used before, as far back as 1969, as described in this excerpt from an issue of Popular Mechanics [google.ca]:
The job of reassembling 30,000 pieces of an Egyptian temple at Karnak is being given an assist by an IBM computer... The pieces are coded and photographed, and the photos matched with the help of the computer.
More recently, software developed at Tel Aviv University [bloomberg.com] is being used to piece together thousands of hand-written document fragments.
Anyone else... (Score:3)
...remember the days when /. had actual editors that could catch related or duplicate summaries [slashdot.org] and either tie them together or throw them out? No? Me either.
Occupy Wall Street (Score:2)
I think that site just harvested my email address (Score:2)
Statistics Fail (Score:3)
Re: (Score:2)
So, you don't think there can be a correlation between expertise and odds of winning in a contest that is NOT DECIDED BY RANDOM CHANCE?
Talk about Statistics Fail.
Re: (Score:2)
Or do you expect him to start again from scratch, forgetting everything he learned the first time around? Proven solutions are just too easy, right?
Most Complicated Ever? (Score:2)
solution to shredding doesn't help (Score:2)
It takes less effort, less time, and less technology to burn documents than to shred them. If shredding ceases to be useful, it'll take eight seconds before the newfangled algorithm is useless.
How about the Human Genome Project? (Score:2)
Troops? (Score:2)
>> which are often confiscated by troops in war zones
Translation: confiscated by the IRS when visiting your office.
Problem already solved (Score:4, Informative)
As far as I know, German Fraunhofer Institute has a solution for this kind of problem: http://www.ipk.fraunhofer.de/component/content/category/167-autsicherheitstechnikstasischnipsel [fraunhofer.de] (p.8ff, German language).
Looks like they have few problems assembling torn pages, and geometrically correct results for shredded paper (yet not necessarily correct content).
The real problem (Score:2)
The real problem the government is trying to solve is of course putting together the shredded pieces in such a way that suits them most :)
I guess that will be the next challenge.
A very redundant puzzle. (Score:2)
A German company solved this exact problem years ago, when trying to find a way to reconstruct documents of the former East German Staatssicherheit that had been shredded.
Oh, and they're not dealing with 10000 pieces of various documents, they're dealing with 10000 bags full of pieces of shredded documents. Crowd-source that.
Re: (Score:2)
Re: (Score:3)
Re: (Score:3)
Re: (Score:2, Insightful)
Because spying on *US citizens* is the worst thing they could ever do.
First, obviously we non-US citizens just deserve to be spied on. But that is not the purpose of the $50k challenge.
This is for captured documents after *invading* nations (namely, after killing the government workers and entering their buildings). This is not *defending* the Fath^H^H^H^HHomeland. It is for offensive warfare on foreign soil.
And "saving lives" in the article means obviously saving *US lives* (the lives of us proto-humans dwelli
Re: (Score:2)
Yeah it's funny, but I wonder if it's a bit too subtle and edgy. We want people to get it, you know? Maybe you could slightly dumb it down? For the masses, you understand.
Re: (Score:2)
Re: (Score:2)
If you add them up, it comes out to slightly over 9500 pieces....
Well they didn't say all the pieces would be there. What's the likelihood that the shred bag you grabbed has every single piece? How many are stuck in the cutter or got vacuumed off the floor? (I've never seen a 1x5mm shredder that didn't leave a mess of chaff all around itself.)
Re: (Score:2)
No, I'm fairly sure you'd also have to get it inducted into the contest somehow.