Tiny, Blurry Pictures Find the Limits of Computer Image Recognition (arstechnica.com)
A new PNAS paper takes a look at just how different computer and human visual systems are. Humans can figure out that a mangled word "meant" something recognizable, while a computer can't. Likewise with images: humans can piece together what a blurry image might depict based on small clues in the picture, whereas a computer would be at a loss. The authors of the PNAS paper used a set of blurry, tricky images to pinpoint the differences between computer vision models and the human brain.
They used pictures called "minimal recognizable configurations" (MIRCs) that were either so small or so low-resolution that any further reduction would prevent a person from being able to recognize them. The computer models did a better job after they were trained specifically on the MIRCs, but their accuracy was still low compared to human performance. The reason for this, the authors suggest, is that computers can't pick out the individual components of the image whereas humans can. This kind of interpretation is "beyond the capacities of current neural network models," the authors write.
Not one example? (Score:5, Informative)
Re:Not one example? (Score:4, Funny)
Yeah, the author should ENHANCE this story a bit.
Re:Not one example? (Score:5, Informative)
Here [weizmann.ac.il] is a page with some examples.
Here [pnas.org] is a PDF of the paper, which has more examples.
I don't think it means much. Instead of showing that humans see better than computers, it really just shows that this one researcher is bad at programming computer vision systems. If he took his dataset and made it a Kaggle Competition [kaggle.com], I think someone would design a computer vision system that would do much better than his.
Re: (Score:2)
Perhaps... but it looks like a tough act to follow. By decomposing the picture, he shows how many different characteristics we use in combination on an ordinary image. The sharp drop-off shows how we latch on to one small defining feature and work our way backwards to the answer. Or maybe it's easier to argue in reverse: this here blob could be anything. Add a bow here and it's a ship. Extract the neck up and it's a horse. Show a chin here and it's a suit. Several times it's about seeing the edge and thus then e
Re: (Score:2)
But the kind of "decompose and integrate" we see here would be rather impressive: you don't compare "a horse" to other horses. You actually divide it up and say it has a horse's head, a horse's neck, a horse's legs, a horse's overall shape, and they're all like voting for whether it's a horse or not.
That is actually how convolutional neural networks work. They basically decompose the image into sub-images and then vote. If a CNN were programmed for this task and trained on plenty of data, I think it could do well, and very likely surpass human abilities. Just because these researchers were lousy at programming/training their NN doesn't mean NNs are fundamentally bad at it.
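Something like that decompose-and-vote structure is easy to sketch. A minimal illustration, assuming PyTorch; the layer sizes and the 28x28 grayscale input are made-up placeholders, not anything from the paper:

import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),    # local "part" detectors
            nn.ReLU(),
            nn.MaxPool2d(2),                              # keep the strongest local responses
            nn.Conv2d(8, 16, kernel_size=3, padding=1),   # detectors for combinations of parts
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                      # pool part evidence over the whole image
        )
        self.vote = nn.Linear(16, num_classes)            # weighted "vote" from each part channel

    def forward(self, x):
        h = self.features(x).flatten(1)
        return self.vote(h)

model = TinyCNN()
scores = model(torch.randn(1, 1, 28, 28))  # e.g. one 28x28 grayscale patch
print(scores.argmax(dim=1))                # class with the most "votes"

Each convolutional channel responds to a local sub-pattern, and the final linear layer tallies that evidence into per-class scores, which is roughly the voting picture described above.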
I have done a lot of work in computer vision, and when I first used NNs back in the 1980s I was very unimpressed. They w
Re: (Score:2)
They did consider false positives. They had a catch category (see page 4, last paragraph), and humans did extremely well at it (see page 5, third paragraph).
Re: (Score:3)
I'm a professional neuroscientist who specializes in vision research with a computational bent. They used all the mainstream, state-of-the-art, openly available object recognition algorithms currently in use. Computer vision is a huge market with many applications, from the DoD to self-driving cars to image-based searches. I doubt some 5-figure prize is going to outperform the best algorithms several distinct industries and academia have managed to create while being funded to the tune of over a billion
Re: (Score:1)
Software doesn't need to come close to human capability to be a great driver. It just needs to be better than the 0.1% of the time when h
Re: (Score:2)
Clearly you have never seen my uncle drive. Despite a lifetime of practice, at no point in his life could he have ever bested any of the current self-driving cars, never mind the advances we'll likely see in the next decade.
Keep in mind there are still people on the roads who hold licenses that were granted before driving tests were required.
Re: (Score:2)
Not quite. The paper explains that the computer differs qualitatively from the humans. Specifically, computer vision systems degraded gradually with no hard cutoff, while human vision had hard cutoffs where a small degradation of the image (crop, blur) led to a big reduction in the number of accurate identifications on Mechanical Turk.
This cliff was not present in computer vision models. Not merely shifted to a more detailed level (i.e., the computer failing earlier), but not present at all, indicating that co
Re: (Score:3)
The story was full of those low-resolution samples. They are just 1x1 images. And they're white.
I did the same thing (Score:3)
Context (Score:1)
The explanation is simple: context. We humans carry a great deal of contextual information in our brains, which is very useful for inferring meaning from a wide range of noisy inputs (such as blurry pictures). If we train a computer to identify some aspects of blurry images *within a specific context*, the computer will do a decent job.
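A minimal sketch of that idea, assuming NumPy and SciPy: degrade the training images the same way the test images are degraded, so the learned features live in the same "context". The batch here is random stand-in data:

import numpy as np
from scipy.ndimage import gaussian_filter

def blur_batch(batch, sigma=2.0):
    # Degrade each training image so the model learns features that survive blur.
    return np.stack([gaussian_filter(img, sigma=sigma) for img in batch])

clean = np.random.rand(32, 28, 28)                      # stand-in for a training batch
augmented = np.concatenate([clean, blur_batch(clean)])  # train on both versions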
Wonderful (Score:1)
Link (Score:4, Informative)
Re: (Score:2, Informative)
& paper http://www.pnas.org/content/early/2016/02/09/1513198113.full.pdf
It's the retina, not the brain (Score:2, Insightful)
In the 1960s, Jerry Lettvin did experiments on single optic nerve cells that showed how the retina itself enhances and discovers edges. Human vision is not a "pixel image"; it's based on collecting and amplifying *edges* and differentials. Until the computer processing and the cameras themselves used for computer vision get this built in at the most basic levels of the CCD and immediate processing, a great deal of the most critical data is thrown out before any more sophisticated "computer brain" can
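For what it's worth, the edge-and-differential part is cheap to approximate in software even without special hardware. A minimal sketch, assuming NumPy and SciPy, using Sobel filters as a crude stand-in for the retina's edge enhancement:

import numpy as np
from scipy.ndimage import convolve

def sobel_edges(img):
    # Edge-magnitude map for a 2-D grayscale array: horizontal and
    # vertical gradients combined, amplifying differentials over raw pixels.
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    gx = convolve(img, kx)
    gy = convolve(img, ky)
    return np.hypot(gx, gy)

frame = np.random.rand(64, 64)   # stand-in for a camera frame
edges = sobel_edges(frame)       # what a "retina-first" pipeline would pass along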
Re: (Score:2)
The problem is with the algorithms used, not the capabilities of computers. If done right, for this specific task, a computer would beat a human every time. For example, a computer looking at a 2x2-pixel image of a letter could compare the brightness levels of its four pixels against what it knows every character scaled down to 2x2 looks like under various scaling algorithms, and tell you with very high accuracy what it's looking at. A human, on the other hand, would have no clue.
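A minimal sketch of that lookup, assuming NumPy; the 2x2 brightness templates here are invented for illustration, not measured from any real font:

import numpy as np

templates = {
    "I": np.array([[0.9, 0.9], [0.9, 0.9]]),
    "L": np.array([[0.9, 0.1], [0.9, 0.9]]),
    "T": np.array([[0.9, 0.9], [0.1, 0.9]]),
}

def classify_2x2(patch):
    # Return the character whose downscaled template is nearest in brightness.
    return min(templates, key=lambda c: np.sum((templates[c] - patch) ** 2))

print(classify_2x2(np.array([[0.85, 0.15], [0.9, 0.88]])))  # -> "L"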
For a specific task, sure, you can do all kinds of computer optimizations to make the recognition easier. But the experiment you are describing isn't valid in the general case where you have no idea what the context is. Have a look at the paper. These fragments of pictures could be the letter "Y" rendered in Arial Ultra Bold Italic on a white sign at twilight, an eagle in flight across a blue sky, or an X-ray of an artery. With nobody to say "this is from a font directory", or "this photo was taken outs
PNAS (Score:1)
Uh-huh-huh-huh it says PNAS.
I don't see what the problem is (Score:1)
human recognition is very different (Score:1)
Machine learning takes a million eagle pictures and does something of a curve-fitting to know how far a new poi
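That distance-to-known-examples picture is basically nearest-neighbor classification. A minimal sketch, assuming scikit-learn; the features and labels are random stand-ins:

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

X = np.random.rand(1000, 64)            # stand-in: flattened training images
y = np.random.randint(0, 2, size=1000)  # stand-in labels: eagle / not eagle

clf = KNeighborsClassifier(n_neighbors=5).fit(X, y)
new_point = np.random.rand(1, 64)
print(clf.predict(new_point))           # label of whatever known examples are nearest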
What is too blurry for the computer? (Score:2)
What the heck? I thought all you have to do is zoom and click 'enhance' and a computer can make a reasonably clear picture no matter how blurry the original.
Image board ... (Score:2)
So what I saw.. (Score:1)
This human's neural network detected zoomed in faces with black bars covering their eyes, presumably out of shame, while they struggle with and guzzle a long, pinkish-red, penis-shaped object in their mouth.
PDF (Page 2): http://www.pnas.org/content/ea... [pnas.org]
Excuse me for a moment.