Tiny, Blurry Pictures Find the Limits of Computer Image Recognition (arstechnica.com) 50

Posted by BeauHD on Saturday February 20, 2016 @02:19PM from the prescription-lenses dept.

A new PNAS paper takes a look at just how different computer and human visual systems are. Humans can figure out that a mangled word "meant" something recognizable while a computer can't. Likewise with images: humans can piece together what a blurry image might depict based on small clues in the picture, where a computer would be at a loss. The authors of the PNAS paper used a set of blurry, tricky images to pinpoint the differences between computer vision models and the human brain.

They used pictures called "minimal recognizable configurations" (MIRCs) that were either so small or so low-resolution that any further reduction would prevent a person from being able to recognize them. The computer models did a better job after they were trained specifically on the MIRCs, but their accuracy was still low compared to human performance. The reason for this, the authors suggest, is that computers can't pick out the individual components of the image whereas humans can. This kind of interpretation is "beyond the capacities of current neural network models," the authors write.

Tiny, Blurry Pictures Find the Limits of Computer Image Recognition

This discussion has been archived. No new comments can be posted.

Load All Comments

Search 50 Comments Log In/Create an Account

Comments Filter:

Not one example? (Score:5, Informative)

by nuckfuts ( 690967 ) writes: on Saturday February 20, 2016 @02:23PM (#51549017)

This story is rather lacking without a single example of what they're talking about.

- Re:Not one example? (Score:4, Funny)
  
  by SeaFox ( 739806 ) writes: on Saturday February 20, 2016 @02:38PM (#51549121)
  
  Yeah, the author should ENHANCE this story a bit.
  
  - Re:Not one example? (Score:5, Informative)
    
    by ShanghaiBill ( 739463 ) writes: on Saturday February 20, 2016 @03:35PM (#51549371)
    
    Here [weizmann.ac.il] is a page with some examples.
    Here [pnas.org] is a PDF of the paper, which has more examples.
    I don't think it means much. Instead of showing that humans see better than computers, it really just shows that this one researcher is bad at programming computer vision systems. If he took his dataset, and made it a Kaggle Competition [kaggle.com], I think someone would design a computer vision system that would do much better than his.
    
    - Re: (Score:2)
      
      by Kjella ( 173770 ) writes:
      
      Perhaps... but it looks like a tough act to follow. By decomposing the picture he shows how many different characteristics we use in combination on an ordinary image. The sharp drop-off shows how we latch on to one small defining feature and work our way backwards to the answer. Or maybe it's easier to argue in reverse, this here blob looks like anything. Add a bow here and it's a ship. Extract the neck up, it's a horse. Show a chin here, it's a suit. Several times it's about seeing the edge and thus then e
      - Re: (Score:2)
        
        by ShanghaiBill ( 739463 ) writes:
        
        But the kind of "decompose and integrate" we see here would be rather impressive, like you don't compare "a horse" to other horses. You actually divide it up and say it has a horse's head, a horse's neck, a horse's legs, a horse's overall shape, they're all like voting for whether it's a horse or not
        That is actually how convolutional neural networks work. They basically decompose the image into sub-images, and then vote. If a CNN was programmed for this task, and trained on plenty of data, I think it could do well, and very likely surpass human abilities. Just because these researchers were lousy at programming/training their NN, that does not mean NNs are fundamentally bad at it.
        I have done a lot of work in computer vision, and when I first used NNs back in the 1980s I was very unimpressed. They w
    - Re: (Score:2)
      
      by AK Marc ( 707885 ) writes:
      
      The study doesn't look at false positives either. Throw in a picture of a brown paper bag, and the human will declare it an eye, while the computer will (correctly) not find it among the stated possibilities. Same with a squirrel called a horse. The human brain is really good at coming up with an answer, even if there isn't enough information for an answer. Computers are more likely to reject all answers, rather than settle on the wrong one because it seems more probable (at least the ones used in these
      - Re: (Score:2)
        
        by wanax ( 46819 ) writes:
        
        They did consider false positives.. They had a catch category (see page 4, last par) and human's did extremely well at it (see 3rd par, page 5).
    - Re: (Score:3)
      
      by wanax ( 46819 ) writes:
      
      I'm a professional neuroscientist that specializes in vision research with a computational bent. They used all the main stream, state of the art, openly available object recognition algorithms currently in use. Computer vision is a huge market, with many applications, from the DoD to self-driving cars to image-based searches. I doubt some 5-figure prize is going to out perform the best algorithms several distinct industries and academia have managed to create while being funded to the tune of over a billion
      - Re: (Score:2)
        
        by burtosis ( 1124179 ) writes:
        
        And this is why self driving cars today can't compete against distracted teenage drivers. For all the claims of how perfect computer vision and sensor fusion is, humans are far superior.
        
        Re: (Score:1)
        
        by djinn6 ( 1868030 ) writes:
        
        That's a completely different field though. Humans are so far above the minimum capability for driving, they put on music or radio to fight the boredom. Some go as far as doing makeup, eating, texting or calling other people. If everyone concentrated completely on driving, the accident rate would be practically zero. We might even be able to raise the speed limit by 50%.
        
        Software doesn't need to come close to human capability to be a great driver. It just need to be better than the 0.1% of the time when h
        
        Re: (Score:2)
        
        by plover ( 150551 ) writes:
        
        Clearly you have never seen my uncle drive. Despite a lifetime of practice, at no point in his life could he have ever bested any of the current self-driving cars, never mind the advances we'll likely see in the next decade.
        Keep in mind there are still people on the roads who hold licenses that were granted before driving tests were required.
    - Re: (Score:2)
      
      by The Raven ( 30575 ) writes:
      
      Not quite. The paper explains that the computer is different quantitatively than the humans. Specifically, computer vision systems degraded slowly with no hard cutoff, while human vision systems had hard cutoffs where a small degredation in the image (crop, blur) led to a big reduction in the number of accurate identifications on the Turk.
      This cliff was not present in computer vision models. Not just present at a more detailed level (ie, the computer failed earlier) but not present at all indicating that co
- Re: (Score:3)
  
  by Dan East ( 318230 ) writes:
  
  The story was full of those low resolution samples. They are just 1x1 images. And they're white.
- I did the same thing (Score:3)
  
  by rsilvergun ( 571051 ) writes:
  
  when I was 13 and I liked it!
Context (Score:1)

by Anonymous Coward writes:

The explanation is simple: context. We humans have many context information on our brains, very useful to infer knowledge from a wide range of noisy inputs (such as blurry pictures). If we train a computer to identify some aspects of blurry images *within specific context*, the computer will do a decent job.
Wonderful (Score:1)

by Andurian ( 1162629 ) writes:

Wonderful. Now I'll be forced to look at really time blurry pics every time I want to do anything on the web.
Link (Score:4, Informative)

by Cow Jones ( 615566 ) writes: on Saturday February 20, 2016 @02:44PM (#51549161)

This seems to be the project the article is talking about: http://www.wisdom.weizmann.ac.il/~dannyh/Mircs/mircs.html [weizmann.ac.il]

- Re: (Score:2, Informative)
  
  by Anonymous Coward writes:
  
  & paper http://www.pnas.org/content/early/2016/02/09/1513198113.full.pdf
It's the retina, not the brain (Score:2, Insightful)

by Anonymous Coward writes:

Jerry Lettvin, in the 1960's did experiments on single optical nerve cells that showed how the retina itself enhances and discovers edges. Human vision is not a "pixel image", it's based on collecting and amplifying *edges* and differentials. Until the computer processing and the cameras, themselves, used for computer vision get this built in at the most basic levels of the CCD and immediate processing, a great deal of the most critical data is thrown out before any more sophisticated ""computer brain" can
- Re: (Score:2)
  
  by plover ( 150551 ) writes:
  
  The problem is with the algorithms used, not the capabilities of computers. If done right, for this specific task at hand, a computer would beat a human every time. For example, a computer looking at a 2x2 pixel square image of a letter could compare it against what it knows every character scaled down to 2x2 looks like under various scaling algorithms, the brightness levels of the four available pixels, and tell you with very high accuracy what it's looking at. A human, on the other hand, would have no clue.
  For a specific task, sure, you can do all kinds of computer optimizations to make the recognition easier. But the experiment you are describing isn't valid in the general case where you have no idea what the context is. Have a look at the paper. These fragments of pictures could be the letter "Y" rendered in Arial Ultra Bold Italic on a white sign at twilight, an eagle in flight across a blue sky, or an X-ray of an artery. With nobody to say "this is from a font directory", or "this photo was taken outs
PNAS (Score:1)

by Anonymous Coward writes:

Uh-huh-huh-huh it says PNAS.
I don't see what the problem is (Score:1)

by Stephen Gilbert ( 554986 ) writes:

Just zoom and enhance.
human recognition is very different (Score:1)

by yes-but-no ( 4133651 ) writes:

To recognize a bald eagle, I don't need to be fed with a million eagle pictures; show me one flying eagle once or may be two or three times, it's done. Human visual cortex must be using ways of rotating a 3-D object and projecting how the object will appear if viewed in 2-D from different angles; also it can do simple scaling (bigger/smaller); and how color changes can affect [grey-scale/color].

Machine learning takes a million eagle pictures and does something of a curve-fitting to know how far a new poi
What is too blurry for the computer? (Score:2)

by magarity ( 164372 ) writes:

What the heck? I thought all you have to do is zoom and click 'enhance' and a computer can make a reasonably clear picture no matter how blurry the original.
Image board ... (Score:2)

by PPH ( 736903 ) writes:

... meme [i.lvme.me]
So what I saw.. (Score:1)

by Smirker ( 695167 ) writes:

This human's neural network detected zoomed in faces with black bars covering their eyes, presumably out of shame, while they struggle with and guzzle a long, pinkish-red, penis-shaped object in their mouth.
PDF (Page 2): http://www.pnas.org/content/ea... [pnas.org]
Excuse me for a moment.

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

Tiny, Blurry Pictures Find the Limits of Computer Image Recognition (arstechnica.com) 50

Tiny, Blurry Pictures Find the Limits of Computer Image Recognition More Login

Tiny, Blurry Pictures Find the Limits of Computer Image Recognition

Not one example? (Score:5, Informative)

Re:Not one example? (Score:4, Funny)

Re:Not one example? (Score:5, Informative)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:3)

Re: (Score:2)

Re: (Score:1)

Re: (Score:2)

Re: (Score:2)

Re: (Score:3)

I did the same thing (Score:3)

Context (Score:1)

Wonderful (Score:1)

Link (Score:4, Informative)

Re: (Score:2, Informative)

It's the retina, not the brain (Score:2, Insightful)

Re: (Score:2)

PNAS (Score:1)

I don't see what the problem is (Score:1)

human recognition is very different (Score:1)

What is too blurry for the computer? (Score:2)

Image board ... (Score:2)

So what I saw.. (Score:1)

Related Links Top of the: day, week, month.

Slashdot Top Deals

Slashdot