Tiny, Blurry Pictures Find the Limits of Computer Image Recognition

A new PNAS paper takes a look at just how different computer and human visual systems are. Humans can figure out that a mangled word "meant" something recognizable while a computer can't. Likewise with images: humans can piece together what a blurry image might depict based on small clues in the picture, where a computer would be at a loss. The authors of the PNAS paper used a set of blurry, tricky images to pinpoint the differences between computer vision models and the human brain.

They used pictures called "minimal recognizable configurations" (MIRCs) that were either so small or so low-resolution that any further reduction would prevent a person from being able to recognize them. The computer models did a better job after they were trained specifically on the MIRCs, but their accuracy was still low compared to human performance. The reason for this, the authors suggest, is that computers can't pick out the individual components of the image whereas humans can. This kind of interpretation is "beyond the capacities of current neural network models," the authors write.
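
The MIRC definition above (a patch that is still recognizable, while every further reduction of it is not) can be sketched as a search. This is a minimal illustration and not the authors' code: the `Patch` tuple, the five-descendant rule (four corner crops plus one lower-resolution copy), the 0.8 shrink factor, the 0.5 threshold, and the `recognition_rate` callback are all assumptions made for the example.

```python
from typing import Callable, List, Tuple

# Hypothetical patch description: (x, y, width, height, resolution_scale)
Patch = Tuple[int, int, int, int, float]


def descendants(p: Patch, shrink: float = 0.8) -> List[Patch]:
    """Return slightly reduced versions of a patch (assumed rule:
    four corner crops plus one lower-resolution copy)."""
    x, y, w, h, res = p
    nw, nh = int(w * shrink), int(h * shrink)
    dx, dy = w - nw, h - nh
    return [
        (x, y, nw, nh, res),            # keep top-left corner
        (x + dx, y, nw, nh, res),       # keep top-right corner
        (x, y + dy, nw, nh, res),       # keep bottom-left corner
        (x + dx, y + dy, nw, nh, res),  # keep bottom-right corner
        (x, y, w, h, res * shrink),     # same area, lower resolution
    ]


def find_mircs(
    root: Patch,
    recognition_rate: Callable[[Patch], float],
    threshold: float = 0.5,
    min_size: float = 4.0,
) -> List[Patch]:
    """Collect patches that are recognizable but have no recognizable descendant."""
    mircs: List[Patch] = []
    seen = set()
    stack = [root]
    while stack:
        p = stack.pop()
        if p in seen:
            continue
        seen.add(p)
        if recognition_rate(p) < threshold:
            continue  # not recognizable, so it cannot be a MIRC
        # Ignore descendants whose effective size is too small to matter.
        kids = [k for k in descendants(p) if min(k[2], k[3]) * k[4] >= min_size]
        recognizable = [k for k in kids if recognition_rate(k) >= threshold]
        if recognizable:
            stack.extend(recognizable)  # keep reducing
        else:
            mircs.append(p)  # every further reduction fails: minimal
    return mircs
```

In the study the recognition rate comes from human observers; here `recognition_rate` is any function you supply, for example a toy rule based on the effective size of the patch.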

  • Not one example? (Score:5, Informative)

    by nuckfuts ( 690967 ) on Saturday February 20, 2016 @02:23PM (#51549017)
    This story is rather lacking without a single example of what they're talking about.
    • by SeaFox ( 739806 ) on Saturday February 20, 2016 @02:38PM (#51549121)

      Yeah, the author should ENHANCE this story a bit.

      • Re:Not one example? (Score:5, Informative)

        by ShanghaiBill ( 739463 ) on Saturday February 20, 2016 @03:35PM (#51549371)

        Here [] is a page with some examples.

        Here [] is a PDF of the paper, which has more examples.

        I don't think it means much. Instead of showing that humans see better than computers, it really just shows that this one researcher is bad at programming computer vision systems. If he took his dataset, and made it a Kaggle Competition [], I think someone would design a computer vision system that would do much better than his.

        • by Kjella ( 173770 )

          Perhaps... but it looks like a tough act to follow. By decomposing the picture he shows how many different characteristics we use in combination on an ordinary image. The sharp drop-off shows how we latch on to one small defining feature and work our way backwards to the answer. Or maybe it's easier to argue in reverse, this here blob looks like anything. Add a bow here and it's a ship. Extract the neck up, it's a horse. Show a chin here, it's a suit. Several times it's about seeing the edge and thus then e

          • But the kind of "decompose and integrate" we see here would be rather impressive, like you don't compare "a horse" to other horses. You actually divide it up and say it has a horse's head, a horse's neck, a horse's legs, a horse's overall shape, they're all like voting for whether it's a horse or not

            That is actually how convolutional neural networks work. They basically decompose the image into sub-images, and then vote. If a CNN was programmed for this task, and trained on plenty of data, I think it could do well, and very likely surpass human abilities. Just because these researchers were lousy at programming/training their NN, that does not mean NNs are fundamentally bad at it.

            I have done a lot of work in computer vision, and when I first used NNs back in the 1980s I was very unimpressed. They w

        • by AK Marc ( 707885 )
          The study doesn't look at false positives either. Throw in a picture of a brown paper bag, and the human will declare it an eye, while the computer will (correctly) not find it among the stated possibilities. Same with a squirrel called a horse. The human brain is really good at coming up with an answer, even if there isn't enough information for an answer. Computers are more likely to reject all answers, rather than settle on the wrong one because it seems more probable (at least the ones used in these
          • by wanax ( 46819 )

            They did consider false positives. They had a catch category (see page 4, last par) and humans did extremely well at it (see 3rd par, page 5).

        • by wanax ( 46819 )

          I'm a professional neuroscientist who specializes in vision research with a computational bent. They used all the mainstream, state-of-the-art, openly available object recognition algorithms currently in use. Computer vision is a huge market, with many applications, from the DoD to self-driving cars to image-based searches. I doubt some 5-figure prize is going to outperform the best algorithms several distinct industries and academia have managed to create while being funded to the tune of over a billion

          • And this is why self-driving cars today can't compete against distracted teenage drivers. For all the claims of how perfect computer vision and sensor fusion are, humans are far superior.
            • by djinn6 ( 1868030 )
              That's a completely different field though. Humans are so far above the minimum capability for driving, they put on music or radio to fight the boredom. Some go as far as doing makeup, eating, texting or calling other people. If everyone concentrated completely on driving, the accident rate would be practically zero. We might even be able to raise the speed limit by 50%.

              Software doesn't need to come close to human capability to be a great driver. It just needs to be better than the 0.1% of the time when h
        • Not quite. The paper explains that the computer is different quantitatively from the humans. Specifically, computer vision systems degraded slowly with no hard cutoff, while human vision systems had hard cutoffs where a small degradation in the image (crop, blur) led to a big reduction in the number of accurate identifications on the Turk.
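
One way to make the "hard cutoff" idea concrete is to look for the largest single-step drop in recognition rate along a chain of progressively reduced images. This is only a sketch: the function name `largest_drop` is hypothetical, and the numbers in the usage note are made up for illustration, not taken from the paper.

```python
from typing import Sequence, Tuple


def largest_drop(rates: Sequence[float]) -> Tuple[int, float]:
    """Return (step index, size) of the biggest fall between consecutive rates.

    rates[i] is the fraction of correct identifications after i reduction steps.
    """
    if len(rates) < 2:
        raise ValueError("need at least two recognition rates")
    drops = [rates[i] - rates[i + 1] for i in range(len(rates) - 1)]
    step = max(range(len(drops)), key=lambda i: drops[i])
    return step, drops[step]
```

A sequence such as `[0.9, 0.85, 0.2, 0.1]` (illustrative values) has its largest drop at step 1, which is the cliff-like pattern described for humans; a sequence such as `[0.6, 0.5, 0.4, 0.3]` has no step that stands out, which is the gradual pattern described for the models.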

          This cliff was not present in computer vision models. Not just present at a more detailed level (i.e., the computer failed earlier) but not present at all, indicating that co

    • The story was full of those low resolution samples. They are just 1x1 images. And they're white.

    • when I was 13 and I liked it!
  • by Anonymous Coward

    The explanation is simple: context. We humans carry a great deal of contextual information in our brains, which is very useful for inferring knowledge from a wide range of noisy inputs (such as blurry pictures). If we train a computer to identify some aspects of blurry images *within a specific context*, the computer will do a decent job.

  • Wonderful. Now I'll be forced to look at really tiny blurry pics every time I want to do anything on the web.
  • Link (Score:4, Informative)

    by Cow Jones ( 615566 ) on Saturday February 20, 2016 @02:44PM (#51549161)
    This seems to be the project the article is talking about: []
    • Re: (Score:2, Informative)

      by Anonymous Coward

      & paper

  • by Anonymous Coward

    Jerry Lettvin, in the 1960s, did experiments on single optic nerve cells that showed how the retina itself enhances and discovers edges. Human vision is not a "pixel image"; it's based on collecting and amplifying *edges* and differentials. Until the computer processing and the cameras themselves, used for computer vision, get this built in at the most basic levels of the CCD and immediate processing, a great deal of the most critical data is thrown out before any more sophisticated "computer brain" can

  • by Anonymous Coward

    Uh-huh-huh-huh it says PNAS.

  • To recognize a bald eagle, I don't need to be fed a million eagle pictures; show me one flying eagle once, or maybe two or three times, and it's done. The human visual cortex must be using ways of rotating a 3-D object and projecting how the object will appear if viewed in 2-D from different angles; it can also do simple scaling (bigger/smaller) and account for how color changes can affect it [grey-scale/color].

    Machine learning takes a million eagle pictures and does something of a curve-fitting to know how far a new poi
  • What the heck? I thought all you have to do is zoom and click 'enhance' and a computer can make a reasonably clear picture no matter how blurry the original.

  • ... meme []

  • This human's neural network detected zoomed in faces with black bars covering their eyes, presumably out of shame, while they struggle with and guzzle a long, pinkish-red, penis-shaped object in their mouth.

    PDF (Page 2): []

    Excuse me for a moment.
