No Bones About It: People Recognize Objects By Visualizing Their 'Skeletons' (scientificamerican.com)
An anonymous reader shares a report from Scientific American: Humans effortlessly know that a tree is a tree and a dog is a dog no matter the size, color or angle at which they're viewed. In fact, identifying such visual elements is one of the earliest tasks children learn. But researchers have struggled to determine how the brain does this simple evaluation. As deep-learning systems have come to master this ability, scientists have started to ask whether computers analyze data -- and particularly images -- similarly to the human brain. "The way that the human mind, the human visual system, understands shape is a mystery that has baffled people for many generations, partly because it is so intuitive and yet it's very difficult to program," says Jacob Feldman, a psychology professor at Rutgers University.
A paper published in Scientific Reports in June comparing various object recognition models came to the conclusion that people do not evaluate an object like a computer processing pixels, but based on an imagined internal skeleton. In the study, researchers from Emory University, led by associate professor of psychology Stella Lourenco, wanted to know if people judged object similarity based on the objects' skeletons -- an invisible axis below the surface that runs through the middle of the object's shape. The scientists generated 150 unique three-dimensional shapes built around 30 different skeletons and asked participants to determine whether or not two of the objects were the same. Sure enough, the more similar the skeletons were, the more likely participants were to label the objects as the same. The researchers also compared how well other models, such as neural networks (artificial intelligence-based systems) and pixel-based evaluations of the objects, predicted people's decisions. While the other models matched performance on the task relatively well, the skeletal model always won. On the Rumsfeld Epistemological Scale, AI programmers trying to duplicate the functions of the human mind are still dealing with some high-level known-unknowns, and maybe even a few unknown-unknowns.
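The study's comparison can be caricatured in a few lines of code. Below is a minimal sketch -- not the paper's actual method, and every name and shape is invented for illustration: each object's skeleton is reduced to a 2-D point set, and "skeletal similarity" is taken as a symmetric mean nearest-neighbor distance, so objects built on similar axes score closer than objects built on different ones.

```python
import numpy as np

def skeleton_distance(skel_a, skel_b):
    """Symmetric mean nearest-neighbor distance between two point sets (Nx2)."""
    d = np.linalg.norm(skel_a[:, None, :] - skel_b[None, :, :], axis=-1)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())

# Three toy "skeletons", each a polyline sampled as 50 points.
t = np.linspace(0, 1, 50)
skel_1 = np.stack([t, np.zeros_like(t)], axis=1)      # horizontal axis
skel_2 = np.stack([t, 0.05 * np.sin(3 * t)], axis=1)  # slightly bent axis
skel_3 = np.stack([np.zeros_like(t), t], axis=1)      # vertical axis

# Similar skeletons -> small distance; dissimilar -> large.
print(skeleton_distance(skel_1, skel_2) < skeleton_distance(skel_1, skel_3))  # True
```

Under a model like this, "same object" judgments fall out of a threshold on the distance; the paper's point is that a measure of this flavor predicted human judgments better than pixel- or network-based measures did.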
prove it (Score:3)
Now they only need to prove that people use AI models that are designed to "function like the human mind".
Re: (Score:1)
This is News? (Score:5, Interesting)
Re: (Score:3)
No one is claiming that people consciously visualise a skeleton when trying to work out what an object in the distance is, just like no one consciously thinks about what the colour 'red' is when they view something red.
Re: (Score:2)
>how do I know my blue is the same as someone else's blue?
My OCD has been doing this from a young age as well. I eventually did find an article about it, but I can't find it right now.
Stick figure (Score:5, Insightful)
If you replace "skeleton" with "stick figure" in the summary it might seem more familiar.
You easily recognize a stick figure of a human vs a dog vs a tree - even at the age of 2 or 3. Whether the arms are up or down, whatever - the position can vary to any somewhat normal position for the person or dog or tree to be in, and you immediately recognize it as a person or whatever it is.
A stick figure is, of course, the same thing visually as a simplistic skeleton.
Re: (Score:2)
Re:Stick figure (Score:5, Interesting)
Yes, "Skeleton" is an odd term here.
The "hardware level" of visual processing in most animals picks out movement and straight-ish lines very early in the process. It should be no surprise, given this neurology, that object recognition builds on that foundation: the movement of straight lines across our visual field.
Straight lines that are connected and move together, e.g. "stick figures" but also "a bunch of trees" or "a box", are basic building blocks of understanding what we see.
Fun fact: a frog can only see a small flying object if it's about the right size and moving at about the right speed to be food. Everything else gets filtered out, so the frog can simply try to eat anything flying that it sees, no further judgement required.
Re: (Score:2)
What about this dog? (Score:2)
This one doesn't look like it has any skellington to speak of [fandom.com].
Re: (Score:2)
I see seven recognizable "bones" in the illustrations of Zero on that page. One runs from the nose to the back of the head, another up the jaw, and another on each ear. There are also three on the sheet that makes up Zero's body: one down the back and one to each front corner.
Re: (Score:2)
It isn't really "bones" they look at, but imaginary lines drawn down the center of the shape. At least that's what I imagine; I didn't bother to look. It seems doubtful that is the whole story of how humans recognize objects. It is probably a combination of outline + "bones".
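One standard formalization of those "imaginary lines down the center" is the medial axis: the ridge of the shape's distance transform. A small sketch with SciPy -- the rectangle and grid sizes here are arbitrary, chosen only so the midline is easy to see:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

# A solid 7x21 rectangle on a padded 9x23 grid.
shape = np.zeros((9, 23), dtype=bool)
shape[1:8, 1:22] = True

# Distance from each interior pixel to the nearest background pixel.
dist = distance_transform_edt(shape)

# The ridge of the distance transform traces the central axis; for this
# rectangle the column-wise maxima sit on the middle row (index 4).
ridge_rows = dist.argmax(axis=0)
print(ridge_rows[5:18])  # each entry is 4, the rectangle's midline
```

For blob-like shapes the ridge bends and branches, which is exactly the "skeleton" the article is talking about.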
No shit (Score:2)
came to the conclusion that people do not evaluate an object like a computer processing pixels
If you do work or research based on the assumption that they do, you should probably stop. You are wasting someone's money by either being amateurish and/or ignorant, or a shill.
While sometimes useful and effective, I doubt that the people inventing the clever image recognition algorithms believe that they work the same way humans do.
Re: No shit (Score:2)
Re: (Score:2)
You can't make neural networks like brains, because they are just computer programs which some idiot dubbed "neural networks" to fool people a long time ago. That is like saying you want to make the sky more like a cat. Artificial neural nets are nothing like biological neural networks.
Phrasing (Score:1)
Re: (Score:3)
Wireframes model the surface shape of the object. That's not what this article says our brains do.
Re: (Score:2)
Whoosh. I get that. But you don't actually have x-ray vision; your brain is constructing the underlying skeletal model out of visual (light) information reflected off the face you're seeing.
This was a comment about how my wife *doesn't* have the mechanism mentioned in the article, and it causes us to disagree on the similarity of faces. She sees surface details only, and not well-organized. Her brain doesn't construct the landmarks and skeleton for her.
Re: (Score:2)
Wow, just gonna go to work now. I thought GP was a reply to my comment below this one.
Guess I have no spatial relations either. My apologies, GP.
Prosopagnosia (Score:4, Interesting)
My wife is face blind, and this leads to an interesting situation where I'll say two people look similar (because of underlying facial structure), and she'll say they look completely different.
This story makes sense in that light, because other tasks that require assessing the arrangement or orientation of things are not her strong point either.
Re: (Score:2)
Human brains have a fairly large area dedicated to recognizing faces. Some people have brain damage in that area, and can still see properly, and describe all the facial details, but lose the ability to recognize people, even close family members.
That's probably not strictly related to this article.
Re: (Score:2)
Not particularly, but it does speak to the fact that we humans have many large dedicated neural nets, each devoted to solving some specific type of recognition problem.
Re: (Score:3)
Interesting. That is apparently not uncommon. Is it only for human faces, or also for animals? And also for objects other than faces?
I wondered about this a few days ago and perhaps I can ask you now: Does she have difficulty identifying objects if they are upside down?
If you train a naive image recognition network with cats in only the normal direction, it will probably have no clue what an upside down picture of a cat depicts. Does that resonate with how your wife experiences her environment?
Apologies for
That is interesting (Score:4, Interesting)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
It is far more than wire-frame/skeletal recognition... and it's not. As mentioned above, we humans have many large dedicated neural nets each devoted to solving some specific type of recognition or cognition problem.
In fact, this "stick figure" approach to recognition might go a long way towards explaining the "ness" problem. What is chair-ness or table-ness or cat-ness? Why is it that you can look at hundreds of different types of tables or chairs or cats and instantly classify them as "table" or "chair" o
Re: (Score:2)
Re: (Score:1)
Whatever allows us to identify objects must exist separately in both hemispheres of the brain and in specific places.
I don't think that's a valid conclusion. Both hemispheres get images from both eyes: the left hemisphere gets the right half of your view (from both eyes), the right hemisphere gets the left half. Your experience might instead suggest that object identification takes place before the signal from each eye reaches the brain, at least in part.
Re: (Score:2)
Re: (Score:1)
Obvious (Score:2)
"people do not evaluate an object like a computer processing pixels, but based on an imagined internal skeleton"
Who would've thought? I think it's been clear for ages that the mind doesn't map from "pixels" to concepts directly (even apart from the fact that the brain/retina combo is totally not pixel based, save for the most physical aspect of light capture, which is the field of photosensitive cells spread all over the retina, kind of). It's been known that there's a deep, graph-like processing path (not mer
Re: (Score:2)
I think it's been clear for ages that the mind doesn't map from "pixels" to concepts directly
And it's also wrong to claim that an image recognition system on a computer does that.
Ya, but ... (Score:2)
What do people visualize to recognize skeletons? Checkmate.
really? (Score:2)
Plato and Aristotle already knew this (Score:2)
Bullshit (Score:2)
Cartoon dogs can be a looooong way from any real "skeleton" regardless of how we define that.
I'm bored with AI research that claims that this week's pattern matching is revelatory. I'm just plain old bored with AI, in fact.
Armature object (Score:2)
Cartoon dogs can be a looooong way from any real "skeleton" regardless of how we define that.
How do you think the parts of the CGI dog are animated, other than by mapping the vertices of its surface mesh to one or more nearby bones on an armature?
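For anyone unfamiliar with armatures: the standard bone-to-vertex mapping is linear blend skinning, where a vertex's deformed position is the weighted sum of each influencing bone's transform applied to the rest pose. A minimal 2-D sketch (the bone setup and weights are invented for illustration):

```python
import numpy as np

def rot2d(theta):
    """2-D rotation matrix for angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def skin(vertex, bones, weights):
    """Linear blend skinning: each bone is a (rotation, translation) pair;
    the result is the weight-blended sum of each bone's transform applied
    to the rest-pose vertex."""
    out = np.zeros(2)
    for (R, t), w in zip(bones, weights):
        out += w * (R @ vertex + t)
    return out

# One vertex at the end of a joint, influenced equally by two bones.
v = np.array([1.0, 0.0])
bones = [(rot2d(0.0), np.zeros(2)),         # upper bone stays put
         (rot2d(np.pi / 2), np.zeros(2))]   # lower bone rotates 90 degrees
print(skin(v, bones, [0.5, 0.5]))           # blended halfway: ~[0.5, 0.5]
```

Which is to say: even for a rubbery cartoon dog, the animator's rig really is a "skeleton", just one with loose skin weights.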
Re: (Score:2)
Internal skeleton? (Score:3)
Interesting (Score:1)
When they say "people" (Score:2)
"... people do not evaluate an object like a computer processing pixels, but based on an imagined internal skeleton." When they say "people", what they really mean is the 42 people who participated in the study. Maybe, in this case, it's true of almost all people, but sometimes it isn't. I specifically remember another psychology study which concluded that people process time with past on the left and future on the right [uni-tuebingen.de], based on experiments with 30 to 118 participants -- all college students. Such homogen
Neural networks use edge detection! (Score:2)
The researchers also compared how well other models, such as neural networks (artificial intelligence-based systems) and pixel-based evaluations of the objects, predicted people's decisions. While the other models matched performance on the task relatively well, the skeletal model always won.
Wait, what? Are they saying their neural networks don't recognize images by extracting skeletal features? How do they know?
Some of the oldest machine learning (not deep learning) algorithms use things like the structure tensor or gradient approximation methods to skeletonize images before classifying them. For deep networks it is not as easy to figure out what features are being extracted at each convolutional layer, but some form of edge detection is always used somewhere.
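The structure tensor mentioned above is easy to show concretely: accumulate products of Sobel gradient estimates, and the tensor's entries reveal the dominant edge orientation. A small sketch (the image is a synthetic vertical edge, chosen so the result is unambiguous):

```python
import numpy as np
from scipy.ndimage import sobel

# A vertical edge: left half dark, right half bright.
img = np.zeros((16, 16))
img[:, 8:] = 1.0

# Gradient estimates via Sobel filters, then the (summed) structure
# tensor entries J = [[Jxx, Jxy], [Jxy, Jyy]].
gx = sobel(img, axis=1)  # horizontal derivative
gy = sobel(img, axis=0)  # vertical derivative
Jxx, Jxy, Jyy = (gx * gx).sum(), (gx * gy).sum(), (gy * gy).sum()

# For a pure vertical edge, all gradient energy lies along x:
print(Jxx > 0 and Jyy == 0 and Jxy == 0)  # True
```

The eigenvectors of that tensor give the local edge direction, which is the first step of the classical skeletonization/orientation pipelines the comment refers to.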
A set of humans that happen to per
Hat-tip: Stephen Green at Instapundit (Score:2)
https://pjmedia.com/instapundi... [pjmedia.com]