The Status Quo Of Computer Vision

prostoalex writes "The Industrial Physicist sums up the recent advances and developments in the world of computer vision. They mention an application for human-computer interfacing using a Webcam, the Philips Research Lab Seeing with Sound product, which augments vision for the visually impaired, as well as various frontal face detection applications."

  • by Metallic Matty ( 579124 ) on Sunday March 23, 2003 @06:14PM (#5579813)
    I'd have to say computers generally have very good vision - I am yet to see one wearing a pair of glasses.
    • My computer must need glasses.
      It cannot tell which window I am looking at when I type. I have to tell it which window to use by clicking with the mouse.

      With better computer vision, the computer would know which window I was looking at when I started typing. With highly acute vision, it could even know which button I was looking at when I hit the return key.

      I will know that computer vision is a reality when I no longer need to use a mouse. Likewise, I will know that speech recognition is a reality

      • That's actually a pretty good idea for using computer vision. The one problem I've always had with a GUI is the overhead in switching windows and such.

        I think having your current window in front, with the others as small pictures at the top that shrink the longer they go unused (down to a limit, of course), would be great: you could just look at a window and it would pop up.

    • Of course not, they haven't got ears.
  • by Blaine Hilton ( 626259 ) on Sunday March 23, 2003 @06:15PM (#5579821) Homepage
    Another big advance, I think, will come with the prodding of DARPA's $1 million contest. A lot of discussion has been going on on their message boards about computer vision systems.
    • by Animats ( 122034 ) on Sunday March 23, 2003 @06:27PM (#5579886) Homepage
      That's not doing much for computer vision. Most of the action in computer vision right now involves "homeland security" applications, real or imagined. The killer app for computer vision seems to be Big Brother.
      • Most of my research involves adapting `what's in' in facial recognition and applying it to disease diagnosis and segmentation of heart MRIs. These days people are developing statistical shape and appearance models of faces via PCA for matching and segmentation. It gets a bit scary what they can do in facial recognition if you start reading up, but it's also slightly disconcerting how much money is being tossed into medical imaging.

  • by Kozz ( 7764 ) on Sunday March 23, 2003 @06:17PM (#5579835)
    At first, the phrase "frontal face detection applications" sounded rather cumbersome. But then a shorter phrase of "facial detection applications" might have been grossly misunderstood. ;)
  • by Powercntrl ( 458442 ) on Sunday March 23, 2003 @06:25PM (#5579882)
    It looks like you're trying to masturbate! Would you like me to load:

    * Your porn collection
    * An AIM conversation with a guy pretending to be female
    * Recommended self pleasuring techniques database
    * Featured lubricant merchants
  • by Neuronerd ( 594981 ) <konrad @ k o e r d i ng.de> on Sunday March 23, 2003 @06:30PM (#5579902) Homepage

    While it is clearly true that only the recent advances in computer speed allowed the computer vision systems we are seeing now, there are also other important influences.

    In particular, there are also really better algorithms than there were a number of years ago. Many if not most successful computer vision systems use statistical methods. In the case of faces, for example, they often build a probabilistic model of what a face is. Such models know that a face should usually, but not always, have eyes, that some people have beards, and so on. And these models train themselves up from a database of stimuli, for example real faces.

    A number of recent advances make such probabilistic models fast enough to work well on real-world data. In a sense, the problem of computer vision is very similar to the problem of understanding a voice or extracting the highest possible bitrate from a stream of data transmitted via a telephone line. And indeed the resulting algorithms are often surprisingly similar.
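
The idea in the comment above (a model that learns from examples that a face usually, but not always, shows eyes) can be illustrated with a very small sketch. The commenter does not name a specific model, so this uses a naive Bayes classifier over yes/no features as a stand-in. The feature names (`eyes`, `beard`, `mouth`), the labels, and the toy training set are all made up for illustration and are not taken from any real system.

```python
import math

def train(examples):
    """Fit a naive Bayes model from (features, label) pairs.

    features is a dict mapping a feature name to True/False;
    every example is assumed to carry the same feature names.
    """
    counts = {}   # label -> number of examples
    on = {}       # label -> feature name -> times the feature was present
    for features, label in examples:
        counts[label] = counts.get(label, 0) + 1
        per_label = on.setdefault(label, {})
        for name, present in features.items():
            per_label[name] = per_label.get(name, 0) + (1 if present else 0)
    total = sum(counts.values())
    model = {}
    for label, n in counts.items():
        prior = n / total
        # Laplace smoothing keeps every probability strictly between 0 and 1,
        # so "usually has eyes" never hardens into "always has eyes".
        probs = {name: (k + 1) / (n + 2) for name, k in on[label].items()}
        model[label] = (prior, probs)
    return model

def log_score(model, features, label):
    """Log of prior times the per-feature likelihoods for one label."""
    prior, probs = model[label]
    score = math.log(prior)
    for name, present in features.items():
        p = probs[name]
        score += math.log(p if present else 1.0 - p)
    return score

def classify(model, features):
    """Return the label with the highest log score."""
    return max(model, key=lambda label: log_score(model, features, label))

# Hypothetical toy "database of stimuli".
training = [
    ({"eyes": True,  "beard": False, "mouth": True},  "face"),
    ({"eyes": True,  "beard": True,  "mouth": True},  "face"),
    ({"eyes": False, "beard": False, "mouth": True},  "face"),      # e.g. sunglasses
    ({"eyes": True,  "beard": False, "mouth": True},  "face"),
    ({"eyes": False, "beard": False, "mouth": False}, "not_face"),
    ({"eyes": True,  "beard": False, "mouth": False}, "not_face"),  # two dark blobs
    ({"eyes": False, "beard": False, "mouth": False}, "not_face"),
    ({"eyes": False, "beard": True,  "mouth": False}, "not_face"),  # beard-like texture
]
model = train(training)

hidden_eyes = {"eyes": False, "beard": True, "mouth": True}
blobs_only = {"eyes": True, "beard": False, "mouth": False}
print(classify(model, hidden_eyes), classify(model, blobs_only))
```

Real systems work on pixel-level statistics rather than three hand-picked flags, but the structure is the same: probabilities estimated from a training set, then combined to score a new input.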

  • Daredevil actually premiered in The Netherlands this week, so I was kind of expecting them to suddenly come up with some kind of "Seeing with Sound"-tool. These guys are so predictable.
  • by gmuslera ( 3436 ) on Sunday March 23, 2003 @06:39PM (#5579950) Homepage Journal
    It has worked well for years. Computers running Windows often show me their blue face when I show them mine, even if their owners say that Windows is very stable and they never saw a blue screen before. Surely Windows can recognize people and does this specifically to me.
    • Linux, too! (Score:2, Funny)

      by spanky1 ( 635767 )
      When I panic, the Linux kernel detects that and also panics. Wow, computers have had facial recognition for a long time!
  • by CrazyJim0 ( 324487 ) on Sunday March 23, 2003 @06:39PM (#5579952)
    All you need is for it to understand English, and imagine in a 3d space.

    Type a sentence like Zork, and it makes the scene for you.

    Give it a book, and it could turn it into a movie for you.

    Vision recognition has a great many uses already, but when vision recognition matures, you'll be able to take a scene and reduce it into 3d reality space. You take the 3d reality space, and give the computer some goals, and it's trying to accomplish something in the world.

    Thing is, it won't stop at plain vision, you'll get infrared, sonar, ultraviolet, radar, all that crap to get the best 3d image possible.

    So since vision is progressing, the gap towards AI is shrinking. Also as video games become more realistic, the AI gap is shrinking. I could be bold and say 15 years from now we should have basic AI.
    • We already have basic AI - we have UAVs that can plan their own route and complete their mission completely autonomously. Most commercial robots that are being released have enough AI to determine where they are, what their goal is, and how to perform that goal. In my opinion, that is definitely "basic" AI.
    • 15 years? HA! Try 50, at best. And that is with MAJOR corporations trying like hell to put the product to market. Have you ever talked to an ALICE bot? That is supposed to be a fairly advanced bot/AI program, and you can stump it like mad, without even trying! Now try and feed it the text of a book and have it understand what the hell is going on?! Reading a classic novel, where inferences have to be made and many MAJOR actions are implied, but not directly talked about, the computer would have to have an AI
    • All you need is for it to understand English, and imagine in a 3d space. Type a sentence like Zork, and it makes the scene for you.

      The mapping is probably not direct. It's likely that there is no straight conversion of words to pictures even in our heads. We like to visualize, as it simplifies understanding and memorization. However, there are a lot of words (mostly concepts) that don't evoke any pictures when pronounced. We never encountered them in the physical world to give us a visual representation. Yet, we

    • Sorry, I have to disagree. I think you have some very good ideas about what could possibly be automated, but the devil is in the details. One of the biggest pitfalls is assuming that a computer would somehow be 'smarter' than a human, just because it can perform fast calculations. For instance, you claim that an AI could turn a book into a movie. Great idea, but I know I can't do that myself, and I like to think I'm pretty sharp. I'm also pretty sure that most of the people developing AI can't either, or th
  • Related article (Score:5, Informative)

    by Toasty16 ( 586358 ) on Sunday March 23, 2003 @06:39PM (#5579956) Homepage
    Wired had an article late last year entitled Vision Quest [wired.com] about a similar topic. The doctor couldn't perform most of his techniques in the U.S. due to ethical laws, giving the article a real "Frankenstein" flair. Good read.
  • by blitzoid ( 618964 ) on Sunday March 23, 2003 @06:47PM (#5579998) Homepage
    I live for the day that we can have a computer installed in our bodies and have a HUD in our eyes. How cool would it be to browse the net or play games while doing other, more boring things in the outside world?

    We could be talking about a revolution in isolationism here! I can't wait!
  • All this fantastic technology - and yet here I am using mozilla with linux all fully apt-get upgraded to testing, everything uber-optimised and configured, all is good, smooth and aliased...

    BUT my italic fonts when on slashdot still look fucking shit by default!

    And don't try and tell me how to set my desktop up properly - check me out:
    I AM THE 'KIN DESKTOP (all your desktops are belong to me now) :^)
  • Nouse-ing (Score:5, Informative)

    by cybermace5 ( 446439 ) <g.ryan@macetech.com> on Sunday March 23, 2003 @07:56PM (#5580269) Homepage Journal
    I downloaded the Nouse, and the Bubble Frenzy demo. My webcam was already on top of my monitor, so all I had to do was run the program.

    All you do is calibrate it by centering your nose in the image and clicking. The program draws a green box around your nose and follows it...it's pretty hilarious. Good oblique lighting seems to work best; too dark or too light and the box will want to follow your chin or ear. Overall, pretty reliable and lots of fun.

    I loaded up the Bubble Frenzy game, which at first looks like a DOS-era Frozen Bubble. The Nouse worked fine...added a bit of challenge, levels I'd laugh at in Frozen Bubble were suddenly difficult. It's hard to keep track of the pointer when your head is moving. It was pretty fun; someone walked in and saw me playing, apparently just hitting the space bar while tilting my head from side to side.

    I had a neck injury a while back in a car accident though, and all this motion started to bring on a little soreness. I had to quit after about 20 minutes of Nouse-ing, about the same effect as an hour of driving.
    • by gad_zuki! ( 70830 )
      Wow, Nouse is the coolest thing I've seen on slashdot in ages. Mod parent up please.

      Nothing like playing games with your nose. Now I'm tempted to borrow a USB2 card for nose to nose pong!
  • by sielwolf ( 246764 ) on Sunday March 23, 2003 @08:19PM (#5580344) Homepage Journal
    Maybe it's some sort of technophilia but some of the posts on here are just pure vapor. Sure, there have been some great advances in computer vision and pattern recognition... but have some of these posters on here ever done any research in the area? Hell, most face recognition goes back to Fisher's 1936 iris data set and principal component analysis... not quite Wintermute stuff.

    Too often vision projects find speedups by sacrificing one component or another. For instance, you can get some great face recognition with PCA... as long as the person's face is immobile. Tilt your head slightly or rotate too much and the system has no clue.

    I'll admit, there is some killer work out there. But not of the full-blown "20 years and we will all have robotic man servants" thing. Keep the hype to a minimum.
    • No kidding. I'm personally quite disappointed with state-of-the-art speech to text, computer vision, etc... Much of it has gone largely unchanged for years; optimizations here and there are about it.

      I think at some point we went down a path which will never lead to the solutions we expected to have by this time. And the reason we can't get off the current path is because of the way the tech culture is, you always have to publish an extension to previous work with copious references.

      And its not even the b

    • Yes perhaps. But it is an exciting field. Vision is THE sense I wouldn't want to lose and it is sad seeing older people disengage from the world as it gets lost. Given my (generation's) connection to computers and visual stimuli just to get by (buy stuff, learn, communicate, et cetera), this technology is sure to be a boon to humanity.
    • What totally drives me nuts is that most people in that field are totally hooked on the whole fisher-face, eigen-face, ICA thing. Basically they naively project a two-dimensional affine/brightness-normalized face onto a set of basis functions and then do a nearest-neighbor search on the coefficients to determine identity using some magical distance metric like Mahalanobis or Euclidean. They totally fail when the intensity or pose changes, and then blame it on the distance function or basis function.

      Shape models and combined m
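
For readers who have not seen the pipeline criticized in this sub-thread, here is a minimal sketch of it: project flattened face images onto a PCA basis ("eigenfaces"), then pick the nearest gallery entry in coefficient space, using either plain Euclidean distance or a Mahalanobis-style distance that divides each coefficient by its standard deviation. It assumes NumPy is available. The function names and the synthetic random "faces" are illustrative assumptions, not anyone's published system, and no alignment or brightness normalization is done here.

```python
import numpy as np

def fit_eigenfaces(faces, n_components):
    """faces: (n_samples, n_pixels) array of flattened face images."""
    mean = faces.mean(axis=0)
    centered = faces - mean
    # Rows of vt are the principal directions of the centered data.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]
    # Sample variance of the data along each kept direction.
    variances = (s[:n_components] ** 2) / (len(faces) - 1)
    return mean, basis, variances

def project(face, mean, basis):
    """Coefficients of one flattened face in the eigenface basis."""
    return basis @ (face - mean)

def identify(probe_coeffs, gallery_coeffs, labels, variances=None):
    """Nearest neighbor in coefficient space.

    With variances=None the metric is Euclidean; otherwise each axis is
    scaled by its standard deviation (Mahalanobis distance in PCA space).
    """
    diff = gallery_coeffs - probe_coeffs
    if variances is not None:
        diff = diff / np.sqrt(variances)
    dists = np.linalg.norm(diff, axis=1)
    return labels[int(np.argmin(dists))]

# Synthetic stand-in data: three made-up identities, five noisy images each.
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(3, 64))
labels = [i for i in range(3) for _ in range(5)]
gallery = np.stack([prototypes[i] + 0.1 * rng.normal(size=64) for i in labels])

mean, basis, variances = fit_eigenfaces(gallery, n_components=2)
gallery_coeffs = (gallery - mean) @ basis.T

probe = project(prototypes[1] + 0.1 * rng.normal(size=64), mean, basis)
euclidean_match = identify(probe, gallery_coeffs, labels)
mahalanobis_match = identify(probe, gallery_coeffs, labels, variances)
print(euclidean_match, mahalanobis_match)
```

Note that nothing in this pipeline models pose or lighting: a shifted or differently lit probe simply lands somewhere else in coefficient space, which is the weakness the comments above complain about.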
  • Nice Demo (Score:4, Informative)

    by Anonymous Coward on Sunday March 23, 2003 @08:50PM (#5580461)
    People might want to check out these cool pictures and videos from Cambridge University [cam.ac.uk]
    • When looking around the Cambridge Machine Vision URL noted above, I came upon this [cam.ac.uk] little site which looks like an interesting project in Hidden Markov Models; so, take a look at the current 'owner' of the site and the related software license.

      Now idn't that funny?
  • by chameleon1z ( 661172 ) on Sunday March 23, 2003 @09:57PM (#5580721)
    As someone who has been doing research in areas of computer vision, specifically identification, and who is a member of a Computer Vision Research Laboratory, I just thought I would make a few comments here.

    Some areas of computer vision related to big brother have been around for a while and actually work quite well already. These include, but are not limited to, fingerprint, iris, and hand recognition. They are already in commercial applications around the country, used for everything from secure entry into the country at immigration stations, to secure entry into rooms/labs/whatever, to confirming identification for logins to computers and other systems. They work well (there is always some room for improvement), but they require a completely willing subject and carry a certain 'stigma' of big brother and criminals that makes them less viable.

    The goal mentioned here that researchers want to work towards is having a standard camera (like a security camera) able to identify people. However, despite some claims (the most recent interesting one coming out of Israel), so far no one has proved to have ANYTHING that would be viable in a real-world application. The best systems thus far have never even been tested with a database of over 500 people, most with significantly fewer than that, and they tend not to work well over time. Usually they work fairly well the same day and then exponentially decrease in effectiveness until, at around 6 months, you may as well be randomly guessing, because you'd do about as well as most algorithms.

    Overall, I don't think you have anything to fear from big brother here anytime soon.
    • While your broad claim that face recognition is really not ready for a large-scale real-world application is basically correct, your specific claims are anything but. The FERET evaluation of 2000, for instance, used close to 4000 images, not 500, and the 2002 follow-up evaluation made use of far more images than that. Also, the fall off in performance due to the passage of time is not nearly as extreme as you imply. Comparing the 2000 results to the newer study should make it clear that substantial progress
      • Sorry I wasn't specific in what I was saying. I was trying to dumb it down a little. My claim of 500 was meant towards the number of individuals involved in the studies, not the number of images. I believe the number of individuals is MUCH more important in determining the effectiveness in a real world border crossing type situation than the number of images involved.

        While you're correct that the FERET database had more than that (1200 individuals in 2000; I even double-checked it with the FERET website) they'r

  • by Boss Sauce ( 655550 ) on Monday March 24, 2003 @03:03AM (#5581924) Homepage Journal
    Gollum was brought to you by vision technology. It takes a lot of specialized cameras like these [vicon.com] to track a lot of dots in 3D. Also, cameras are tracked after the fact by analyzing photography with tools like this [realviz.com] and this [oscars.org] (search for MARS).

    To lump all computer vision together and say "it's not there yet" is phooey! There are lots of problems in vision, and they do get solved, but those problems are all specific-- you can't use a red-light-runner system to do facial tracking...

  • by hyperventilate ( 661218 ) on Monday March 24, 2003 @04:02AM (#5582068)
    I was stunned by how OCR went from "impossible" to "trivial" when all that changed was Moore's law making high-res scans available in memory in a typical PC. Expect many vision problems to fall by the wayside with new 240 frames per second 3 megapixel cameras [fast-vision.com]. (Don't save THOSE movies uncompressed!) See the Sensor Spec [fast-vision.com].
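
To put a rough number on the "don't save those movies uncompressed" remark, here is a back-of-the-envelope calculation. The frame rate and resolution come from the comment above; treating "3 megapixel" as exactly 3,000,000 pixels and assuming 1 byte per pixel (8-bit raw sensor output) are assumptions made here, so the real figure depends on the actual sensor format.

```python
def uncompressed_rate(pixels_per_frame, frames_per_second, bytes_per_pixel):
    """Raw video data rate in bytes per second."""
    return pixels_per_frame * frames_per_second * bytes_per_pixel

PIXELS_PER_FRAME = 3_000_000   # "3 megapixel", rounded (assumption)
FRAMES_PER_SECOND = 240        # from the comment
BYTES_PER_PIXEL = 1            # assumed 8-bit raw output

bytes_per_second = uncompressed_rate(PIXELS_PER_FRAME, FRAMES_PER_SECOND, BYTES_PER_PIXEL)
bytes_per_minute = bytes_per_second * 60

print(f"{bytes_per_second / 1e6:.0f} MB per second")
print(f"{bytes_per_minute / 1e9:.1f} GB per minute")
```

Under those assumptions the stream is 720 MB per second, or about 43 GB per minute of footage, before any color or higher bit depth is taken into account.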
  • Eye Tracking (Score:4, Interesting)

    by nycsubway ( 79012 ) on Monday March 24, 2003 @09:43AM (#5582776) Homepage
    This is very similar to a project I worked on in college. We were working on getting a webcam to track eye-gaze and to allow a user to control the mouse with their eye. I have always wanted to continue development of the gaze tracker, but never had the time after graduating. The website is here: http://www.gbook.org/projects/index.html [gbook.org]
  • If researchers really want to understand the mechanism of human face recognition, they should be looking at the cases where it *doesn't* work: autistic people with face blindness.
    • As someone who knows quite a few autistic people personally, I would have to chime in and say that a test that tried to pinpoint something as specific as this in a severely autistic person would be hard at best, and impossible at worst. The abstractness of this idea is what, in my opinion, would elude an autistic person.
  • N2 Reading [n2reading.com]

    They use computer techniques to help people with Intermittent Central Suppression read. They're fighting the good fight too!
  • This project [microsoft.com] at MS research does not only face detection, but recognition as well.

    I can't get the videos to play right now, but when I saw them before, as people walked on and off camera, it would find their face, put a square around it and label their name on it.

    Pretty neat.
