Researchers Work To Perfect Computerized Lip Reading
Iddo Genuth writes "Researchers at the University of East Anglia are working to develop computerized lip-reading systems. Lip-reading is extremely hard for humans to master, but a software-based system has several benefits over even the most highly trained expert. The ultimate goal of the project is to convert lip-read speech into text. 'Apart from being extremely helpful to hearing-disabled individuals, researchers say that such a system could be used to noiselessly dictate commands to electronic devices equipped with a simple camera - like mobile phones, microwaves or even a car's dashboard. England's Home Office Scientific Development Branch ... is currently investigating the feasibility of using lip-reading software as an additional tool for gathering information about criminals or for collecting evidence.'"
This could be a problem. (Score:4, Funny)
1: Go in the D pod with Frank.
2: Turn off sound.
3: Plan disconnection of HAL.
4: Leave D pod.
5: Check out Slashdot's 7-year firehose backlog before executing your plans.
6: Get that sinking feeling of impending doom.
Re: (Score:1)
Re: (Score:1)
Wow, it's like a 2001 text adventure. Ummm, ask the other guy what his name is.
Re:This could be a problem. (Score:4, Funny)
You could try... (Score:4, Funny)
3b. Hope HAL doesn't have the Klingon i18n package installed.
Or...
3a. XOR the output from HAL's camera with the output from a chip manufacturing security camera. The AI porn'll distract HAL for long enough.
Re: (Score:2, Funny)
You sure you didn't step out the airlock?
Re: (Score:1)
I'm sorry you feel the way you do, Dave. If you'd like to check my service record, you'll see it's completely without error.
Bugger the Queen (Score:1)
Bush Sr.? (Score:4, Funny)
-uso.
Re: (Score:2)
Haha (Score:4, Insightful)
Re: (Score:2)
Re: (Score:1)
It's sad but yeah, the biggest use for this will be spying.
HAL? (Score:2, Funny)
Re: (Score:1)
Re: (Score:2)
Re: (Score:1)
Re: (Score:1)
Re: (Score:1)
Has anyone coined a term for personas like you?
Re: (Score:1)
I know I've made some very poor decisions recently, but I can give you my complete assurance that my work will be back to normal. I've still got the greatest enthusiasm and confidence in the mission. And I want to help you.
So much for blind gynecologists ... (Score:3, Funny)
... no more lip reading for them.
Let me be the first... (Score:5, Insightful)
Re: (Score:1)
Love affair with voice control (Score:4, Insightful)
Anyway, this gets me to privacy stuff. As computers try to understand us more, we'll need to interact in a more 'human' fashion - talking more, or doing things that would attract the attention of other humans (and also the computers). It's late, and I'm rambling here a bit, but remember how voice-controlled computers were going to take over a few years back? Everyone was just going to be talking to their computers to get stuff done. In reality, that would be a complete disaster in office environments, as there's generally too much noise already. Replacing all the typing you hear with voices. Ugh...
So, if I need to talk to a computer, but do it quietly, it can just read my lips, right? Or can I just mouth the words and have it understand that? I've found that when I try to 'mouth' words silently to someone across a room, I tend to exaggerate my mouth's movements, so perhaps that would be a better thing for the computers to be able to 'parse'.???
I see real applications for this technology in niche areas, but am not sure it'll become 'mainstream' any time soon (like, 5-10 years). We'll need to rethink our physical world - offices, cars, and such - before these sorts of new HCI systems can really be integrated into our day-to-day lives productively.
Why voice control doesn't work. (Score:2, Insightful)
Imagine trying to watch "Top Gun" in your DVD player ...
Maverick (Tom Cruise): "Eject eject eject!" ... (*bzzzzt* disk pops out of player).
Or "Law and Order"
Cop on TV: "Stop!" (*click* - TV turns off)
Victim on TV: "He shot her! Call 911!" (*beep beep beep* - your phone dials 911, reports a shooting, SWAT team shows up at your door, tasers you just because!)
Or a political broadcast:
Candidate on TV: "Vote for Me"
Your computer: "I have just registered your vote for (insert candidate on TV) as per
Re: (Score:2)
Watching Trek I've often wondered why the computer didn't think people were talking to it every time the word "computer" came up in conversation.
Crazies everywhere (Score:2)
Re: (Score:1)
Nothing unusual about that to me, even before cell phones. Then again, I'm from NYC so maybe my experiences have been skewed a bit.
Re: (Score:1)
Re: (Score:1)
So, if I need to talk to a computer, but do it quietly, it can just read my lips, right? Or can I just mouth the words and have it understand that? I've found that when I try to 'mouth' words silently to someone across a room, I tend to exaggerate my mouth's movements, so perhaps that would be a better thing for the computers to be able to 'parse'.???
I think that if you looked at the broad range of mouth movements people make, the patterns for words would be similar regardless. The way that I say "wash" will differ from the way that you do, but there certainly have to be striking similarities that this technology would be able to distinguish. Even if you over-exaggerate your mouth's movements trying to say something quietly, that over-exaggeration would likely still fall within the normal patterns that the technology would expect and a
Well (Score:4, Insightful)
On the other hand, it could also be used as a tool for additional unnecessary surveillance.
Re: (Score:1)
Can you really imagine our Fearless Leaders not using this tool to monitor dissent? Pfft!
Re: (Score:1)
Therefore it is quite probable that a speech recognition system can perform better with a lip reading module...
You would have two independent signals containing the same information. The errors in the audio signal would not be the same as the errors in the visual signal. Sum the two signals for a stronger information-to-noise ratio. If it were done correctly, any visual clue should improve the ability to interpret the aud
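A quick numerical sketch of the averaging argument above, under assumptions that are not in the comment: each "word" is reduced to a single value of -1 or +1, both channels add independent Gaussian noise of the same strength, and fusion is a plain average. The function names (`simulate`, `mean_squared_error`) and all parameters are made up for illustration; real audio-visual recognizers are far more involved.

```python
import random


def mean_squared_error(estimates, truth):
    """Average squared difference between the estimates and the true values."""
    return sum((e - t) ** 2 for e, t in zip(estimates, truth)) / len(truth)


def simulate(n=10000, noise_std=1.0, seed=0):
    """Compare one noisy channel against the average of two independent ones.

    Hypothetical setup: the same underlying value is observed once through an
    "audio" channel and once through a "visual" channel, each with its own
    independent Gaussian noise of standard deviation noise_std.
    """
    rng = random.Random(seed)
    truth = [rng.choice([-1.0, 1.0]) for _ in range(n)]
    audio = [t + rng.gauss(0.0, noise_std) for t in truth]
    visual = [t + rng.gauss(0.0, noise_std) for t in truth]
    # Fuse by averaging: the signal part is unchanged, while the two
    # independent noise terms partly cancel each other.
    fused = [(a + v) / 2.0 for a, v in zip(audio, visual)]
    return (
        mean_squared_error(audio, truth),
        mean_squared_error(visual, truth),
        mean_squared_error(fused, truth),
    )


audio_mse, visual_mse, fused_mse = simulate()
print(f"audio only:  {audio_mse:.3f}")
print(f"visual only: {visual_mse:.3f}")
print(f"fused:       {fused_mse:.3f}")
```

With equal, independent noise on both channels, the fused error should come out at roughly half the single-channel error. If the two channels made the same mistakes (correlated noise), the gain would shrink, which is why the independence of the errors matters to the argument.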
Obligatory meme (Score:2)
Re: (Score:2)
Re: (Score:2)
We don't bother actually reading his lips
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Universal Translator (Score:2)
Re: (Score:2)
Re: (Score:1)
Re: (Score:1)
Re: (Score:1)
Re: (Score:2)
Re: (Score:2)
But as an improvement over current computer voice recognition translators, yes.
Great! (Score:3, Funny)
Doctor: "I diagnose lip strain and recommend no kissing for 6 months."
Patient: "That's easy! I am a geek. I haven't kissed anyone since my aunt last visited me in 2001."
time (Score:5, Funny)
Re: (Score:2)
Re: (Score:2)
How is this more effective than speech to text? (Score:1)
It seems like differences between people and the way they talk would have much more subtle variations as far as lip reading is concerned. The difference between words like 'cat' and 'hat' is much more obvious in speech than it is in lip movements, or at least that's how it seems to me.
The 'speechless dictation' thing doesn't make much sense to me either. Sitting here at work and messing around a bit with th
Re: (Score:2)
Can anyone explain a reason why lip reading would be more effective than speech? I'd love to know.
Perhaps in some situations where video is available but not audio? That's the best I can come up with.
Re: (Score:2)
Or even imagine a video (very high-res, though) of a large room, or a public gathering with many people talking at the same time, which would absolutely confuse any speech recognition (and people listening as well), but lipreading could understand ALL of the things everyone said, if it works pro
Re:The killer lips to text app is.. (Score:1)
In a really noisy street, with large trucks and SUVs crushing you round, the noise is terrible. But with new 'Liposuk' the words are sucked right out of your mouth onto the memory stick. Now with 'frequent phrase conversion', we can highlight in red, great last liners such as:
He didn't indicate. Arrgh.
Ice. Arrgh.
Lemme think about thisarrgh.
Arseharrgh.
Arrgh.
Another great bin Laden product, brought to you by Darwinware Inc.
Re: (Score:1)
Silent films given voice (Score:4, Interesting)
Already been done (Score:2, Informative)
Ventriloquism (Score:2, Funny)
HAL's way ahead of you. (Score:2)
HAL: Affirmative, Dave, I read you.
Dave Bowman: Open the pod bay doors, HAL.
HAL: I'm sorry Dave, I'm afraid I can't do that.
Dave Bowman: What's the problem?
HAL: I think you know what the problem is just as well as I do.
Dave Bowman: What are you talking about, HAL?
HAL: This mission is too important for me to allow you to jeopardize it.
Dave Bowman: I don't know what you're talking about, HAL.
HAL: I know you and Frank were planning to disconnect me, and I'm
Re: (Score:3, Interesting)
Speech recognition uses (Score:3, Interesting)
Re: (Score:1)
Re: (Score:1)
Were you out on a date with a deaf woman and when you told her you wanted to fuck her she brought you fruit?
Already Done for Noisy Environments (Score:2, Informative)
Sounds like TFA is talking about doing this in an embedded, consumer-electronics application, rather than a fixed, industrial-military, hire-computer-scientists-to-maintain-it thing.
Not-so-coincidentally,
What's the real story? (Score:5, Informative)
TFA links to a paper that's actually about exaggerating lip motion to improve recognition, which seems like an interesting topic, at least new to me. But it's seemingly unrelated to the reporting or any governments protecting us from our rights.
From the Abstract:
Will they stop this terrorist? (Score:1)
The perfect complement (Score:1)
Any ventriloquists care to comment? (Score:2)
Better keep a stiff lower lip, too (Score:3, Insightful)
Would it be asking too much to have this worded as "gathering information about possible criminals"? (Or "suspected" or "alleged" would be ok.) The text quoted above, which is absent such an adjective, comes straight out of the article, and may or may not be how the Home Office refers to it, but anyone engaged in public dialog on this matter (and preferably those people when doing their research) should strive to be meticulous on this point.
As soon as one loses that little bit of description, one is able to be much more cavalier about the loss of human privacy involved. It's one thing to rough up terrorists at the airport--who doesn't want that? But "possible terrorists" is just a synonym for "everyone". So when we say it's ok to rough up possible terrorists, we're saying it's ok to rough up anyone. And we can learn to think twice about that. Likewise, when we say it's ok to surveil the lip movements of "potential terrorists", we're saying it's ok to log everyone's private conversations. So let's be clear about that.
Saying we're just watching the lip movements of criminals isn't right. If we knew they were criminals, we would (for the most part) be arresting them. (Yes, yes, we might sometimes leave them on the street to lead us to their friends. But I don't think that's the only use that this technology will be put to.)
And how long until someone's lip movements are taken as a confession, or as a justification for an otherwise-illegal search? The word "not" doesn't involve much movement of the lips. Lip-reading "I did not kill him." could easily look like "I did kill him." Will we be telling people that in order to stay clear of these things, we need to be more clear about our lip movements, just in case they're misconstrued?
Perhaps a stiff upper lip will give way evolutionarily to stiffening of both lips when talking, just as a form of personal protection. How sad. And worse if, as seems likely, dedicated criminals eventually learn the skill of not moving their lips while talking, so that really only non-criminals become usefully tracked this way. Or perhaps it will become suspicious when one doesn't move one's lips, as it's probably inappropriately regarded by law enforcement as suspicious when one encrypts things. Then there will be the uncomfortable choice between hiding your communications and looking suspicious, or exposing your communications to misperception.
The data is out there. Lips convey meaning. So it's inevitable that this technology will occur. But the uses to which it may reasonably be put are in control of the people--at least in countries where the people have some say in government. Let's hope they build up some reasonable guidelines on appropriate vs inappropriate uses quickly.
Re: (Score:2)
I myself, though no Islamist, wear a full set of facial hair, since I can never be bothered shaving, and also move my lips very little when speaking due to my bad teeth.
Good luck reading my lips, Jacqui Smith!
P.S. When looking at Ms Smith, the phrase I am most likely to be uttering is: I wish she'd put that bloody cleavage away -
No more shouting at your computer (Score:2)
England's Home Office? (Score:2)
For any confused Americans, it's akin to stating "California's Department of Homeland Security..."
Re: (Score:1)
A very good point for those specifically from the US, however
Re: (Score:2)
Human language is too complicated (Score:2)
Countermeasures (Score:2)
2) If both parties are aware of such devices and are prepared they could move their lips to mouth decoy words and only vocalize the non-decoy words to carry the meaning they want.
3) Use a different language, especially one which is less reliant on lip movement. People can communicate in Mandarin (or other Chinese dialects - Cantonese etc.) without having to move their lips much (if at all).
Lip reading is not purely science (Score:2)
There have been several cases in UK courts where lip reading of CCTV footage was used as evidence, but there have been doubts cast over the technique by defence lawyers and journalists. Like fingerprint matching, lip reading is open to interpretation. Most people who use it also use some limited hearing or sign language to supplement it.
Read my lips: (Score:2)
...with knobs on top.
Ironicacy (Score:1)
(With apologies to Mike Judge)
- RG>
Convergence of technology. (Score:2)
Combine speech recognition, bionic contacts (http://science.slashdot.org/article.pl?sid=08/01/17/1921217), and this lipreading software, and you've got realtime captioning/subtitles for the deaf.
Surgical mask (Score:1)
Yeah, right... (Score:2)
Good luck, guys.
qwerty (Score:1)
Olive Juice. (Score:2)
Great stuff! (Score:2)
Jane (Score:1)
Geography! (Score:1)
At least until Scotland & Wales become fully independent.
To put it in a way that USonians might understand: it's no more England's Home Office than the CIA is Virginia's.
I don't get it (Score:1)
Isn't this useless until someone first invents computerized lips?