DARPA Starts Ultimate Language Translation Project
An anonymous reader writes "Defense Advanced Research Projects Agency (DARPA) has launched the ultimate speech translation engine project that would be capable of real-time interpretation of television and radio programs as well as printed or online textual information in order to be summarized, abstracted, and presented to human analysts emphasizing points of particular interest." If combined with the tower of babel project we discussed earlier, it could only lead to awesomeness.
Wow that would be handy (Score:1)
Re: (Score:2)
Re:Wow that would be handy (Score:4, Interesting)
Re: (Score:2)
Like that internet boondoggle?
I keed, I keed
Re: (Score:2)
Re: (Score:1)
Seriously though, I just don't believe it. (Score:3, Insightful)
If the US military had anything close to real A.I., you wouldn't hear about it. It would be classified information.
The NSA would love to have anything close to a system capable of understanding language as well as a native speaker can; as would the CIA, or any other clandestine organization. Any system smart enough to understand and generate English probably al
Actually... (Score:3, Interesting)
You have to walk before you can run (Score:5, Insightful)
This is a big pipe dream that is extremely unlikely to work any time soon. How do I know that? Right now, I think it would be reasonable to conclude that computer technology today is good enough to do accurate text translation. Can it? Well, it depends on how picky you are. There are always mistakes, sometimes glaring ones, in text to text translation programs. I can speak Russian and for convenience (to get quick rough translations) at one time I owned what is probably the best Russian-English text translation program. It's much more accurate than Babelfish. It still left a lot to be desired. It would be about 80-90% accurate, but no more. I remember one time when it took a statement in Russian that said "I absolutely would not mind to tell you about
She sings like an angel.
In this sentence, "like" is an adverb, but it can also be a verb ("She likes to go shopping."). A text translation program might fail to correctly understand that "like" is an adverb here and say something like:
She sings and angel is pleasing to her.
I could give a lot more examples, but these are enough. If we can't even do a better job right now at text translation, how on earth is DARPA going to get speech translation right? This is the kind of project that gets funded by idiots who have never studied foreign languages and believe that the Star Trek idea of a Universal Translator is only a few years away.
Re: (Score:2)
Time flies like an arrow.
Fruit flies like a banana.
Always liked that one.
Re: (Score:1)
So then, we have:
...and...
Talk about name-mangling... :)
"Pie in the sky" E-J-E = "The fleeting desire" (Score:1)
I'd agree, as there are two fields here that are extremely difficult -- voice recognition and machine translation -- which makes this all seem like so much pie in the sky. Anyone who's ever used voice recognition knows how spotty it can be, and anyone who's ever played with Babelfish (like this guy [tashian.com]) knows how much humour can result. Now imagine these two lovely examples put to use on the battlefield, or at intel HQ, and some very unhappy possibilities arise.
I'm all a fan of research for learning's sak
Really I hate it (Score:1)
But seriously, I've basically made my living off of DARPA grants and I fully support the criticism leveled at them above. It is truly a classic government bureaucracy: very wasteful, not entirely straight about what they are doing, and you have to have personal connections to get money from them.
Re: (Score:1)
In other words, just like 90% of the rest of the world.
Autobots, Transform! (Score:2)
In unrelated news, a user named DARPABOT has made the Slashdot Hall of Fame [slashdot.org] under most active submitters at over 1000 in under a few weeks time, crushing prost
Take us to your leaders... (Score:2)
Awesome? WTF?? (Score:4, Insightful)
Surveillance of civilian populations under the guise of "monitoring terrorists" is not something that I'd consider awesome. Irksome, yes. But not awesome.
Comment removed (Score:5, Interesting)
Re: (Score:2)
Assuming that this system can recognize voice well, and then convert it into text in preparation for translation, this is already saying a lot. This means that phone conversations can in theory be automatically logged as text, which requires much less storage space than audio.
-b.
Re: (Score:1)
Re: (Score:2)
Depends what you want to do with it, and assuming that our court system is intact and more or less unchanged in 20 years. Besides, there's always the option of kidnapping and "disappearing" miscreants. I'd hate to see what would happen, with the full consent of the majority of the lumpenproletariat, if another 9/11-scale (or worse) terrorist attack occurred on US soil.
-b.
Re: (Score:2)
The Department of Defense isn't particularly interested in evidence. Indeed, in many cases once they have the information they need to make a decision and the decision is made, it seems they'd be happier if the underlying original data was irretrievably lost to prevent any after-the-fact criticism of either their decisions or their methods.
Re: (Score:2)
Defense Language Institute (Score:2)
>the government already has the language skills it needs even without a whizbang translation machine.
Sadly, they don't. The FBI has something like two guys who speak Arabic, and there are numerous instances in the news recently where some fed is bewailing the lack of language skills in his department. On a diplomatic note, how many US Ambassadors [state.gov] actually speak the language of their host country? It might be useful if they had some way to understand the locals.
Re: (Score:1)
At the time of the Islamic Revolution, the CIA had one employee who spoke Farsi, and they weren't listening to him anyway. I can't imagine much has
I read that as... (Score:1)
Don't be too afraid... (Score:2)
Gale is about TV/Radio news, not random people conversations.
OG.
Re: (Score:2)
now the government will be able to spy on you in your native language
I can imagine your typical terrorist conversation translated using this system :
-After this the friends when it jumps the operation?
-Not white I go seeing
-Into the correspondence, and to know, you have?
-I am caused
-And on the other hand are their blond as?
-It goes, or
-Or he has
-The God is large!
-The God is large!
Re:Awesome? WTF?? This... could... (Score:2)
Ultimate Defense (Score:5, Funny)
Just feed this new system a few reruns of Japanese television game shows. After that, we will be safe from automated snooping for at least another decade. As a plus, all artificial intelligence projects at the DARPA will be set back by another decade as well.
Humans??? (Score:2, Insightful)
http://lyricslist.com/lyrics/artist_albums/16/ac-
Re: (Score:3, Funny)
Re:Humans??? (Score:4, Interesting)
Re: (Score:2)
Re: (Score:2)
They're slow, and scarce and don't work 24-7. *If* the software has progressed to the point that it's "good enough" (that's a big IF) then a massive farm of machines could simultaneously monitor all communications (tv, email, phone, IM, etc.), summarize, and filter out anything interesting, looking for trends. Think Really Big Brother.
Lots of reasons (Score:5, Insightful)
It reminds me of the old joke:
Guard: Now tell me where you hid the money, or you will suffer
Translator: Tell him where the money is, or you will suffer
Prisoner: I'll never speak
Translator: He says he won't tell you
Guard: *putting gun to prisoner's head* Tell him I will blow his brains out if he doesn't tell me immediately
Translator: He will shoot you in the head unless you tell him now
Prisoner: I buried a million dollars under the floorboards in the old woodshed
Translator: *pauses* He says you don't have the guts to shoot him...
Re: (Score:3, Insightful)
The number of humans the Pentagon can afford to employ with adequate skill in the languages it wants to target is inadequate to process all the channels of information it would like to filter for potentially interesting information. Further, the more humans know what information is being looked for (and what is flagged), the greater the security risk.
Three words. (Score:2)
not enough of them (Score:2)
Several terrorists in Colorado Supermax prisons sent over a hundred unread Arabic letters overseas because they have just one part time guy reading them down there. Quite a scandal there.
Cool (Score:1)
Re: (Score:2)
Re: (Score:1)
Re: (Score:2)
awesomeness, in terms of megatons (Score:2)
Re: (Score:1)
Re: (Score:2)
I don't want to know what people really think of me.
When will it affect me? (Score:3, Interesting)
Re: (Score:3, Funny)
Effective translation might be impossible (Score:2)
Then, of course there are cultural differences too. A Xhosa girl would like to be told "You remind me of that fat cow over there", whereas your average American chick might not.
Re: (Score:1)
i
can't help it (Score:2)
i can see a translated japanese movie coming...
"all your base are belong to us, make your time"
Babelfish: "There is a chisel in my dog." Me: WTF? (Score:1)
Seriously, try it. Input the sentence, "My dog has fleas." Go from English to Japanese, copy and paste the Japanese into the entry box, and translate back to English. "There is a chisel in my dog."
Just one of many reasons that I'm not that worried about my career as a Japanese - English translator. :P
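For anyone who wants to see how a round trip like that can go wrong, here is a minimal sketch. It assumes the Japanese output spells the word phonetically as "nomi", which can mean either "flea" or "chisel". The tiny lexicons and the function names are invented for illustration; this is not how Babelfish works internally, just a toy that ignores context the same way.

```python
# Toy lexicons, invented for illustration only (not Babelfish's data or method).
EN_TO_JA = {"flea": "nomi", "chisel": "nomi", "dog": "inu"}
# Going back, one phonetic spelling can map to several English words.
JA_TO_EN = {"nomi": ["chisel", "flea"], "inu": ["dog"]}


def round_trip(word: str) -> str:
    """Translate an English word to romanized Japanese and back, with no context."""
    romaji = EN_TO_JA[word]
    candidates = JA_TO_EN[romaji]
    # Naive choice: always take the first listed sense, whatever the sentence is about.
    return candidates[0]


def survives_round_trip(word: str) -> bool:
    """True if the word comes back unchanged after the round trip."""
    return round_trip(word) == word


for w in ("dog", "flea"):
    print(w, "->", round_trip(w), "| survives:", survives_round_trip(w))
```

A real system would need the surrounding words ("dog", "has") to pick the right sense, which is exactly the part that seems to be missing.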
Too much too soon, or tackling wrong problem? (Score:5, Interesting)
Speech Recognition is the hardest problem to tackle on the path to translation, and MUST be addressed before there is a viable real-time (or even delayed) translation engine. Currently, even the best speech recognition software can achieve at best ~80% accuracy when faced with a large vocabulary with no limits on speakers/dialects, and this level of accuracy is typically not achieved in real-time. While this 80% level is actually pretty good when transcribing to text (since the reader can typically decipher what the computer meant), it's downright awful if trying to translate the resulting text to another language.
For example, if I say "I like ice cream" into voice recognition software and it 'hears' "I like, I scream", the reader might understand what this means, particularly if they say it in context and aloud. However, if we translate each sentence into Spanish ("Tengo gusto del helado" and "Tengo gusto, yo grito" respectively, according to Babel Fish), the Spanish speaker would be completely lost, as the out-of-context phrases don't sound anything alike. A natural language translation, even under relatively accurate recognition scenarios, would be fraught with misunderstandings.
Once speech recognition is tackled, it's just a matter of translation, then voice synthesis. Fortunately these problems aren't nearly as difficult, and current solutions would suffice (with the only pitfalls being poor grammar in the destination language and a robotic-sounding voice).
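One common way to put a number on "accuracy" here is word error rate (WER): the word-level edit distance between what was said and what was recognized, divided by the number of words actually said. The sketch below is a plain textbook implementation, not any vendor's scoring tool. The function name is made up, and punctuation is dropped from the misheard sentence so that only the words are compared.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (one fewer than the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word dropped (deletion)
                curr[j - 1] + 1,     # extra hypothesis word (insertion)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# The example from the comment above, with the comma removed.
print(word_error_rate("I like ice cream", "I like I scream"))  # 0.5
```

Two of the four words come out wrong, so this one sentence scores 50% WER even though a human reading it aloud would probably recover the meaning.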
Re: (Score:2)
*Good* translation is extremely difficult unless you stick to "see Dick run"-type sentences. Good translation between non-related languages (like Japanese and any Indo-European language) doubly so.
Advanced machine translation on par with a human's will require nothing less than artificial intelligence, most likely.
Re: (Score:2)
I think what you're trying to say is that DARPA isn't capable of developing speech recognition software equal to the task of real-time translation.
I'm sure that DARPA is fully aware that the biggest block to real-time translation is speech recognition -- that's why they are funding this project -- because it is (1) beyond the scope of what private enterprise is currently capable of without cash influx an
Re: (Score:2)
On the other hand, text translation is much harder in Chinese than in Romance languages, in large part because of the lack of conjugation, etc.
Re: (Score:1)
Re: (Score:2)
See, even for the most simple translation of "I like" in Spanish, Babelfish is wrong. The good translation is "Me gusto", not "Tengo gusto" which means "I have taste".
I am trying to learn Cantonese and you have no idea just how "stupid" the current translators are...
Re: (Score:2)
So, what great parent wrote as "I like ice cream" would be translated as "Me gusta el helado" and "I like, I scream" would be "Me gusta, yo grito".
I agree with GP about the speech recognition problem being one of the problems to cope with before having a real-time translation tool. Some have said that the current technology (Dragon Speaking 9, etc) achieves 90% accuracy, but the issue is that
You don't know what you're talking about... (Score:2)
OG.
Re: (Score:1)
Re: (Score:2)
Re: (Score:1)
The conversational telephone speech (CTS) results I quoted above were achieved using a state-of-the-art research system running under 10 times real time (10xRT); i.e., using less than 10 hours to transcribe an hour of speech. The winning system in the 2004 DARPA EARS evaluation achieved 15.2% WER. For a system description, see this paper [ieee.org] (requires subscription to ieeexplore). In 2004, many EARS teams achieved the same level of performance in real time as their 10xRT system in 2003. Since the EARS program was killed a
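To make the "10xRT" figure concrete: the real-time factor is just processing time divided by audio duration. A minimal sketch follows; the function names are invented for illustration and are not taken from the EARS evaluation tooling.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration; 10.0 corresponds to '10xRT'."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


def keeps_up_with_live_audio(processing_seconds: float, audio_seconds: float) -> bool:
    """A system can only follow a live feed if its real-time factor is at most 1."""
    return real_time_factor(processing_seconds, audio_seconds) <= 1.0


one_hour = 3600.0
# Ten hours of compute for one hour of speech, the upper bound described above.
print(real_time_factor(10 * one_hour, one_hour))          # 10.0
print(keeps_up_with_live_audio(10 * one_hour, one_hour))  # False
```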
Poor grammar in target can be loss of meaning (Score:1)
As I just noted in another post, [slashdot.org] the current publicly available state of machine translation gives me little to fear as a professional translator. You note:
I'd like to point out that "poor grammar" can often have disastrous consequences for the meaning. Take my previous example, "My dog has fleas." Babelfish's Japanese output is "Watashi no inu ni nomi ga aru." This backtranslates to "There is a chisel in my dog." The bolded
Re: (Score:2)
OG.
This could be dangerous... (Score:3, Funny)
Civilian use of such a thing (Score:1)
So they will have to develop one.
This will be integrated into VCRs to stop/start recording when advertising starts/stops.
Great!
Yeah... Billions of Dollars Later... (Score:1)
yeah.. awesomeness... (Score:2)
"Hey... didn't I make them all speak different languages to teach those uppity humans a lesson? Now they what? The end routed me on that one? Oh I don't think so!"
It's coming true! (Score:1)
The space station is being built again. India is planning manned missions into space [reuters.com]. A shift in power in the US Government [go.com]. Now we're creating a Universal Translator [startrek.com]! How exciting these times we live in.
But will it translate... (Score:2)
...Romulan, Klingon, and Vulcan?
They need to start working on... (Score:1)
Interesting, (Score:3, Insightful)
But beyond that, I wouldn't put too much faith in any kind of mechanical translation as particularly reliable on its own except on narrow kinds of material. It conceivably might work for strictly literal usages, or for fairly stable idiomatic uses, but unless you have frequent collection and incorporation of usage data from every culture and subculture that may be a source of translated material, it's going to fail, sometimes subtly and sometimes spectacularly, for a lot of idiom. Similarly, even within the same language, different groups using it will have different idiomatic uses that sometimes will produce different or opposing meanings for similar usages, which will require accurate identification of the source at more than just the language level to get correct results. There's a lot of evolving cultural context that informs the use of language...
exponential growth (Score:2)
Well, err, yes, but, I have enough difficulty understanding Geordies and Glaswegians, and they're speaking the same language as me (nominally). Understanding 200 or so words when carefully spoken is a huge step from simultaneously interpreting random speech and I'm sure the problems will rise exponenti
Re: (Score:2)
screaming "failure" (Score:2)
Language parsing impossible by current technology (Score:3, Insightful)
Of course if you narrow the problem down to specific terms, then it is doable. But then it would not be 'ultimate' any more.
Re: (Score:2)
The English solved this years ago (Score:2)
Re: (Score:1)
Politics of scarcity? (Score:2)
The point being, if this tech works, great, but will it be used?
AHA! (Score:1)
Errr...I mean, soon, it won't be a plot device anymore!
Crap, I mean, eventually it might not be just a plot device...
I mean...oh, fuck it. This is DARPA after all.
How to Wreck a Nice Beach (Score:3, Insightful)
Just say the title out loud to get some idea of why speech recognition is hard, never mind translation. Translation has long been regarded as "AI-complete" because to do it well you have to understand what is being said, which involves solving all the other difficult AI problems. The current translation systems are lousy because they don't understand what is being said and most of them don't even attempt to.
So my guess is that this program will be a boondoggle for researchers with little practical result.
You can't impeech his speach :) (Score:1, Funny)
Real-time translation huh? (Score:1)
I'm sure the multi-lingual people out there are laughing at the very concept of "Real-time translation". Unless you're doing something trivial (Italian to Spanish?), this just isn't possible.
Some languages place verbs at the very end of the sentence. Assuming that the computer could understand each of the words, the entire sentence still has to be re-composed in English. For long sentences, the speaker has already moved on.
Other languages, like French, use some crazy sentence structures that effective
Will the fansubs be available on torrent? (Score:1)
If it could... (Score:1)
Been done... (Score:1)
OR.... (Score:2)
OR
We could start learning some foreign languages. Everyone who graduates high school should learn at least two. Fluently. And no, not Spanish. A language NOT spoken by your neighbors. A *foreign* language. Arabic would be damned helpful.
Re: (Score:2)
Not Going To Happen (Score:2)
Fergeddaboutit.
Until conceptual processing is able to be performed, ANY form of human language translation will be inadequate. It might be usable in some respects, but not adequate for most real purposes.
It does seem cool... (Score:1)
Re: (Score:1)