Slashdot
Translation Software That Learns by Reading
Posted by
samzenpus
on Wed Feb 23, 2005 08:54 PM
from the it-is-fundamental dept.
redcone writes "New Scientist is reporting that translation software that develops an understanding of languages by scanning through thousands of previously translated documents has been released by U.S. researchers. According to the article, 'The translated documents used to teach the translation algorithms can be electronic, on paper, or even audio files. The system is not only faster than other methods, but also better suited to tackling less common languages and the unusual vocabulary found in specialised or technical texts.'"
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.

technical texts (Score:4, Funny)
Yay! (Score:4, Funny)
Harry Potter and the Bible (Score:5, Interesting)
This could be great if it were open-sourced. It'd be nice to translate email, instant messages, websites, technical docs, and lots of other stuff we're currently using the fish for. The fish is nice, but it's not that efficient to add to other programs, and its translations aren't usually that great.
Re:Harry Potter and the Bible (Score:5, Funny)
Or did JK Rowling suddenly become pious?
Wow! Does a much better job... (Score:5, Funny)
Not hard wares that sticks an comprehension of talks by scanning on thousands of fish translated papers has been vomited by US scientists.
Many existing translation not hard wares uses palm rules for botching words and phrases. But the new software, snarked by Kevin Knight and Daniel Marcu at the Information Sciences[...]
Read More... [newscientist.com]
That's great.... (Score:4, Funny)
That sounds like a good approach (Score:4, Insightful)
But if you give computers a bunch of human stuff to read, you expose the dictionaries to language as it is actually used, not just as the dictionary has it. Then when odd language usage falls upon us like it's raining cats and dogs, they will have a database of similar usage to draw upon. Hey, it's an uphill climb, but this is a good avenue to try. Cheerio, computers, and a top o' the mornin' to ya.
Philosophical caveat (Score:5, Insightful)
I would say that humans able to translate between languages generally understand both languages, but claiming that a statistical, probabilistic model based on correlations understands a language might be a stretch.
Further reading: Searle's Chinese Room argument: http://en.wikipedia.org/wiki/Chinese_room
This is akin to asking, Does your tax software understand the tax code? Does Photoshop understand the principles of image manipulation?
Are these silly questions to ask?
Further reading: Dennett on intentionality (http://en.wikipedia.org/wiki/Dennett but the entry is pretty sparse).
RD
Re:Philosophical caveat (Score:4, Interesting)
I think that software that can learn can be said to understand a problem just as much as a human can. The difference between understanding and just doing is having the ability to learn from new data and to change your actions as required.
Re:Philosophical caveat (Score:5, Insightful)
Mom baked for three hours.
The pie baked for three hours.
"Mom" and "The pie" are the subjects. The verb and entire predicate are identical. Understanding the language disambiguates these sentences, but the ambiguity is part of what defines humor.
A man walked into a bar. Ouch!
A man wanted to win a pun contest in the local newspaper, so he entered 10 times in order to increase the chances that one of his entries would win. Unfortunately, no pun in ten did.
You can translate that 50 ways from Sunday but without understanding the language - understanding what makes those statements interesting - the machine will lose all their meaning.
Google definitely would buy into this... (Score:5, Interesting)
Translating specialised texts ... (Score:5, Insightful)
The main reason (I think) is that tech documents have specialised vocabulary and idioms, but these are much fewer than the idioms one has to master in order to understand the editorial page in a newspaper.
With a rudimentary knowledge of Russian and French, I have found it much easier to read an engineering textbook or paper in these languages than any nontechnical text. (This is not necessarily the case with other languages. Any document in Japanese, for instance, is an entirely different ballgame.)
Re:Translating specialised texts ... (Score:4, Informative)
Of course that is true, for a human translator. Your knowledge of the technical field itself is a resource you can use to aid in your translation of technical texts. For machines, it's usually necessary to use a translator specifically geared to the subject matter. For instance, you would definitely want to use a different machine translator for a newspaper article as opposed to a biomedical research journal.
This new approach is supposed to mitigate these problems. If they can do a good job of it, they may be able to bring machine translation to areas where previously human translators have been required or greatly preferred.
DadaDodo (Score:4, Informative)
Microsoft Research already does this (Score:5, Informative)
Arabic to English (Score:5, Interesting)
After a quick web search, all I was able to find was this site [sakhr.com], which has a pretty sketchy TOS agreement.
Dragon Naturally Speaking (Score:4, Interesting)
It would be interesting to see the results of analysing large sections of languages, but the only immediate use I can fathom for this would be in cryptography or information compression algorithms. However, the results could probably be used to provide insight into how languages evolve or how memes spread from language to language.
Or the brief explanation in the article did not make it clear enough how this differs from what was previously state-of-the-art, e.g. Dragon.
Time flies like an arrow... (Score:5, Funny)
When an automated translator can handle that one without bursting into flames, I'll start to believe.
How is that news? Research was done 10 years ago. (Score:4, Interesting)
years ago by IBM: The Mathematics of Statistical Machine Translation [upenn.edu]. And even free software has been available for a while, see
http://www.fjoch.com/GIZA++.html [fjoch.com].
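For anyone wondering what "learning by reading translated documents" looks like in practice, here is a minimal sketch of the simplest model from that family (IBM Model 1, trained with EM). It is a toy under stated assumptions, not the code from the article or from GIZA++: the three-sentence corpus, the function names `train_ibm_model1` and `best_translation`, and the omission of the NULL word are all mine, for illustration only.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate word-translation probabilities t(f|e) with EM.

    pairs: list of (english_tokens, foreign_tokens) sentence pairs.
    Simplified IBM Model 1: no NULL word, uniform initialisation.
    """
    foreign_vocab = {f for _, fs in pairs for f in fs}
    uniform = 1.0 / len(foreign_vocab)
    t = defaultdict(lambda: uniform)  # t[(f, e)] approximates P(f | e)

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of f aligned to e
        total = defaultdict(float)  # expected count of e aligned to anything

        # E-step: spread each foreign word's alignment over the English words.
        for es, fs in pairs:
            for f in fs:
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac

        # M-step: renormalise the expected counts into probabilities.
        t = defaultdict(
            float, {(f, e): c / total[e] for (f, e), c in count.items()}
        )
    return t


def best_translation(t, english_word):
    """Return the foreign word with the highest t(f|e) for english_word."""
    candidates = {f: p for (f, e), p in t.items() if e == english_word}
    return max(candidates, key=candidates.get)


# Hypothetical toy corpus: the model is never told which word maps to which.
corpus = [
    ("the house".split(), "das haus".split()),
    ("the book".split(), "das buch".split()),
    ("a book".split(), "ein buch".split()),
]

t = train_ibm_model1(corpus)
for word in ("the", "house", "book", "a"):
    print(word, "->", best_translation(t, word))
```

The only signal is co-occurrence: "das" shows up wherever "the" does, so EM gradually shifts probability mass onto that pairing and away from the alternatives. No hand-written rules are involved, which is why this approach needs a pile of already-translated text rather than a linguist.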
It's only a matter of time before... (Score:4, Funny)
No samples? (Score:4, Interesting)
Without even the simplest of examples or samples we have only their word on how well this works.
so how can they grade you in school? (Score:4, Insightful)
Sometimes brute force, i.e. lookup tables for 100000000 translated versions, can be better. So much for logic, eh?
Re:High school Spanish (Score:4, Informative)
Until I see this new process in the works, however, there is nothing that will make me believe it's better than finding another human who can *understand* what you are saying and the context you are implying.
efnet spanish (Score:5, Funny)
q w3n0! 3so si está 1337!
Re:translate to American please (Score:5, Funny)
r3Ð(0n3 wr173$ "N3w $(13n71$7 1$ r3p0r71n9 7h47 7r4n$£4710n $07w4r3 7h47 Ð3v3£0p$ 4n nÐ3r$74nÐ1n9 0 £4n9493$ b¥ $(4nn1n9 7hr09h 7h0$4nÐ$ 0 pr3v10$£¥ 7r4n$£473Ð Ð0(m3n7$ h4$ b33n r3£34$3Ð b¥
And translation #2:
REDCONE WRIETS NU SCEINTIST IS R3PORTNG TAHT TRANSLATION R TAHT D3V3LOPS AN UNDERSTANDNG OF LANGUAEGS BY SCANNG THROUGH THOUSANDS OF PREVIOUSLY TRANSLAETD DOCUMENTS HAS B3N REL3AESD BY US!!!! OMG R3S3ARCHARS!!1!1!! LOL ACORDNG 2 DA ARTICL3 TEH TRANSLAETD DOCUMENTS US3D 2 T3ACH TEH TRANSLATION ALGORITHMS CAN B 3LECTRONIC ON PAEPR OR 3V3N AUDIO FIELS!!1111 TEH SYSTEM IS NOT ONLY FASTER THAN OTH3R M3THODS BUT ALSO BT3R SUIETD 2 TAKLNG LAS COMON LANGUAEGS AND TEH UNUSUAL VOCABULARY FOUND IN SPACIALIESD OR TECHNICAL TEXTS!1!! WTF