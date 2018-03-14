Microsoft Announces Breakthrough In Chinese-To-English Machine Translation (techcrunch.com) 34
A team of Microsoft researchers announced on Wednesday they've created the first machine translation system that's capable of translating news articles from Chinese to English with the same accuracy as a person. "The company says it's tested the system repeatedly on a sample of around 2,000 sentences from various online newspapers, comparing the result to a person's translation in the process -- and even hiring outside bilingual language consultants to further verify the machine's accuracy," reports TechCrunch. From the report: The sample set, called newstest2017, was released just last fall at the research conference WMT17. Deep neural networks, a method of training A.I. systems, allowed the researchers to create more fluent and natural-sounding translations that take into account broader context than the prior approaches, called statistical machine translation. Microsoft's researchers also added their own training methods to the system to improve its accuracy -- things they equate to how people go over their own work time and again to make sure it's right.
The researchers said they used methods including dual learning for fact-checking translations; deliberation networks, to repeat translations and refine them; and new techniques like joint training, to iteratively boost English-to-Chinese and Chinese-to-English translation systems; and agreement regularization, which can generate translations by reading sentences both left-to-right and right-to-left. Zhou said the techniques used to achieve the milestone won't be limited to machine translations. The researchers caution the system has not yet been tested on real-time news stories, and there are other challenges that still lie ahead before the technology could be commercialized into Microsoft's products. You can play around with the new translation system here.
2,000 sentences (Score:1)
Cause that is all it takes.
Re: (Score:2)
Not sure what you mean by this. That was their test set, not their training (nor dev) set. I don't think that's an unreasonable amount of data for a test set.
Re: (Score:2)
Re: (Score:2)
Oh. Yes, I've heard that joke, but I didn't get the connection... thanks for pointing it out.
Behind the scenes... (Score:1)
...it's just an API call to translate.google.com
Okay, but ... (Score:3)
Can it translate a Chinese Reporter's "eye-roll"? 'Cause one apparently broke China's Internet [nytimes.com]
With a fellow reporter’s fawning question to a Chinese official pushing past the 30-second mark, Liang Xiangyi, of the financial news site Yicai, began scoffing to herself. Then she turned to scrutinize the questioner in disbelief.
Looking her up and down, Ms. Liang rolled her eyes with such concentrated disgust, it seemed only natural that her entire head followed her eyes backward as she looked away in revulsion.
Captured by China’s national news broadcaster, CCTV, the moment spread quickly across Chinese social media.
...
On Chinese social media, GIFs and other online riffs inspired by Ms. Liang’s epic eye roll quickly proliferated, and by evening they were being deleted by government censors. Ms. Liang’s name became the most-censored term on Weibo, the microblogging platform. On Taobao, the freewheeling online marketplace, vendors began selling T-shirts and cellphone cases bearing her image.
Re: (Score:1)
Because that's the only way (Score:1)
and even hiring outside bilingual language consultants to further verify the machine's accuracy
Only natives or bilinguals (if they really are) can verify the translation's accuracy -- until you get your neural networks trained up to that point, of course.
Re: Because that's the only way (Score:2)
Most so-called bilinguals aren't.
Re: (Score:2)
Actually bi-cultural bi-linguals. There are differences in culture which drive the different expressions and translations. Auto-translation is of course a very important tool in global human discourse. The problem, well, the less informed, the less educated, those with far less understanding, will be readily able to communicate with each other across the language barrier, think say American Rednecks and Chinese Rednecks, screaming at each other about how their armies can destroy each other and flooding other part
Re: (Score:2)
It's pretty obvious to an English native speaker when a translation is gibberish. A native English-only speaker can't really affirm accuracy, as you stated, but could certainly tell when something is blatantly wrong. They could also at least judge the quality of the final translation's English.
Generally speaking, most translation programs do really horribly at translating idioms, or context-sensitive but otherwise ambiguous phrases. I'd think this is a perfect application for deep learning algorithms to
try the double-reversi test (Score:3)
TFS is missing the important test of accuracy: translate Chinese > English, then back to Chinese. Will any Chinese person be able to understand it? Go back and forth twice for a more serious test. If you can't get access to Microsoft's software you can easily try this test with existing software. The results can be comical if your business doesn't depend on accuracy.
Re: (Score:2)
The summary is also faulty, it provides a link and says you can test the tool there, but following the link it says no, it is not the same tool at all, it is a worse tool that is also slower.
I guess we can assume that whatever translation tool the editors are using to write the stories, it was unable to round-trip this story!
Re: (Score:2)
Years ago I tried this out of curiosity. It would typically only take a single round trip to start being funny. After several round trips, you could barely tell what the original topic was. It's like a computerized version of the game "telephone."
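For anyone who wants to try the round-trip test from this thread, here is a minimal sketch in Python. It assumes nothing about any real translation service: the two "translators" are toy word-for-word lookups, and the tokens W1 and G1 are made-up placeholders for foreign words, not real vocabulary. To run it against real software you would pass your own forward and backward functions to round_trip.

```python
from typing import Callable, Dict, List

# A translator here is just any function from text to text.
Translator = Callable[[str], str]


def make_word_translator(lexicon: Dict[str, str]) -> Translator:
    """Build a toy word-for-word translator (hypothetical stand-in for real MT)."""
    def translate(text: str) -> str:
        # Words missing from the lexicon are passed through unchanged.
        return " ".join(lexicon.get(word, word) for word in text.split())
    return translate


def round_trip(text: str, forward: Translator, backward: Translator,
               trips: int = 2) -> List[str]:
    """Translate there and back `trips` times, keeping every intermediate result."""
    versions = [text]
    for _ in range(trips):
        text = backward(forward(text))
        versions.append(text)
    return versions


# Toy lexicons: two source words collapse onto one placeholder token,
# so the way back cannot recover the original wording.
TO_FOREIGN = {"hydraulic": "W1", "water": "W1", "ram": "G1", "goat": "G1"}
TO_ENGLISH = {"W1": "water", "G1": "goat"}

history = round_trip(
    "hydraulic ram",
    make_word_translator(TO_FOREIGN),
    make_word_translator(TO_ENGLISH),
)
for trip, version in enumerate(history):
    print(trip, version)
```

With these toy lexicons the text drifts on the first trip and then stays put. Real systems can keep drifting on later trips, which is why it helps to keep the whole history rather than only the final result.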
No "N" (Score:2)
Othig to see here..........
The water goat strikes again! (Score:3)
I heard a story about an engineering company who used automatic translation to send documents back and forth with their international collaborators. At one point, their engineers were perplexed by the frequent mention of an âoewater goatâ in their correspondence.
After digging through their source documents, they learned that the water goats were in fact hydraulic rams.
Re: (Score:3)
by the frequent mention of an âoewater goatâ in their correspondence.
I'm still perplexed by the frequent mention of "âoe" and "â" on Slashdot.
Re: The water goat strikes again! (Score:2)
iPhone. Curly quotes. Sigh.
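For the curious, a rough Python sketch of how curly quotes can end up as "âoe" and "â". Two assumptions here, neither confirmed by anything in this thread: that the UTF-8 bytes of the quotes get read as Windows-1252 somewhere along the way, and that a later cleanup step (the hypothetical strip_to_latin1 below) spells out the "œ" ligature as "oe" and drops anything else outside Latin-1. What the site really does may differ; this is just one pipeline that reproduces the visible result.

```python
def misread_utf8_as_cp1252(text: str) -> str:
    """Encode as UTF-8, then wrongly decode the bytes as Windows-1252."""
    # errors="ignore" skips bytes such as 0x9D that cp1252 leaves undefined.
    return text.encode("utf-8").decode("cp1252", errors="ignore")


def strip_to_latin1(text: str) -> str:
    """Hypothetical cleanup step: expand the ligature, drop non-Latin-1 characters."""
    out = []
    for ch in text:
        if ch == "\u0153":        # the "oe" ligature
            out.append("oe")
        elif ord(ch) < 256:       # keep ASCII and Latin-1, including U+00E2
            out.append(ch)
        # everything else (e.g. the euro sign) is silently dropped
    return "".join(out)


def garble(text: str) -> str:
    """Run both steps in order."""
    return strip_to_latin1(misread_utf8_as_cp1252(text))


# \u201c and \u201d are the left and right curly double quotes.
print(garble("\u201cwater goat\u201d"))  # prints: âoewater goatâ
```

The opening quote is the bytes E2 80 9C and the closing quote is E2 80 9D. Read as Windows-1252, the first byte becomes "â" in both cases, which is why both leftovers start with the same character.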
Sure but... (Score:2)
...does it censor the letter 'N' as a real Chinese would do it?