How Facebook Mis-Captioned the Launch of a NASA Supply Rocket (arstechnica.com)
An anonymous reader quotes Ars Technica:
An Antares rocket built by Northrop Grumman launched on Wednesday afternoon, boosting a Cygnus spacecraft with 3.4 tons of cargo toward the International Space Station. The launch from Wallops Island, Virginia, went flawlessly, and the spacecraft arrived at the station on Friday. However, when NASA's International Space Station program posted the launch video to its Facebook page on Thursday, there was a problem. Apparently the agency's caption service hadn't gotten to this video clip yet, so viewers with captions enabled were treated not just to the glory of a rocket launch, but the glory of Facebook's automatically generated crazywords...
Some of the captions are just hilariously bad. For example, when the announcer triumphantly declares, "And we have liftoff of the Antares NG-11 mission to the ISS," the automatically generated caption service helpfully says, "And we have liftoff of the guitarist G 11 mission to the ice sets."
There are more examples in the photos at the top of their article -- for example, a caption stating that the uncrewed launch "had a phenomenal displaced people at 60 seconds," and translating the phrase "TVC is nominal" to "phenomenal."
While the lift-off announcer does use what may be unfamiliar names for the rockets, along with other technical jargon, the article points out that YouTube's auto-captioning of the same launch "seemed to have no problem with those bits of space argot."
YouTube video (Score:1)
The captions in the YouTube video also have many small errors ("interstate" instead of "interstage," for example).
Admittedly, the sound quality is somewhat low, there is little context (many short sentence fragments), and there's a lot of very technical jargon and abbreviations. Try giving the audio track to a human who is not familiar with rocket jargon, and I bet you get plenty of bad translations as well, especially if you don't give them a chance to listen to the whole recording a few times, but force them
Transcription and training sets (Score:5, Insightful)
Admittedly, the sound quality is somewhat low, there is little context (many short sentence fragments), and there's a lot of very technical jargon and abbreviations
Technical jargon really shouldn't be a big problem for an appropriate translation system. My wife is a pathologist and they use transcription services (both human and automated) all the time to reasonably good effect. Still requires some amount of proofreading by the author in most cases as a precaution (it is a medical/legal record after all) but it's pretty good. YouTube has a HUGE training set of videos about almost literally every aspect of rocket science you could ask for. Someone just has to do the actual work to train their system properly.
Try giving the audio track to a human who is not familiar with rocket jargon
Why would you do that? It's not like rocket launches are some new thing. YouTube and others have had plenty of time to work on the problem and they have plenty of data to utilize to train their system appropriately.
I bet you get plenty of bad translations as well, especially if you don't give them a chance to listen to the whole recording a few times, but force them to transcribe in real time.
Of course you would, but what would that prove? And besides, humans do transcribe in real time in courtrooms [wikipedia.org] with fairly good fidelity on a daily basis. No transcription service is perfect, but I think a lot of the errors in systems like the one described here have a lot more to do with the poor coding and/or training of the system being used than anything else. It's not like the system doesn't have any clues to the context of the video, and it has a huge training set of all the technical jargon they could hope for. They seemingly just can't be bothered to actually work on the problem.
Re: (Score:2)
Why would you do that? It's not like rocket launches are some new thing. YouTube and others have had plenty of time to work on the problem and they have plenty of data to utilize to train their system appropriately.
You underestimate the problem. It's not just about rocket launches. It's a general problem with any domain specific video. Also, the context can abruptly change from one scene to another.
YouTube has a HUGE training set of videos about almost literally every aspect of rocket science you could ask for
In order for a machine to learn the correct jargon, you need more than just video of rocket launches. You need someone to go through, and manually transcribe all the jargon correctly. In addition, the machine would have to learn how to extract context from various clues, including visual ones, which is a completely differe
Transcription is hard but not impossible (Score:2)
You underestimate the problem.
Not at all. I'm well aware of the substantial challenges involved. I'm also well aware of how well it can be done given the current state of the art. This just isn't a good example of doing it well.
It's not just about rocket launches. It's a general problem with any domain specific video.
In a lot of ways, domain specific translation can be easier in some use cases because the language tends to be quite specific and there is often less context sensitive syntax to worry about. My wife sometimes uses a dictation system for pathology where the transcription is automated. It's not perfect (neithe
Re: (Score:2)
and the title of the video to help.
What if the title is "Funny: compilation of poorly transcribed videos."
Re: (Score:3)
IME they do a decent job at technical translation because I can tell what word the person probably would have said.
What I really enjoy are the automatic translations of Korean shows. Koreans are very poetic communicators, so even if the computer figures out what they literally said, it has no hope to translate it. But the computer won't do even that well. And it won't notice that every sentence has one English word and one Japanese word, so those turn into random wildcards.
Re: (Score:2)
This should be upvoted. Many of the mistakes are ones that a human captioner might make. I can't believe this is what we are calling hilariously bad, considering what the state of autocaptioning was just four years ago. You can pretty much puzzle out what is being said, and I don't think it comes close to "making a mockery of a serious scientific endeavor" as the writer implies.
Facebook is what the mob would be... (Score:3, Insightful)
That would mean nobody gets to watch it (Score:3)
Correct me if I'm wrong, but I was under the impression that the cost of accurate captions from minute one was prohibitive for many. Requiring accurate captions from minute one as a condition of publishing live video of an event is a good way to ensure that nobody gets to watch the event. Would you call the University of California, Berkeley taking down 20,000 college lectures in 2017 [slashdot.org] a net gain?
Re: (Score:2)
Berkeley took the videos down because doing so was a simple way to destroy the lawsuit against them, and they had people willing to archive them. There was little advantage to having them all on YouTube, a commercial platform.
Re: (Score:2)
There are rules that require accuracy in captions. Failure to meet these requirements is a serious violation of Federal Law.
Nope.
If you don't know what the rule is, it was probably invented by the waving hands of your AM radio DJ.
13 years of progress (Score:2)
Re: (Score:2)
Give Zuck SOME props (Score:1)
Re: (Score:2)
Admit it, you like stories about zuck ups. I can sense it.
Recruiter calls on Google Voice (Score:2)
This sounds a bit like what I experience during my job search when off-shore, heavily accented recruiters leave messages on my Google Voice number.
GV attempts to translate and transcribe the calls before sending them to my email and texting me. It makes for some rather hilarious moments, to say the least, during a stressful job search.
Conspiracy Theories... (Score:2)
...about the Facebook captioning being accurate and NASA sending a phenomenal guitarist (probably Brian May, he's certainly qualified in multiple ways) into space coming in 3...2...
Captions brought to you ... (Score:2)
OCR training (Score:2)