The entertainment space is going global. Thanks to rapid technological shifts, content is now available on every type of screen in every region around the world. But while content now has universal reach, there is no “universal” audience. Each region watches content differently – from what they love to watch to the screen they love to watch it on.
As content is distributed more globally, localizing that content (translating and adapting it to local preferences) has never been more important. More translations are needed, faster and at higher quality, all while maintaining the director’s original intent. For content owners, this can be the differentiator that determines whether audiences watch their shows or subscribe to their platforms.
The goal of localization is to ensure that when non-English-speaking audiences watch video content, they can still relate to the story, laugh at all the jokes or cry after a touching moment. It is a complex task filled with nuance, but one that new technology has made increasingly efficient.
AI and machine translation (AI/MT) have made inroads into the entertainment space in recent years with the power to automate translation and transform workflows. However, while MT can improve translations, the nuance and complexity of localization create challenges that still require a human touch to achieve the quality that consumers expect.
The (Not Quite) Rise Of The Machines
Since its introduction in the 1950s, machine translation has benefitted industries and individuals alike, whether it helps translate instruction manuals for mass production or helps a tourist ask for directions in Japan. Today, MT systems learn through connected neural networks, constantly training and adapting to improve the quality of outputs. MT quality is improving at incredible rates, but is not advanced enough to fully automate translation of creative content.
When it comes down to it, what MT brings to today’s translation workflows is speed. A script can be translated by computers instantly, shaving much-needed time off the localization process as an aid to human translators. The translator starts with a machine-translated script, edits it to correct mistranslations, adapts it to local preferences and fixes the timing and line breaks. After the process is complete, the content can then be fed back to a neural network to help improve the quality of future translation outputs.
What’s Lost In Machine Translation?
Currently, there are no complete MT solutions that can interpret the nuances and subtext in creative content the way that humans can. There’s still a need for this technology to work in tandem with humans for translating creative content, a reality that can sometimes be overlooked in the excitement surrounding AI’s transformative effect on the industry.
As far along as we are in MT technology, languages and cultures remain intrinsically layered. For example, a character might make a seemingly inflammatory statement in jest. Or they might say something pleasant and complimentary, but with a malicious undertone. To understand subtext in dialogue, we interpret not only words but also tone of voice and facial expressions, and then make educated assumptions based on what we know about the plot, character traits and the relationship between characters. Accurately localized content requires that a translator, or a translation system, understand that subtext and those assumptions. Today’s commercial MT systems cannot discern these emotions or traits, especially when translating directly from text.
Recently, the industry has struggled with the quality of Korean translations because the grammar is heavily shaped by Korea’s comprehensive system of honorifics. A Korean translation can differ if the speaker is one day older or one day younger than the listener. Every language and culture has nuances like this that even the best MT technology is currently unable to accurately localize.
What may be common vernacular and culturally acceptable in one territory may not apply in another. Even one unnaturally segmented subtitle break can pull a viewer out of their suspension of disbelief. That’s part of the reason the localization of creative content is such a time-intensive process – translations need to go through multiple revisions to reach the most successful result.
A Way Back To The Future
Audiences want to emotionally connect to the reality of what they see. Filmmakers work tirelessly to create a world that audiences can connect to. The fastest way to lose that connection is through inaccurate localization. MT isn’t there yet, but its role in the process is growing exponentially.
While some experts estimate that MT is at least a couple of decades away from reaching parity with human translators, the next few years will see some rapid changes thanks to the technology and efforts of engineers and subject matter experts. To improve the process, different neural networks will feed into each other, learning object recognition, facial recognition, emotional intelligence, colloquialisms, and more to improve the accuracy of translation. MT is only as accurate as the data fed into it, and luckily the entertainment industry has decades of successful, human-made translations to train from.
As the technology improves, so will the industry’s understanding of localization, ensuring audiences around the world can press play on an experience that brings everyone together, rather than highlighting small nuances that divide us.
Greg Taieb is the Vice President of Product Development and Innovation at Deluxe Entertainment (www.bydeluxe.com/en/).