Google Teases New AR Translation Glasses
Has Google finally cracked the machine translation conundrum?
Following its I/O presentation this week, Google unveiled a surprise: a brief video of a pair of augmented reality glasses that perform real-time translation and display the translated text in the wearer's field of view. The prototype transcribes spoken language and renders the translation in front of the wearer's eyes.
Max Spear, a Google product manager, called the functionality "subtitles for the world," and the clip shows family members communicating with one another, presumably for the first time.
— The Verge (@verge) May 11, 2022
The Google AR translation glasses don't have a name yet. In form factor, they look just like conventional prescription glasses, a design previously realized by North, the Canadian smart glasses manufacturer behind the Focals data glasses. Google acquired North in 2020 but scrapped the company's planned Focals 2 data glasses.
It is possible that what we are seeing is North's technology repurposed as translation glasses. Google didn't say when the glasses might launch as a product; the company refers to them as a "prototype" in the video. It's also unclear whether they will support functions beyond translation, which was the demo's sole focus. The North Focals offered several functions, including navigation, email, and a number of apps.
North's first data glasses received generally positive reviews but suffered from distribution bottlenecks. The displays also had to be customized for each buyer, who had to visit a North store for a 3D scan. The company didn't sell enough units, ran into financial trouble, and was ultimately acquired by Google.
If Google's translation glasses are an offshoot of the Focals, their biggest innovation may well be the software they run. Google has done considerable research on real-time, AI-supported transcription over the years; it powers the live captions on YouTube and on newer Pixel smartphones, and real-time AI translation has long been available on the Pixel line. Those software advances have now been brought to bear in Google's real-time AR translation glasses.
Google CEO Sundar Pichai expressed a strong commitment to AR glasses following the presentation of the translation glasses, describing AR as “the next frontier of computing” whose full potential will only be attained in the glasses format, a vision that is also shared by Facebook boss Mark Zuckerberg.
The majority of the browsing public is already familiar with Google Translate. While it offers a degree of accuracy, particularly between related languages such as French, German, Spanish, or Dutch and English, it struggles with others. Last year, I tried translating Amharic text into English, and the results were often completely off, frequently unintelligible, and in some cases conveyed the opposite of what was actually meant. So can we trust a Google translation tool to render conversations accurately in real time and break down global language barriers?
Google began marketing real-time translation as far back as 2017, when it was one of the headline features of the original Pixel Buds. That feature was a flop.
Google's latest translation showcase with its AR glasses appears fluid and flawless. The company has said that the translation will be processed on the glasses themselves.
The AR translation glasses also appear more focused than the company's failed Google Glass. They are simply meant to display translated text; they don't pretend to offer an ambient computing experience or a mixed reality experience capable of replacing smartphones or even laptops, as Meta aspires to with its Project Cambria headset.
AR glasses still face plenty of technical challenges before they reach maturity. Even modest ambient lighting makes content on see-through displays extremely hard to make out. If you think of how tough it is to read subtitles on a TV screen when there is some glare, imagine how much harder it is with see-through AR glasses, where you must also hold a conversation with someone speaking a different language while their words are translated and read in real time.
However, as The Verge reports, even if Google has surmounted the display challenges that have plagued other AR manufacturers, conversing through a translation app is hard at the best of times. Both parties have to speak slowly, deliberately, and clearly for the tool to capture every word; otherwise, the translation is likely to be sketchy. Even then, translations can still come out wrong or awkward and be misconstrued, so speakers have to use much simpler sentences to get reliable machine translation.
In fact, according to tweets by Sam Ettinger and Rami Ismail, Google's Translate presentation rendered several scripts incorrectly, and the errors had to be corrected for the YouTube version of the keynote. The mistakes, though not pervasive, show that Google's real-time translation glasses are still far from perfect.
The translation glasses aim to solve a complex problem that many AI algorithms have yet to crack even in text translation. Translating words is easy; getting the grammar right is not. Machine translation also has to reckon with complexity that extends beyond words and grammar, with nuances that even the finest systems may not capture. A speaker may borrow words from another language or a regional dialect mid-sentence, and while a human can easily parse such irregular speech, a machine like Google's translation glasses may not. Conversations are also full of irregularities such as innuendo, unclear references, and incomplete thoughts that machines do not translate easily.
While machine translations are used widely and are improving considerably, they “do not speak human yet” as The Verge puts it.