It would be useful to translate text directly in the browser. This could simply rely on existing browser capabilities.
> translate('Hello!', 'en', 'it');
This would open new possibilities for web apps, which could control how, when and what is translated.
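To make the idea concrete, here is a minimal sketch of how such a call might behave if it returned a promise. Everything in it is an assumption: no browser exposes a global translate() function, and the `engine` object with its one-entry lookup table only stands in for whatever translation capability the browser would provide.

```javascript
// Hypothetical stand-in for the browser's built-in translation engine.
// A real implementation would be provided by the browser, not by the page.
const engine = {
  table: new Map([['en>it:Hello!', 'Ciao!']]),
  async run(text, from, to) {
    const hit = this.table.get(`${from}>${to}:${text}`);
    if (hit === undefined) {
      throw new Error(`No translation available for ${from} -> ${to}`);
    }
    return hit;
  },
};

// Sketch of the proposed function: resolves with the translated text,
// rejects when the engine cannot translate the request.
async function translate(text, from, to) {
  if (from === to) return text; // nothing to translate
  return engine.run(text, from, to);
}

// Usage: the page decides how, when and what is translated.
translate('Hello!', 'en', 'it')
  .then((result) => console.log(result))
  .catch((err) => console.error(err.message));
```

Returning a promise keeps the call non-blocking whether the engine runs locally or has to download a language model first.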
Another, very simple, API that would be useful for translations is the ability to trigger the browser translation prompt programmatically:
// on click on a flag inside the page...
This should be straightforward in Chrome, and it’s strange that it is not already available.
I know that Google’s Chrome uses machine translation for its “auto-translate” feature. But that seems to be a built-in feature of Chrome (Firefox doesn’t prompt me). I presume that’s because Google also has Google Translate, while Firefox has no related translation service.
I understand where you’re coming from, but I think it would be problematic for this to exist as a specification.
- Where does the translation engine exist? In the browser or an external service?
- If the translation engine lives in the browser, which languages? And how are their rules-engines and corpus loaded?
- If this API is just a pointer to a service, is it the end-user or developer that decides the service to use?
I also struggle with the use case here. If an organization needs copy translated for their website, why would they risk on-the-fly machine translation where they have no oversight into what’s translated? It seems risky to entrust the browser with translating their content. This kind of API would make translation a developer’s job rather than a real human translator’s.
- The API is exposed by the browser. Then the browser is free to decide how to implement that (and would not be part of the spec).
- They can use any external translation service if they want.
- No, it’s not the developer that decides the service to use, otherwise it would be useless to define an API. The API is meant to access the browser capabilities (when present) and provide a better UX.
risk on-the-fly machine translation
It would be especially useful for dynamic data, like chats. Or think about the translation of user generated content (like descriptions in Instagram posts).
I’m thinking of how the speech recognition API works in Chrome. What you’re proposing could be very similar to that.
I could see an API that’s built very similarly:
- translate start/end events
- result event
- SpeechGrammar object that includes src and possibly even weight that can help us assess accuracy of the translation
I’d also want this to be async.
Additionally, developers and possibly end-users should be able to choose translation services.
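A rough sketch of what that event-based shape could look like, loosely following the start/result/end pattern of SpeechRecognition. The `Translation` class, the `TranslationResultEvent` type and the `stubService` function are all hypothetical names made up for illustration; the service is passed in so that a developer (or a user setting) could swap it.

```javascript
// Hypothetical event carrying the translated text and a confidence weight.
class TranslationResultEvent extends Event {
  constructor(text, weight) {
    super('result');
    this.text = text;
    this.weight = weight; // assumed range 0..1
  }
}

// Hypothetical Translation object, loosely modelled on SpeechRecognition.
class Translation extends EventTarget {
  constructor(service) {
    super();
    this.service = service; // async (text, from, to) => ({ text, weight })
  }

  // Asynchronous: fires 'start', then 'result' or 'error', then 'end'.
  async start(text, from, to) {
    this.dispatchEvent(new Event('start'));
    try {
      const { text: translated, weight } = await this.service(text, from, to);
      this.dispatchEvent(new TranslationResultEvent(translated, weight));
    } catch (err) {
      const event = new Event('error');
      event.error = err;
      this.dispatchEvent(event);
    } finally {
      this.dispatchEvent(new Event('end'));
    }
  }
}

// Stub service standing in for a real translation backend.
const stubService = async (text, from, to) => ({
  text: `[${from}->${to}] ${text}`,
  weight: 0.5,
});

const translation = new Translation(stubService);
translation.addEventListener('result', (e) => console.log(e.text, e.weight));
translation.start('Hello!', 'en', 'it');
```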
Chrome and Edge already have a translation feature and iOS Safari is going to have automatic translations (starting from iOS 14).
It would be great to have more control over this.
Basic support: trigger the translation prompt programmatically, for example when the user clicks on a different flag or language select inside the page. This would improve the UI and would be very simple to implement.
translate('Hello!', 'en', 'it');
I’m still iffy on automatic translations. If I were an enterprise-level company, I wouldn’t want my content being shipped off to some automatic translation service that may not get it right.
That aside, I can see value in having a robust API that both the browser and I can use to control translations.
I would want a Localization Object Model. Just like how there’s one for CSS (CSSOM) and one for the document (DOM), I’d want one whose whole job is localization (LOM):
Localization has properties that interface with the document which indicate:
- language: String. The source language of the document
- textDirection: String. rtl | ltr
- hasLocalizations: Boolean. Indicates if there is at least one localization for the document
- localizations: Map. The key is the name of the localization language, the value is a collection of LocalizationNodes and content
- localization: String. The language name of the current translation
It has an array-like collection of text nodes or content nodes (some interface that indicates this is a localizable bit of content). I’d call this property LocalizationNodes, which indicate:
- language: String. Inherited from the document, unless it’s set as an attr on the element
- textDirection: String. Inherited from the document, unless it’s set as an attr on the element
- content: String? The innerText of the element
- localizations: Map, where the key is a string that’s the target language and the value is the translated text
- hasLocalization: Boolean. Indicates if there’s at least one localization
- localization: String. If a localization is displaying, this is that “target language”
LocalizationNodes have methods for translations, which are what you describe:
translate(targetLang, sourceLang, content), which sends content and returns a promise that resolves to a Localization. When it resolves, it sets the element’s hasLocalization to true and adds a key-value pair to the element’s localizations map. These are also attributes on the element:
<button hasLocalization>Click here!</button>
targetLang is the target language. Required.
sourceLang is the source language. Optional. Can be inferred from document or element attribute
content is the text to translate. Optional. Can be inferred from the innerText of the element.
showLocalization(localization), which flips the text inside a node from the original content to the localized content. This puts a localization attribute on the element, and sets the value of that attribute to the value set for this parameter, e.g.
<button localization="US-es">¡Haz click! </button>
showSource(), which flips back to the original content. It removes the localization attribute from the element.
localization would match the targetLang sent into the translate method.
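The LocalizationNode described above could be sketched roughly like this. None of it is an existing API. Two simplifications are mine: translate() resolves to the translated string rather than a Localization, and the translation `service` is passed in as a plain async function. A small `stubElement` helper (also hypothetical) lets the sketch run outside a browser; a real DOM element offers the same four members.

```javascript
// Hypothetical LocalizationNode wrapping one element. `element` only needs
// innerText, getAttribute, setAttribute and removeAttribute.
class LocalizationNode {
  constructor(element, service, documentLang) {
    this.element = element;
    this.service = service; // async (content, sourceLang, targetLang) => string
    this.language = element.getAttribute('lang') ?? documentLang;
    this.content = element.innerText; // original (source) text
    this.localizations = new Map(); // targetLang -> translated text
  }

  get hasLocalization() {
    return this.localizations.size > 0;
  }

  // Target language currently displayed, or null while the source is shown.
  get localization() {
    return this.element.getAttribute('localization');
  }

  // Fetches and stores a translation without displaying it.
  async translate(targetLang, sourceLang = this.language, content = this.content) {
    const translated = await this.service(content, sourceLang, targetLang);
    this.localizations.set(targetLang, translated);
    this.element.setAttribute('hasLocalization', '');
    return translated;
  }

  // Swaps the visible text for a previously stored translation.
  showLocalization(localization) {
    if (!this.localizations.has(localization)) {
      throw new Error(`No localization stored for "${localization}"`);
    }
    this.element.innerText = this.localizations.get(localization);
    this.element.setAttribute('localization', localization);
  }

  // Restores the original content.
  showSource() {
    this.element.innerText = this.content;
    this.element.removeAttribute('localization');
  }
}

// Stub element so the sketch also runs outside a browser.
function stubElement(innerText, attrs = {}) {
  const map = new Map(Object.entries(attrs));
  return {
    innerText,
    getAttribute: (name) => map.get(name) ?? null,
    setAttribute: (name, value) => void map.set(name, String(value)),
    removeAttribute: (name) => void map.delete(name),
  };
}

// Usage: translate first, display later, flip back at any time.
const button = stubElement('Click here!');
const node = new LocalizationNode(button, async () => '¡Haz click!', 'us-en');
node.translate('US-es').then(() => {
  node.showLocalization('US-es'); // button text is now the translation
  node.showSource(); // button text is the original again
});
```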
The Localization would also have methods that mirror the methods on each LocalizationNode:
translate(targetLang, sourceLang, content). Returns a promise. When it resolves, it sets hasLocalization to true on the Localization root node, so in the HTML we’d see
<html lang='us-en' hasLocalization>
targetLang is the target language. Required.
sourceLang is the source language. Optional. Can be inferred from lang= on the document.
content is the text to translate. Optional. If not provided, then it’s an array of LocalizationNode.content. If it is provided, then … IDK… I guess it overwrites it? Not sure about this here.
showLocalization(localization). This is simple. It triggers a showLocalization() on every LocalizationNode. The HTML would look like this when triggered:
<html lang="us-en" localization="us-es">
setService(url, callback). Not sure about this, but it’d be some sort of ability to identify if there’s a preferred API, and I guess it’d need an optional callback if the service didn’t just do a 1:1 text translation and you needed to filter something out. But all LocalizationNodes would inherit this.
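As a sketch of how the root object might fan out to its nodes: the `Localization` class below is hypothetical, and so is `makeNode`, a minimal stand-in for a LocalizationNode so the example runs on its own. One deliberate substitution: setService takes an async function instead of a URL, which avoids any network call here; the optional callback post-processes whatever the service returns.

```javascript
// Hypothetical root Localization that forwards calls to its nodes. Each node
// is assumed to expose service, translate(), showLocalization(), showSource().
class Localization {
  constructor(language, nodes) {
    this.language = language; // source language, e.g. from lang= on <html>
    this.nodes = nodes;
    this.localization = null; // target language currently displayed
    this.hasLocalizations = false;
  }

  // Every node inherits the preferred service. The optional callback
  // filters the raw service output (for example to strip markup).
  setService(service, callback = (text) => text) {
    const wrapped = async (...args) => callback(await service(...args));
    for (const node of this.nodes) node.service = wrapped;
  }

  // Translates all nodes concurrently; displaying is a separate step.
  async translate(targetLang, sourceLang = this.language) {
    await Promise.all(this.nodes.map((n) => n.translate(targetLang, sourceLang)));
    this.hasLocalizations = true;
  }

  showLocalization(localization) {
    for (const node of this.nodes) node.showLocalization(localization);
    this.localization = localization;
  }

  showSource() {
    for (const node of this.nodes) node.showSource();
    this.localization = null;
  }
}

// Minimal stand-in node so the sketch is self-contained.
function makeNode(content) {
  const stored = new Map(); // targetLang -> translated text
  return {
    service: null,
    content,
    shown: content, // what the user currently sees
    async translate(targetLang, sourceLang) {
      stored.set(targetLang, await this.service(this.content, sourceLang, targetLang));
    },
    showLocalization(lang) { this.shown = stored.get(lang); },
    showSource() { this.shown = this.content; },
  };
}

// Usage: the stub service wraps its output in <b> tags, the callback strips them.
const page = new Localization('us-en', [makeNode('Click here!'), makeNode('Menu')]);
page.setService(
  async (text, from, to) => `<b>${to}: ${text}</b>`,
  (raw) => raw.replace(/<\/?b>/g, '')
);
page.translate('us-es').then(() => page.showLocalization('us-es'));
```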
This is all super speculative and theoretical. But my TL;DR is
- We need a Localization Object Model
- The Localization Object Model should be a wrapper for easily grabbing/converting content in the browser
- We should be able to tell the end user:
  - we have an available translation
  - you have the option to see that translation
  - you can switch back to the source easily
- We need separate methods for translating and displaying the translation, so that these things can happen asynchronously
Any updates on this?
Now that all major browsers support automatic translations, it would be awesome to have a standard API.
It is strange that everything else is exposed by browsers as an API, while translations happen automatically, without control, displaying banners and changing the content of websites… Giving control to developers would allow more sophisticated user experiences.
E.g. translate a message in a chat, translate the description of a specific social media post, give users the option to choose the language for a restaurant menu, etc.
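For the chat case, a page-level helper might look something like the sketch below. It is only an illustration: `createMessageTranslator` and the message shape ({ id, lang, text }) are made up, and `translateFn` stands in for whatever async translate(text, from, to) a browser might eventually expose.

```javascript
// Hypothetical helper, not an existing API. Translates one chat message on
// demand and caches the result, so toggling between original and
// translation never triggers a second request.
function createMessageTranslator(translateFn) {
  const cache = new Map(); // "messageId:targetLang" -> Promise<string>

  return function translateMessage(message, targetLang) {
    // Messages already in the reader's language need no translation.
    if (message.lang === targetLang) return Promise.resolve(message.text);

    const key = `${message.id}:${targetLang}`;
    if (!cache.has(key)) {
      const pending = translateFn(message.text, message.lang, targetLang);
      // Forget failed attempts so the user can retry later.
      pending.catch(() => cache.delete(key));
      cache.set(key, pending);
    }
    return cache.get(key);
  };
}

// Usage with a stub in place of a real translation backend.
const translateMessage = createMessageTranslator(
  async (text, from, to) => `[${from}->${to}] ${text}`
);
const message = { id: 1, lang: 'it', text: 'Ciao a tutti!' };
translateMessage(message, 'en').then((text) => console.log(text));
```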