Translation API

It would be useful to translate text directly in the browser. This could simply rely on existing browser capabilities.


> translate('Hello!', 'en', 'it');

This would open new possibilities for web apps, which could control how, when, and what is translated.


Another, very simple, API that would be useful for translations is the ability to trigger the browser translation prompt programmatically:


// on click on a flag inside the page...

This should be straightforward in Chrome, and it’s strange that it is not already available.

I know that Google’s Chrome uses machine translation for its “auto-translate” feature. But that seems to be a built-in feature of Chrome (Firefox doesn’t prompt me). I presume that’s because Google also has Google Translate, while Firefox has no related translation service.

I understand where you’re coming from, but I think it would be problematic for this to exist as a specification.

  1. Where does the translation engine exist? In the browser or an external service?
  2. If the translation engine lives in the browser, which languages? And how are their rules-engines and corpus loaded?
  3. If this API is just a pointer to a service, is it the end-user or developer that decides the service to use?

I also struggle with the use case here. If an organization needs copy translated for their website, why would they risk on-the-fly machine translation where they have no oversight into what’s translated? It seems risky to entrust the browser with translating their content. This kind of API would make translation a developer’s job rather than a real human translator’s.

  1. The API is exposed by the browser. The browser is then free to decide how to implement it (the implementation would not be part of the spec).
  2. They can use any external translation service if they want.
  3. No, it’s not the developer that decides the service to use, otherwise it would be useless to define an API. The API is meant to access the browser capabilities (when present) and provide a better UX.
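A minimal sketch of the "when present" idea, assuming a hypothetical global `translate(text, from, to)` that returns a promise. No browser ships this; the name and signature are taken from the proposal in this thread, and the wrapper `translateIfAvailable` is invented for illustration. It uses the browser capability when it exists and otherwise keeps the original text.

```javascript
// Hypothetical: `globalThis.translate` is the API proposed in this thread,
// not something any browser exposes today.
async function translateIfAvailable(text, from, to) {
  if (typeof globalThis.translate !== 'function') {
    // No browser capability: keep the original text and say so.
    return { text, translated: false };
  }
  try {
    const result = await globalThis.translate(text, from, to);
    return { text: result, translated: true };
  } catch (err) {
    // The browser may refuse (unsupported language pair, user opt-out, ...).
    return { text, translated: false };
  }
}
```

The page can then decide what to do with `translated === false`, for example by hiding its own "translate" button.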

“risk on-the-fly machine translation”

It would be especially useful for dynamic data, like chats. Or think about the translation of user generated content (like descriptions in Instagram posts).

I’m thinking of how the speech recognition API works in Chrome. What you’re proposing could be very similar to that.

I could see an API that’s built very similarly:

  • translate start/end events
  • result event
  • an object similar to SpeechGrammar, with src and possibly even weight, that can help us assess the accuracy of the translation

I’d also want this to be async.

Additionally, developers and possibly end-users should be able to choose translation services.
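A rough sketch of that event flow, loosely modelled on how SpeechRecognition reports start, result, and end. Everything here is an assumption for illustration: the `Translator` class, the `TranslationResultEvent` class, and the idea that the service is just an async function `(text, from, to)` that the developer or the user agent can swap.

```javascript
// Hypothetical event carrying the translated text.
class TranslationResultEvent extends Event {
  constructor(text) {
    super('result');
    this.text = text;
  }
}

// Hypothetical Translator: fires 'start', then 'result', then 'end'.
// `service` is any async function (text, from, to) => translated string.
class Translator extends EventTarget {
  constructor(service) {
    super();
    this.service = service;
  }

  async translate(text, from, to) {
    this.dispatchEvent(new Event('start'));
    try {
      const translated = await this.service(text, from, to);
      this.dispatchEvent(new TranslationResultEvent(translated));
      return translated;
    } finally {
      // 'end' fires whether the service succeeded or failed.
      this.dispatchEvent(new Event('end'));
    }
  }
}

// Stand-in service for illustration only; a real one would call an engine.
const demoService = async (text) => (text === 'Hello!' ? 'Ciao!' : text);
const translator = new Translator(demoService);
translator.addEventListener('result', (e) => console.log(e.text));
translator.translate('Hello!', 'en', 'it');
```

Passing the service into the constructor is one way to leave the choice of engine open; a real design would also need an error event.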

Chrome and Edge already have a translation feature and iOS Safari is going to have automatic translations (starting from iOS 14).

It would be great to have more control over this.

Basic support: trigger the translation prompt programmatically, for example when the user clicks on a different flag or language select inside the page. This would improve the UI and would be very simple to implement.


Advanced support: use JavaScript to translate some specific text (or the text inside specific elements) from one language to another.

translate('Hello!', 'en', 'it');
// or 
DOMElement.translate('en', 'it');
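One thing to keep in mind for the element variant: `HTMLElement.translate` already exists as a boolean property reflecting the `translate` attribute (`translate="no"`), so a method with that name would clash. A small sketch using a free function instead; `translateElement` and `translateFn` are hypothetical names, and `translateFn` is assumed to be an async `(text, from, to)` function.

```javascript
// Hypothetical helper: translate the text of one element (or element-like object).
// `translateFn` is an assumed async (text, from, to) => string function.
async function translateElement(element, from, to, translateFn) {
  // Respect the existing opt-out: translate="no" is exposed as translate === false.
  if (element.translate === false) return false;
  const original = element.textContent;
  // Note: assigning textContent replaces child elements, so this only
  // suits leaf elements that contain plain text.
  element.textContent = await translateFn(original, from, to);
  return true;
}
```

It returns `true` when the text was replaced and `false` when the element opted out.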

I’m still iffy on automatic translations. If I were an enterprise-level company, I wouldn’t want my content being shipped off to some automatic translation service that may not get it right.

That aside, I can see value in having a robust API that both the browser and I can use to control translations.

I would want a Localization Object Model. Just like how there’s one for CSS (CSSOM) and the document (DOM), I’d want one whose whole job is localization (LOM): Localization

  1. Localization has properties that interface with the document which indicate:

    • language: String. The source language of the document.
    • textDirection: String. rtl | ltr.
    • hasLocalizations: Boolean. Indicates if there is at least one localization for the document.
    • localizations: Map. Key is the name of the localization language, value is a collection of LocalizationNodes and content.
    • localization: String. The language name of the current translation.
  2. It has an array-like collection of text nodes or content nodes (some interface that indicates this is a localizable bit of content). I’d call this property LocalizationNodeList

  3. The LocalizationNodeList has LocalizationNodes that indicate:

    • language: String. Inherited from the document, unless it’s set as an attr on the element.
    • textDirection: String. Inherited from the document, unless it’s set as an attr on the element.
    • content: String? The innerText of the element.
    • localizations: Map. The key is a string that’s the target language, and the value is the translated text.
    • hasLocalization: Boolean. Indicates if there’s at least one localization.
    • localization: String. If a localization is displaying, this is that “target language”.
  4. The LocalizationNode has methods for translations, which are what you describe:

    • translate(targetLang, sourceLang, content) which sends content and returns a promise which resolves to a Localization. When it resolves, it sets the element’s hasLocalization to true and it adds a key-value pair to the element’s localizations map. These are also attributes on the element: <button hasLocalization>Click here!</button>

      • targetLang is the target language. Required.
      • sourceLang is the source language. Optional. Can be inferred from document or element attribute
      • content is the text to translate. Optional. Can be inferred from content.
    • showLocalization(localization) which flips the text inside a node from the original content to the localized content. This puts a localization attribute on the element, and sets the value of that attribute to the value set for this parameter, e.g. <button localization="es-US">¡Haz click!</button>

    • showSource() which flips back to the original content. It removes the localization attribute from the element: <button hasLocalization>

      • localization would match the targetLang sent into the translate method.
  5. The Localization would also have methods that mirror the methods on each LocalizationNode:

    • translate(targetLang, sourceLang, content) Returns a promise. When it resolves, it sets hasLocalization to true on the Localization root node. So, in the HTML we’d see <html lang='en-US' hasLocalization>

      • targetLang is the target language. Required.
      • sourceLang is the source language. Optional. Can be inferred from lang= on the document.
      • content is the text to translate. Optional. If not provided, then it’s an array of LocalizationNode.content. If it is provided, then… IDK… I guess it overwrites it? Not sure about this here.
    • showLocalization(localization) This is simple. It triggers a showLocalization() on every LocalizationNode. The HTML would look like this when triggered: <html lang="en-US" localization="es-US">

    • setService(url, callback) Not sure about this, but it’d be some sort of ability to identify if there’s a preferred API, and I guess it’d need an optional callback if the service didn’t just do a 1:1 text translation and you needed to filter something out. But all localizationNode would inherit this.
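To make the node part of this more concrete, here is a rough sketch of a LocalizationNode under several assumptions: it holds plain data instead of reflecting attributes onto a real element, `translate()` resolves to the translated string rather than a Localization, `displayedText` is an invented helper, and `service` is an assumed async `(content, sourceLang, targetLang)` function standing in for whatever setService would configure.

```javascript
// Hypothetical sketch of the LocalizationNode described above.
// `service` is an assumed async (content, sourceLang, targetLang) => string.
class LocalizationNode {
  constructor(content, language, service) {
    this.content = content;          // source text
    this.language = language;        // source language
    this.localizations = new Map();  // targetLang -> translated text
    this.localization = null;        // language currently displayed, if any
    this.service = service;
  }

  get hasLocalization() {
    return this.localizations.size > 0;
  }

  // Text that should currently be shown for this node (invented helper).
  get displayedText() {
    return this.localization === null
      ? this.content
      : this.localizations.get(this.localization);
  }

  // Fetch and store a translation; does not change what is displayed.
  async translate(targetLang, sourceLang = this.language, content = this.content) {
    const translated = await this.service(content, sourceLang, targetLang);
    this.localizations.set(targetLang, translated);
    return translated;
  }

  // Switch the displayed text to a stored localization.
  showLocalization(localization) {
    if (!this.localizations.has(localization)) {
      throw new Error(`No localization stored for "${localization}"`);
    }
    this.localization = localization;
  }

  // Switch back to the original content.
  showSource() {
    this.localization = null;
  }
}
```

Keeping `translate()` and `showLocalization()` separate means a page can fetch translations in the background and only swap the text when the user asks for it.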

This is all super speculative and theoretical. But my TL;DR is

  1. We need a Localization Object Model

  2. The Localization Object Model should be a wrapper for easily grabbing/converting content in the browser

  3. We should be able to tell the end user:

    • we have an available translation
    • you have the option to see that translation
    • you can switch back to the source easily
  4. We need separate methods for translating and displaying the translation, so that these things can happen asynchronously

Any updates on this?

Now that all major browsers support automatic translations, it would be awesome to have a standard API.

It is strange that browsers expose almost everything else as an API, yet translations happen automatically, outside the page’s control, displaying banners and changing the content of websites… Giving control to developers would allow more sophisticated user experiences.

E.g. translate a message in a chat, translate the description of a specific social media post, give users the option to choose the language for a restaurant menu, etc.