A partial archive of discourse.wicg.io as of Saturday February 24, 2024.

Multilingual text rendering for web maps

AmeliaBR
2020-09-28

At the Maps for the Web workshop, Brandon Liu introduced some of the issues specific to CJK text rendering in map labels, and how these can be a problem for multilingual maps.

Although Brandon focused specifically on CJK rendering, I know that multilingual text issues don’t stop there, so this discussion can probably be wider than that — what are the problems with web maps when it comes to rendering text in different languages?

AmeliaBR
2020-09-28

Here are the comments from workshop participants who watched the session live:

  • Bryan Haberberger @thehabes: Oh man I hate when my map shows random windings because someone didn’t get UTF-8 (or 16) right.

  • Fred Esch @fredesch: lang attribute?

  • Amelia Bellamy-Royds @AmeliaBR: Fred, my thought too — that’s definitely the HTML way to do that, but I guess the issue is that it’s not always included in the geospatial data formats.

    And there we go…

  • Bryan Haberberger @thehabes: We see that same need in JSON data

  • Fred Esch @fredesch: Most maps use a single language on a map

  • Bryan Haberberger @thehabes: Especially for things like label

  • Bryan Haberberger @thehabes: How could we support the same label, but multiple language, and offer choice?

  • Fred Esch @fredesch: An odd use case is European towns that have different names in different languages - like the village has a Czech name and a German name

  • Bryan Haberberger @thehabes: Fred gets it

  • Satoru Takagi @satakagi: There is a trend around me to stop creating multi-lingual pages these days because of the advances in automatic browser translation capabilities.

  • SebastienDurand @SebastienDurand: the standard on locales can support a text string that defines more than one language (if I am not wrong, and also defines a default language…)

  • вкαя∂εℓℓ @briankardell: seems like you would use Intl for that in canvas, no?

    ecma 402 has a lot of investment

  • Bryan Haberberger @thehabes: @satakagi it is interesting how much that has helped these kinds of pipelines

  • Doug Schepers @shepazu: (Fred, that’s an exonym and endonym, FYI)

  • Amelia Bellamy-Royds @AmeliaBR: This isn’t an Intl thing, @briankardell, because it’s not about the output character string, it’s about the glyphs that are selected from the font.

  • вкαя∂εℓℓ @briankardell: I see

    the web vtt stuff is something that needs pie 🙂

  • Amelia Bellamy-Royds @AmeliaBR: So, the main message I’m getting is that Firefox is best for locale-aware CJK text rendering

  • Peter Rushforth @prushforth: @bdon great and informative presentation!!! Eye opening for me

  • Bryan Haberberger @thehabes: Yes, the user mainly, but the developer may want to pick a different primary language for different scenarios to show by default

    Yep, like @bdon is saying, it’s a bit for the user, a bit for the developer, a bit for the machine

  • SebastienDurand @SebastienDurand: Having a solution to this multi-language situation would be invaluable for Canada and Belgium or any other countries or organisations that have to make products in different languages.

    E.g.: to respond to COVID-19 in Canada, if 10 web maps had to be created, that equated to 20… because of the languages (EN and FR), and effort and resources are limited. In this day and age, one would expect that a single solution should be produced and the text should simply adapt to the user’s requirements.

  • Peter Rushforth @prushforth: Indigenous languages also need to be supported in Canadian maps
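
To make the "same label, multiple languages, offer choice" idea from the comments above a bit more concrete, here is a minimal sketch in TypeScript. It assumes a hypothetical data shape in which each feature carries its names keyed by BCP 47 language tag; the names `MultilingualNames`, `LabelChoice`, `pickLabel` and the sample `village` object are illustrative only and do not come from any existing map library or data format. The function returns both the chosen text and the tag it was stored under, so that a renderer could pass the tag along (for example as an HTML lang attribute) and let the browser select locale-appropriate glyphs.

```typescript
// Hypothetical shape: one feature's names, keyed by BCP 47 language tag.
type MultilingualNames = { [langTag: string]: string };

interface LabelChoice {
  text: string; // the label to draw
  lang: string; // the tag the label was stored under, for the renderer
}

// Walk the user's preference list in order. For each preferred tag, try an
// exact (case-insensitive) match first, then a match on the primary language
// subtag, so "de-AT" can fall back to "de". If nothing matches, use the
// developer-chosen default language, if the feature has a name in it.
function pickLabel(
  names: MultilingualNames,
  preferred: string[],
  defaultLang: string
): LabelChoice | undefined {
  for (let i = 0; i < preferred.length; i++) {
    const wanted = preferred[i].toLowerCase();
    const primary = wanted.split("-")[0];
    let exact: string | undefined;
    let partial: string | undefined;
    for (const tag in names) {
      if (!Object.prototype.hasOwnProperty.call(names, tag)) continue;
      const t = tag.toLowerCase();
      if (t === wanted) {
        exact = tag;
      } else if (partial === undefined && t.split("-")[0] === primary) {
        partial = tag;
      }
    }
    const hit = exact !== undefined ? exact : partial;
    if (hit !== undefined) {
      return { text: names[hit], lang: hit };
    }
  }
  if (Object.prototype.hasOwnProperty.call(names, defaultLang)) {
    return { text: names[defaultLang], lang: defaultLang };
  }
  return undefined;
}

// Illustrative data: a town with a Czech name and a German name.
const village: MultilingualNames = { cs: "Karlovy Vary", de: "Karlsbad" };
```

With this sketch, `pickLabel(village, ["de-AT", "en"], "cs")` yields the German name tagged `de`, while a user who prefers only French gets the Czech default tagged `cs`. In a browser, the preference list could plausibly come from `navigator.languages`, with the default chosen by the map author.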

bdon
2020-10-02

This GitHub issue: What about Han unification? #2208 mentions languages and scripts outside of Han:

  • There are four variants of the Cyrillic alphabet: Russian, Bulgarian, Serbian and Macedonian.
  • The character ‘LATIN CAPITAL LETTER ENG’ (U+014A) has different shapes in African and European languages.
  • The Syriac script (ISO 15924: Syrc/135) has the variants “Syriac (Eastern Variant), ISO 15924 Syrn / 136” and “Syriac (Western Variant), ISO 15924 Syrj / 137” and “Syriac (Estrangelo variant), ISO 15924 Syre / 138”.
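
The ISO 15924 codes quoted above are the same four-letter script codes that appear as the script subtag of a BCP 47 language tag (for example `syr-Syrn`), so a renderer that wants to choose glyph variants needs at least the language and script of each label. The following is a rough TypeScript sketch of pulling those subtags out of a tag. `ParsedTag` and `parseLanguageTag` are hypothetical names, and the parser is deliberately simplified: it ignores extlang, variant, and private-use subtags, stops at the first extension singleton, and does not validate codes against any registry.

```typescript
interface ParsedTag {
  language: string; // primary language subtag, lowercased
  script?: string;  // ISO 15924 script code in title case, e.g. "Syrn"
  region?: string;  // region code, uppercased, e.g. "RS"
}

// Simplified reading of a BCP 47 tag: language, then an optional 4-letter
// script subtag, then an optional region (2 letters or 3 digits).
function parseLanguageTag(tag: string): ParsedTag {
  const parts = tag.split("-");
  const result: ParsedTag = { language: parts[0].toLowerCase() };
  for (let i = 1; i < parts.length; i++) {
    const p = parts[i];
    // A single-character subtag starts an extension or private-use
    // sequence; nothing after it is a script or region.
    if (p.length === 1) break;
    if (
      result.script === undefined &&
      result.region === undefined &&
      /^[A-Za-z]{4}$/.test(p)
    ) {
      result.script = p.charAt(0).toUpperCase() + p.slice(1).toLowerCase();
    } else if (
      result.region === undefined &&
      /^([A-Za-z]{2}|[0-9]{3})$/.test(p)
    ) {
      result.region = p.toUpperCase();
    }
  }
  return result;
}
```

Under these assumptions, `parseLanguageTag("syr-Syrn")` gives language `syr` with script `Syrn`, and `parseLanguageTag("sr-RS")` gives language `sr` with region `RS` and no explicit script. How a map renderer would then map language and script to a particular font or glyph variant is left open here.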

Separately, for layout: it is worth looking at the W3C draft of Mongolian Script Layout Requirements

nchan0154
2020-10-03

Regional variants of similar glyphs are such a fascinating topic, and a difficult one to implement. I have always wondered about the details of CJK rendering and it was so valuable to listen to someone speak on the current challenges and barriers to a multilingual web. Thank you!

Malvoz
2020-10-03

There is a lot of relevant documentation on Layout requirements (and more) in https://www.w3.org/blog/international/