Add new character entity references like &power; - to match new Unicode specs

edent · 2014-06-13

I’m one of the creators of http://unicodepowersymbol.com/ - the campaign to get the IEC 60417-5009 power symbol into Unicode.

We’ve successfully got the following characters into the next version of the Unicode specification ⏻ ⏼ ⏽ ⭘ 🌭

At the moment, the “easiest” way of writing the power symbol in HTML is to use the escape code &#x23FB;.

I propose that the following entity references be created.

&power;
&powertoggle;
&poweron;
&poweroff;
&powersleep;

This will make it easier for web developers to include these symbols in online documentation etc.

Boldewyn · 2014-06-13

I don’t think we need those specifically. Thanks to growing UTF-8 adoption, you can put the character in there literally. And there are far more frequently used code points that would qualify much more for named entities.

That is, a mechanism to enter arbitrary codepoints by their name might be a better idea. Like:

&"OHM SIGN";
&"ZERO WIDTH JOINER";

You get the idea. Final notation is of course debatable.
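
To make the idea concrete, here is a minimal sketch of such a lookup as a preprocessing step in Python. The `&"NAME";` syntax is only the notation proposed above, not anything HTML supports, and `expand_named_refs` is a hypothetical helper name. The name resolution itself relies on the standard library's `unicodedata.lookup`.

```python
import re
import unicodedata

# Hypothetical notation from this thread: &"CHARACTER NAME";
NAMED_REF = re.compile(r'&"([^"]+)";')


def expand_named_refs(text: str) -> str:
    """Replace each &"NAME"; with the character that has that Unicode name."""

    def replace(match: re.Match) -> str:
        try:
            return unicodedata.lookup(match.group(1))
        except KeyError:
            # Unknown name: leave the reference untouched.
            return match.group(0)

    return NAMED_REF.sub(replace, text)


print(expand_named_refs('10 k&"OHM SIGN";'))
```

Which names resolve depends on the Unicode version shipped with the Python build, so newly added characters may not be found on older installations.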

tabatkins · 2014-06-13

As @Boldewyn says, “just write the character” is the preferred answer these days. There are few systems which can handle HTML but not UTF-8, and none of them are browsers.

Boldewyn · 2014-06-13

I should mention that I followed the campaign closely, and I think it was a great example of how to get something useful into Unicode. I am also more than eager to add the characters to codepoints.net as soon as the official standard is published.

But HTML entities are simply a different beast. Look at MathML. They specced something like 100 entity references, almost all of which can be used directly with UTF-8 and system default fonts nowadays. But when the HTML5 spec writers wanted to include MathML in the standard, they were forced to add the MathML character entities as well. Therefore my enthusiasm for adding yet more references is limited…

robin · 2014-06-16

Yup, my instinct is to just use the characters, but on the off chance that I may have been missing some developer ergonomics I asked on Twitter. At this point the reactions are almost overwhelmingly against, for the most part with good reasons.

Richard’s suggestion to add a few for invisible characters not yet captured has some appeal, though.

edent · 2014-06-18

Very reasonable points all. My only question is: how do people type common symbols that don’t appear on a standard keyboard?

I suppose copy & paste still rules.

robin · 2014-06-18

It entirely depends on your keyboard and system. On OS X, the Character Viewer is a classic (there are equivalents elsewhere). Another option some people use is Kotoeri input. Originally meant for Japanese, it also helps with typing symbols (but is only worth it if you do so often enough).

Boldewyn · 2014-06-18

There’s a whole Wikipedia article dedicated to it

mathias · 2015-02-27

Just either use the raw symbols:

⏻ ⏼ ⏽ ⭘ 🌭

Or escape them based on their code points, e.g.:

&#x23FB; &#x23FC; &#x23FD; &#x2B58; &#x1F32D;

These solutions already work today and are backwards compatible with existing implementations. Adding new named character references causes needless compatibility issues.
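
For anyone who would rather generate the escapes than look up code points by hand, the following is a rough sketch in Python. `to_numeric_refs` is a hypothetical helper name, and escaping every non-ASCII character is just one possible policy. The round trip is checked with the standard library's `html.unescape`.

```python
import html


def to_numeric_refs(text: str) -> str:
    """Escape every non-ASCII character as a hexadecimal character reference."""
    return "".join(
        ch if ord(ch) < 128 else f"&#x{ord(ch):X};"
        for ch in text
    )


# The five code points listed above, written as Python escapes.
symbols = "\u23FB \u23FC \u23FD \u2B58 \U0001F32D"
escaped = to_numeric_refs(symbols)
print(escaped)

# Decoding the references gives back the raw symbols.
assert html.unescape(escaped) == symbols
```

Because the output is plain ASCII, it survives toolchains that are not UTF-8 clean, which is the main remaining reason to escape at all.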