Relaxing restrictions on custom attribute names


#1

As far as I understand, any user-defined attribute names on normal HTML tags must always start with data-.

However, some JS frameworks disregard this part of the standard. E.g. Angular’s ng-* — even though it is possible to use correct data-ng-*, it does not seem to be used often, potentially because of extra verbosity. In other frameworks, e.g. Vue, it’s not even possible to use the correct prefix anymore.

Given the popularity of those frameworks, should the standard be changed to allow any custom attribute names — as long as they contain a dash (same as custom elements)?

Obviously, existing attributes with dash have to be reserved, but I only found two: accept-charset and http-equiv, so it seems simple enough.


#2

For the purpose of avoiding verbosity, using the shorter x- prefix instead of data- would probably be enough.

Alternatively, any custom attributes could be allowed as long as specific custom attributes (e. g. foobar) or prefixes (e. g. ng-) are declared explicitly — probably using something like the same CSS-like syntax I’ve proposed for custom elements back in 2011:

@custom {
    * {
        ng-*; /* Allowing `ng-` prefixed custom attributes for all elements. */
    }

    LOREM {
        foobar; /* Allowing `foobar` attribute for `LOREM` elements. */
    }
}

#3

Why not just take a descriptive approach here, and allow what people are already using?
I thought that was the approach with HTML5+ anyway — describe and extend what the web does.

Is there a significant downside in following the same rules as custom elements?


#4

I agree with @ashmind, relaxing it to the same rules as custom elements makes a lot of sense.

The original reason for the restriction was to avoid polluting the attribute namespace in a way that would prevent the standard from adding new attributes. I already namespace my attributes, so I can end up with elements that have multiple attributes like “data-foo-hideable data-foo-someproperty data-foo-somethingelse”. The data- adds nothing but can make things harder to read once it starts stacking up.

The current situation with libraries like Vue and Angular isn’t harmful in terms of polluting the attribute namespace and they went that direction because it improves readability by dropping unnecessary verbosity.

I think it is reasonable to relax the rules to the following:

  1. Custom attribute names must contain a hyphen
  2. The prefix must be at least 2 characters long
  3. The prefix must not be in the list of reserved attribute prefixes
  4. Frameworks and libraries should use the same prefix for all their custom attributes

1 designates the way everyone is actually doing it anyway.

2 is more to avoid an excessive risk of namespace collisions. It’s no less enforceable than the current ‘data-’ prefix rule, but it is a more reasonable ask for libraries to adhere to (e.g. libraries like Vue would likely be happy changing from v- to vue-, on the basis that it’s not pointless verbosity the way data-v- would be).

3 is there to handle things like ARIA. It is possible that the spec may occasionally step on a namespace being used somewhere, but the W3C adds new attribute namespaces far more rarely than it adds individual attributes, and a namespace is less likely than a single attribute to need to be a specific word.

4 again is very much how things are being done now but it’s good guidance to build in anyway.
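
The rules above can be sketched as a small checker. This is only an illustration, not a proposed API: the function names, the return shape, and the contents of the two reserved lists are assumptions (data- and aria- are the obvious reserved prefixes, accept-charset and http-equiv are the existing hyphenated HTML attributes), and a real reserved list would likely need more entries.

```javascript
// Hypothetical reserved lists, for illustration only.
const RESERVED_PREFIXES = new Set(["data", "aria"]);
const RESERVED_NAMES = new Set(["accept-charset", "http-equiv"]);

// Checks a single attribute name against rules 1-3.
function checkCustomAttributeName(name) {
  const lower = name.toLowerCase();
  if (RESERVED_NAMES.has(lower)) {
    return { ok: false, reason: "reserved attribute" };
  }
  const hyphen = lower.indexOf("-");
  if (hyphen === -1) {
    return { ok: false, reason: "no hyphen" }; // rule 1
  }
  const prefix = lower.slice(0, hyphen);
  if (prefix.length < 2) {
    return { ok: false, reason: "prefix too short" }; // rule 2
  }
  if (RESERVED_PREFIXES.has(prefix)) {
    return { ok: false, reason: "reserved prefix" }; // rule 3
  }
  return { ok: true, prefix };
}

// Rule 4 is guidance for a whole library rather than a single name:
// every attribute it defines should be valid and share one prefix.
function sharePrefix(names) {
  const prefixes = new Set(
    names.map((n) => checkCustomAttributeName(n).prefix)
  );
  return prefixes.size === 1 && !prefixes.has(undefined);
}
```

With these assumptions, vue-if passes, v-if fails rule 2, and aria-label fails rule 3.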


#5

Would be great, if it’s doable.

At one time I had the same idea and looked up existing HTML5 attributes, and the two or three with a hyphen seem to be old attributes from HTML 1-3. Newer attributes after that were hyphen-less.

The main exceptions would be:

  • the data- prefix
  • the aria- prefix

I’m also wondering about SVG presentation attributes: https://www.w3.org/TR/SVG2/attindex.html#PresentationAttributes

That’s a lot of attributes, possibly applying to elements such as a, audio, video and iframe.


#6

Adding 2 or 3 old HTML attributes to the list of reserved prefixes would be fine, but SVG could well be a show stopper. For a start, that’s 27 quite generic prefixes that would need to be reserved, but worse, hyphenation is the style for SVG attributes, so it’s likely that additions to the spec would want to go that way (whereas adding the equivalent of ARIA in HTML would be a very rare occurrence, and it would be easy to pick something relatively unique).

It’s unfortunate that dataset didn’t get the same amount of attention as data- attributes, as it would have given a reason to stick to the spec, but lack of support combined with little improvement over getAttribute and setAttribute meant there was little reason to use it other than “because spec says so”.
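
For reference, here is a rough sketch of the name mapping that dataset performs, written as a plain function so it can be run without a browser. The function name datasetKey is made up for this example, and it assumes the attribute name is already lowercase, as it would be in an HTML document.

```javascript
// Maps an attribute name to the key it would get on element.dataset.
// Returns null for names that are not reflected there at all.
function datasetKey(attrName) {
  if (!attrName.startsWith("data-")) return null;
  return attrName
    .slice("data-".length)
    // A hyphen followed by a lowercase ASCII letter becomes that letter in uppercase.
    .replace(/-([a-z])/g, (_, letter) => letter.toUpperCase());
}

datasetKey("data-foo-hideable"); // "fooHideable"
datasetKey("ng-hide");           // null
```

So a framework that sticks to data-ng-hide gets el.dataset.ngHide for free, while a bare ng-hide only has getAttribute("ng-hide").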

Either way we’re in a situation where the spec is not reflective of usage and I see the future having a lot more frameworks using namespace- over data-namespace-. Google was really unhelpful using ng- for angular as it’s hard to push for spec compliance when one of the browser implementers isn’t following it.


#7

SVG’s hyphenated attributes are the ones that are taken directly from CSS; whenever SVG adds support for/invents a CSS property, new attributes come along. SVG2 is adding even more and including support for existing properties such as z-index, so this is quite troublesome.


#8

+1. I’ve thought about this for a long time too. I wish that were the case; currently, having to use two namespaces results in very verbose attribute names, e.g. data-projectname-foobar, which is why Angular is just ignoring data- altogether. Sure, there are existing attributes with hyphens, but they could become exceptions, just like with custom elements.


#9

I think not. I think the standard should be modified to simply allow us to use any attribute names we want. If you asked ESDiscuss whether we should require function names to start with f, as in function fMyFunction() {}, people would go nuts. I don’t see why HTML has to be different from that.

But, anyways, all browsers support this already, so it doesn’t matter so much (that’s why we can use any custom attributes we want on our custom elements, otherwise it wouldn’t be as fun). :}

I think that would just complicate things. Browsers currently let us put any attributes we want on elements, and I think that’s fine. It’s up to us to know whether the attribute we’re using is meant to do something (natively) or not. It’s like setting a property on an object in JavaScript: we can set it to whatever we please, and we understand that it’s possible setting the value might do something (and it’s up to us to know what that something is (i.e. read API documentation)).

People keep mentioning this for various parts of the HTML API, but to me this point is moot. Let us override anything we wish, and assume it is our responsibility to read API documentation before we blindly override everything. Programming languages are like that, and I think HTML should be too. For example, one argument against allowing us to define elements with names of already-existing elements is that “it would prevent the standard being able to add new ones”. I wholeheartedly disagree: browsers could add anything they want, and users should be able to override them as they please. We can do it in JavaScript (for example, override the property of an object, in some cases the global object), so why not in HTML?

document.registerElement('img', BecauseThisIsMyAppAndIWantTo)
document.registerElement('monster', MonsterElement)

Because, why not? Simply saying that “browsers need to be able to introduce new elements” isn’t a good reason. If we allow native elements to be overridden, then my previous example proves that the feature is future-proof; for example, if browsers introduce a new element called <monster>, the app that runs the previous code will not break because the <monster> tag will continue to use the app author’s logic, not the browser’s native logic.

It would be like in JavaScript: if I have some code that overrides window.setInterval with a custom function, no amount of updates to the browser’s native implementation will ever interfere with mine. Why not let HTML be as flexible and customizable (and as fun and pleasurable to work with) as JavaScript?
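
A minimal sketch of that JavaScript behaviour, using plain objects as stand-ins (platform and appGlobal are made-up names, not real browser objects): a property set by the app shadows whatever the underlying object provides, now or later.

```javascript
// Stand-in for what the browser provides natively.
const platform = { setInterval: () => "native v1" };

// Stand-in for the app's global object, which inherits from the platform.
const appGlobal = Object.create(platform);

// The app overrides the native function with its own.
appGlobal.setInterval = () => "custom";

// Later, a "browser update" changes the native implementation
// and introduces something new.
platform.setInterval = () => "native v2";
platform.monster = () => "native monster";

appGlobal.setInterval(); // "custom": the app's override is untouched
appGlobal.monster();     // "native monster": not overridden, so the native one is used
```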

I think it is reasonable to relax the rules to the following:

  1. Custom attribute names can be anything, and same with custom element names.
  2. Put a hyphen in it if you feel like it, or don’t.
  3. Frameworks and libraries should choose to do whatever they want to do (like currently?)
  4. Be free.

I think that this list (it’s not really a list, since it really boils down to item 1) can be nicely coupled with the idea of encapsulating things such as custom elements (with any name and with any attributes) inside of ShadowDOM roots. The concept would be similar to what React has introduced: encapsulation of HTML elements (JSX) inside of components (classes that extend React.Component). In this case, the encapsulation of HTML happens inside Custom Elements using ShadowDOM roots. Let component authors decide for themselves what an <img> tag means inside of their component. We can in React, and the concepts there have been proven by React’s popularity and by the spawning of a number of libraries inspired by React.


In conclusion, restrictions beyond the native restrictions on element/attribute naming don’t provide much benefit. I’d love to see a day when we can override any element with our own, or make new ones with any name, and with component encapsulation using Custom Elements + ShadowDOM (or something similar, but ShadowDOM seems to be where it’s heading).

React and similar libraries are the embodiment of what the web does not yet have. I believe in the web’s ability to improve, and I’d love to see the day when HTML can be as flexible, as custom, and as modular as React components (or Angular directives, or Riot.js custom tags) so that we don’t need a 3rd-party library for such a thing. Well, the web doesn’t seem to have any plan for a feature whereby we can repeat elements based on an array of data like those 3rd-party libraries do, but that’s an idea for another day…


#10

We could allow any custom attributes as long as their names start with an underscore (_) (my preferred option) or a hyphen (-):

<div _foo="Hello" -bar="World"></div>

Such prefixes are guaranteed not to be used in standard attribute names, so they are automatically future-proof and safe to use for custom purposes while being extremely short (just one character instead of five in the case of data-).

Both of the prefixes work in all current browsers (selectors, getAttribute(), setAttribute()), including Firefox, Opera (Blink), Opera 12 (Presto), IE (at least 8+), Edge. So it’s just a matter of documenting existing implementations and removing a purely formal limitation without any harm.

I am going to adopt using underscore-prefixed attributes from now on at least for attributes added dynamically via JS and thus not discoverable by static-HTML validators.


#11

Your override example only works because you register the element with JavaScript, or am I wrong? If we allow custom attributes (e.g. foo) without registering them via JavaScript and foo is later added as an attribute in the spec, clients have no way to know whether the attribute shall be interpreted as defined in the new spec or as overriding the new spec.

I also doubt that requiring authors to register their custom attributes with JavaScript would solve this, because we cannot expect clients to interpret the JavaScript part. Hence even attributes that are registered in JavaScript to override spec behaviour are probably going to be interpreted according to the spec quite often.


#12

I’ve thought a bit more on SVG, and I think it’s useful to consider what would currently happen on conflicts.

For example, let’s say ng has some meaning for SVG, and so tomorrow SVG decides to add ng-hide as an attribute to all SVG and HTML elements. Such a proposal would never get through, because it would break half of the web. If there were a popular framework using z- as a prefix, the same would apply to z-index.

So SVG attributes are somewhat limited by existing frameworks, but SVG in turn limits any new frameworks, which does not seem like the right approach.

I suggest the following as the future-safe path:

  1. SVG should use a prefix for its attributes when used on HTML elements (same as ARIA does). Obviously older attributes can’t be re-prefixed, but those can be marked as deprecated in the long term.
  2. Any attribute names are allowed, as long as they contain a non-alphanumeric character (originally I suggested -, but there is Angular 2, which is going its own way yet again)
  3. Conflicts between frameworks and things like SVG/ARIA will happen only on prefix basis, which wouldn’t be hard to resolve.

#13

We can re-prefix existing SVG attributes just so long as the old ones are kept as aliases, similar to : and :: for pseudo-elements.

That Angular thing is really annoying though, you’d have thought Google would have paid lip service to sane attributes at least…


#14

You know, that’s not so bad. _foo and -foo are easier to write than data-, and I like that much more. I hadn’t realized how great a problem a conflict could be. Could it be possible to use different versions of elements on some form of encapsulated basis (e.g. using Shadow DOM)?


#15

If SVG2 allows custom CSS properties as attributes (which it probably will, since all the CSS properties it uses are allowed as attributes and custom properties are highly desired), the double-hyphen might work naturally without any more speccing involved. <div --foo="bar">, essentially.

Not quite as nice as a single underscore or hyphen, but it would be more consistent.


#16

That’d probably be good. Still, there is a major difference between HTML and CSS as for custom things:

HTML allows any attributes by default while CSS does not.

It does not generally make sense to (continue to) disallow something that already works in all browsers while being totally future-proof and forward-compatible (standard attributes will certainly never start with underscore or hyphen).

Moreover, we’ve already seen what happens with formally valid but redundantly long prefixes (namely data-): they are just ignored (e.g. in the case of Angular’s ng-). A two-character prefix is twice as long as a single-character one while adding nothing useful over it.


#17

Some mucking around with an XML well-formedness validator indicates only leading underscores work in XML. I know that doesn’t sound like something to care about, but the HTML5 parsing algorithm to this day takes some pains to keep a reasonable XML serialization.
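
For anyone who wants to repeat the check without a validator, here is a rough sketch of the XML Name rule restricted to ASCII (the full production also allows many non-ASCII characters, which this deliberately ignores). The function name isXmlNameAscii is made up for this example.

```javascript
// ASCII subset of the XML 1.0 Name production:
//   first character: a letter, "_" or ":"
//   later characters: the above plus digits, "." and "-"
const XML_NAME_ASCII = /^[A-Za-z_:][A-Za-z_:0-9.-]*$/;

function isXmlNameAscii(name) {
  return XML_NAME_ASCII.test(name);
}

isXmlNameAscii("_foo");    // true
isXmlNameAscii("-bar");    // false: a name cannot start with a hyphen
isXmlNameAscii("ng-hide"); // true: a hyphen is fine after the first character
```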


#18

I think this is an important point. Currently, the HTML5 features that prevent a document from also being a well-formed XML document are alternatives to ones that preserve XML well-formedness (e.g. not requiring closing tags on many elements). A spec that explicitly encourages breaking XML compatibility, rather than just allowing it, should be avoided.


#19

So the underscore prefix is now even more preferred. Well, the fewer options we have, the easier they are to spec, and the faster we can get rid of the purely formal limitation.

A description of underscore-prefixed attributes could be added to the existing section about data-* attributes in the HTML spec as follows:

Custom data attributes may also use the underscore character (_) as a prefix instead of data-. Unlike data- prefixed attributes, underscore-prefixed attributes are not reflected in the string map available via the dataset property of the element.

Example:

 <div _foo="Hello"></div>

#20

Requiring underscore wouldn’t solve the initial problem though – the reality of frameworks that have already defined their own approach. Angular is very popular, Angular 2 is going to be at least somewhat popular.

I would be surprised if they can be convinced to switch to leading _ at this stage.

Given that - is valid in XML attributes, requiring it or any other non-alpha character would not require breaking XML compatibility, but would support frameworks that already break it (e.g. Angular 2).