A partial archive of discourse.wicg.io as of Saturday February 24, 2024.

User interaction with web apps


The current HTML5 definition of the accesskey attribute is an improvement over the HTML4 definition. There is still room for improvement, and ways accesskey could be extended to be a more useful feature.

Chaals has been giving it some thought, and has put together a proposed replacement for the accesskey definition: http://chaals.github.io/accesskey/index.src.html


I changed the title (from “improving accesskey”), because improving accesskey is the little bit (although it turns out to be a pretty useful little bit to do).

The big problem is that right now the model for users interacting with applications is pretty backward. Web developers, with very little idea what the user can do or the system they can do it on, are expected to know what is a usable interface.

The real result is that we get piles of javascript interfering with the default behaviour of our browsers, or being blocked from interfering - thus leaving applications unworkable.

We need to reconsider who in the ecosystem is best placed to negotiate physical interactions with the user (hint: the term “user agent” does have a meaning), and how to make it easier for

  • Developers to add interactivity to their apps
  • Users to understand what controls are in an app and what they do
  • Browsers (user agents, if you will) to offer users consistent and useful interaction
  • “shortcuts” to be shorter, and not break things a particular user relies on working

One of the problems I’ve noticed is that accesskey is a markup definition of keyboard shortcuts, where most other interactivity is done in javascript.


Yes. But “interactivity is done in javascript” is part of the problem statement… although I realise that while I have a proposed strawman solution here, the problem statement is somewhat scattered.

Interactivity done in Javascript currently provides no clear way for two libraries, two components, an extension (which are important parts of most non-MS browser ecosystems), or even the browser itself, to find out what interactions have been “claimed” - nor what they actually do.

It is based on the premise that the author can choose sensible interactions, which is true in the sense that they know their app and can make a comfortable consistent interface, but probably unchangeably false in the sense that they have very little idea about the user environment, and giving them enough information to do a good job requires a very serious reduction in privacy, to an extent that gets security people concerned.

A major reason for doing interactivity in javascript “today” is that there is no markup method for it - partly based on the observation that early accesskey implementations in browsers were bad enough to be as harmful as using javascript for interactivity.

That situation has changed - accesskey conflicts in browsers are mostly resolved sensibly, whereas javascript interaction conflicts are not. More and more big-name web apps have “untrustworthy” interaction - interaction that hijacks the user’s expectations of how their own user agent works without providing clear notification nor a way to stop it.

It is important that there is a unification between the markup world - which is particularly important to environments where independently built components need to be composed - and the javascript world, which offers programmers a set of advantages.

The javascript side of the equation is to have a standard API for registering interactions, which is powerful enough to allow sensible composition and conflict resolution. We don’t have that yet. It would be possible to build it, for example extending addEventListener, making it possible to get a sense of why those events get caught.
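To make the idea concrete, here is a rough sketch of what such a registration API might look like. Nothing here exists in any browser or library: `InteractionRegistry`, `register`, `describe` and `activate` are all hypothetical names, and the conflict rule (grant the first suggestion that is neither reserved nor already claimed) is just one assumption about what "sensible" could mean.

```typescript
// Hypothetical sketch: InteractionRegistry is not a real browser or library API.
interface InteractionRequest {
  owner: string;          // who is asking, e.g. a library or extension name
  description: string;    // what the interaction does, for user discovery
  suggested: string[];    // preferred triggers in order, e.g. ["n", "p"]
  handler: () => void;
}

interface Assignment {
  trigger: string | null; // null when every suggestion was unavailable
  request: InteractionRequest;
}

class InteractionRegistry {
  private reserved: string[];
  private assignments: Assignment[] = [];

  // `reserved` lists triggers the user agent or user keeps for themselves.
  constructor(reserved: string[] = []) {
    this.reserved = reserved.map((t) => t.toLowerCase());
  }

  private find(trigger: string): Assignment | undefined {
    const key = trigger.toLowerCase();
    for (const a of this.assignments) {
      if (a.trigger === key) return a;
    }
    return undefined;
  }

  // Grant the first suggestion that is neither reserved nor already claimed.
  register(request: InteractionRequest): Assignment {
    let granted: string | null = null;
    for (const suggestion of request.suggested) {
      const key = suggestion.toLowerCase();
      if (this.reserved.indexOf(key) === -1 && this.find(key) === undefined) {
        granted = key;
        break;
      }
    }
    const assignment: Assignment = { trigger: granted, request };
    this.assignments.push(assignment);
    return assignment;
  }

  // Let anyone (another library, an extension, the browser) see the claims.
  describe(): string[] {
    return this.assignments.map((a) => {
      const shown = a.trigger === null ? "(unassigned)" : a.trigger;
      return `${shown}: ${a.request.description} [${a.request.owner}]`;
    });
  }

  // Run the handler bound to a trigger; returns false if nothing claimed it.
  activate(trigger: string): boolean {
    const match = this.find(trigger);
    if (match === undefined) return false;
    match.request.handler();
    return true;
  }
}
```

The point of the sketch is that a request carries an owner and a description alongside its suggestions, so a second component can find out what has been claimed and why, which a bare addEventListener call does not allow.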

The accesskey attribute value is already reflected in the DOM and therefore easily available for scripting. It also provides accessKeyLabel (albeit implementation is once again hopeless in most browsers), which is a step toward finer-grained javascript control.
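As a small illustration of what a script can do with those two properties, the sketch below reads them from any object shaped like an element. The helper names `suggestedKeys` and `shortcutHint` and the `ShortcutSource` interface are made up for this example. It assumes the HTML5 model in which the accesskey value is a space-separated list of single-character suggestions, and it treats a missing or empty accessKeyLabel as "the browser told us nothing".

```typescript
// Structural stand-in for an HTMLElement, so the sketch runs without a DOM.
interface ShortcutSource {
  accessKey: string;        // reflects the accesskey content attribute
  accessKeyLabel?: string;  // often missing or "" where unimplemented
}

// Split the author's suggestions into unique single-character tokens.
// Simplification: token.length counts UTF-16 units, so characters outside
// the Basic Multilingual Plane are dropped here.
function suggestedKeys(el: ShortcutSource): string[] {
  const keys: string[] = [];
  for (const token of el.accessKey.split(/\s+/)) {
    if (token.length === 1 && keys.indexOf(token) === -1) {
      keys.push(token);
    }
  }
  return keys;
}

// Return the browser-described shortcut, or null if none was reported.
function shortcutHint(el: ShortcutSource): string | null {
  const label = el.accessKeyLabel;
  return label !== undefined && label !== "" ? label : null;
}
```

In a page, a real element could be passed in directly, for example `shortcutHint(document.querySelector("button")!)`, since HTMLElement already exposes both properties.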


I did file an issue on whether to make accessKeyLabel or an equivalent parseable. I will also write a test case or two.


I started writing up some tests to find out what browsers do now.

It’s depressing. Basically, the ones I tested (a bunch of stuff on Mac) get the DOM accessKey attribute right, although I haven’t tested proper reflection except in Opera 11-12 (which got it right).

Otherwise the browsers implement like it’s 1999 :cry:

They do manage to work with the HTML 4 version of accesskey. Other testing I already did suggests that at least in the case where accesskey-derived shortcuts conflict with screenreaders, the browser lets the screenreader win. Screenreaders are the best at letting users discover there is an accesskey available, although being the best doesn’t imply being actually good. But they are not able to make that shortcut easily useful, which sort of defeats the point.

I still need tests to determine what happens when a webpage tries to hijack an accesskey.

No browser I found passed even basic tests to see if they could support the HTML5 processing model, let alone the basic test of whether they did. Firefox implemented something for the DOM accessKeyLabel attribute, but it’s a fragile, non-conforming and not very useful implementation.

On the one hand, this means there are a lot of browser bugs to be filed, and that meanwhile the HTML spec should change to match reality. On the other hand, this means there is plenty of potential for doing something useful still.


The github repo with the spec draft, issues, tests, etc is at http://github.com/chaals/accesskey

I welcome test results, issues or comments on existing ones, spec proposals, etc.

There are some pretty big holes in the proposal at the moment :frowning:


Hey Chaals,

thank you for the work you put into this!

Interactivity done in Javascript currently provides no clear way for two libraries, two components, an extension (which are important parts of most non-MS browser ecosystems), or even the browser itself, to find out what interactions have been “claimed” - nor what they actually do.

Sounds like we’re (still) waiting for IndieUI to become a thing. This is the only (proposed) infrastructure I know of, that could have the power to cater to all of the above (libraries, components, browser extensions). It would also allow us to (finally) “freely” map keyboard/touch/mouse input to actual application interactions, instead of hotwiring an interaction to a specific input.

Regarding the proposal:

The current HTML specification assumes all shortcuts assigned will be key combinations. This does not meet the reality of deployed platforms, many of which either do not have a keyboard or also permit other activations more appropriate for a shortcut.

The accesskey attribute contains its context in the name: key. While your observation is correct, I don’t think accesskey should be the entity to address all of that. Maybe IndieUI is a way to go?

A browser might apply this by adding a voice command “Написать” to a grammar of expected commands for which an event can be fired, by listening for keypresses of the Cyrillic letters “Н” (equivalent to the latin “n”) or “П” (equivalent to latin “P”) or one of the letters proposed by the author, perhaps with a standard modifier.

I’m not a big fan of complex attribute values, I don’t think it’s very intuitive to couple voice input with keyboard input. Also wouldn’t voice input have the same requirement regarding list of words to allow conflict resolution? Also why not simply use the element’s label (text content)?

For gestures, it is useful to present an animation of the gesture. Can we enable this?

I have my problems thinking up a scenario where I’d want global gestures to do something like press a button. And even then, I’d likely go with a custom implementation for nicer visual appearance. Except for maybe “swipe from right edge of screen” to open the off-screen menu, or something like that. I haven’t put much thought into this yet, but it feels weird.

To tell the user what the shortcut key is, this script explicitly adds the browser-described shortcut activation to the button’s label:

Wouldn’t it be simpler if we could do that from CSS? I know accessKeyLabel is a property, not an attribute, but this looks more appealing than mutating the DOM:

button[accesskeylabel]:after {
  content: "(" attr(accesskeylabel) ")";
}

The accesskey attribute’s value is used by the user agent as a guide for creating a shortcut that activates or focuses the element.

Why activate or focus? Who decides which action to take?

If the user agent has a stored user preference for activation of the element, then skip to the fallback step below.

How would that work?

(This is a fingerprinting vector.)

links to http://chaals.github.io/accesskey/introduction.html#fingerprinting-vector, which does not exist


Hi Rodney,

We are waiting for IndieUI or something of that nature to happen. For example it makes sense to treat links with known rel values the same way, by giving them default browser activation - and indeed some browsers did so 15 years ago, but now it’s only available in extensions as far as I know :angry:

The point of using accesskey - by which I mean any author-initiated shortcut scheme - is for things which are important to a given page/app, but for which there isn’t, and probably won’t be, a standard identification. So I hope that e.g. datepickers become something standard - at the moment they are a horrible unpredictable mess that causes massive accessibility problems everywhere, given how often people need to provide a date.

But the ‘add this selection to my multi-paste clipboard’ function probably won’t be standardised any time soon, even if it is really valuable in a small handful of apps. So that’s a case where the authors of those apps are more likely to request a shortcut of some kind. My goal is that they can do that without trampling on other parts of the User Agent’s interface by accident.



The idea is to stop the endless procession of attributes or scripts every time someone invents a new interface technology. Instead, the author requests a shortcut, and suggests something that makes sense.

Sadly, existing browser implementations don’t seem to do anything useful if the author tried to use the HTML feature of allowing more than a single suggestion. And in any event, it seems unlikely browsers can actually know what input the user has available.


I just updated the proposal - new algorithm that tries to match both current reality and the fact that browsers don’t, a priori, have a way to determine e.g. what keys the user’s keyboard has…


A browser might store a preference for activating a particular type of element, such as <a rel="contents"… (this sort of thing was possible with Opera Presto, and pretty easy with extensions).

Yeah. Since I changed that part of the algorithm - in part because it was unimplemented, in part probably unimplementable - I think the fingerprinting vector has gone away. Although I’m not certain, and it needs some checking.

Yes. But as far as I know we can’t. I am not sure how easy it is to reflect DOM values into CSS, but given the limited use cases for this and the almost total lack of implementation so far, I’m not sure it is a high priority - unlike browser notification of shortcuts which has similarly rubbish implementation today.

The reason for preserving the name is that it gets used in existing content. It turns out to be a silly name, but that’s hardly the first time we have faced this issue in HTML…


There is an old HTML Bug that carries a bunch of thinking about how this should work.