Exposing a user's input modality


I am more than fine with input-modality. Tbh, we’ve used that in discussion, in the title of the post, and I think even in the text itself at one point. Honestly, I think we were just trying to save characters in expressing it, but it doesn’t matter to me if we think it’s worth the extra 6 chars to be clear about it.


My only concern with input-modality is that it sounds to me like it’s input in the sense of <input> (i.e. data input), whereas the concept we’re shooting for is more like “interaction”. I do agree that modality alone is vague, though.


One concern I have with the current proposal is that it appears to rely on the premise that the focus ring will be prevented by default, as in all your examples that use :focus { outline: none; } as the primary rule.

Unless you go the expensive route of polyfilling the modality MQ, that would seem to lack backward compatibility with older browsers, which only understand :focus { outline: none; }?

I am thinking that perhaps a ‘non-keyboard’ input modality approach might be better, at least at first, so that it doesn’t further promote the idea of preventing the focus ring globally by default.

Should design flaws, implementation issues or whatnot arise, it somehow feels like a backwards solution, in that allowing a global :focus { outline: none; } is one of the problems we are trying to avoid…


Yeah, didn’t think of that aspect. We could further bikeshed this later on :smile:


Good point. I wonder if we should enable both keyboard and pointer modalities.

@briankardell and @aboxhall: Do you have any other use cases in mind other than the one outlined in the article? (which AFAICT either keyboard or pointer can address, and pointer seems to be better in terms of progressive enhancement)


One important issue, as @patrick_h_lauke mentioned, is that touch explore or sequential focus doesn’t really fall into the pointer or keyboard context. The problem is further exacerbated by the fact that Assistive Technologies do not follow a common touchscreen event sequence at the moment, which makes it nearly impossible to deal with.

On “touch explore”, some browsers like Firefox fire :hover but not :focus, some fire :focus, some do neither… And on top of that, they create their own fake outline outside of the CSS presentation scope. Honestly, as a dev, I sometimes don’t know where my accessibility responsibility stands, and whether I am supposed to show a focus ring or not. Without a predictable environment and a properly defined sequence of events, with guarantees that it won’t change, it’s a crapshoot. And the prospect of a double focus ring looks terrible and redundant.

Just a quick late-night brain dump, but perhaps what we need in addition here is a way to create a bridge of understanding between the forced outlines that AT or the device environment applies, and our CSS :focus styling.

In other words, if we could have pseudo-class and/or MQ modalities such as assisted-focus or hover-intent, we could have semantics like:

button:focus:not(:assisted-focus){} /* I am handling the focus ring */


button:focus:assisted-focus { outline: none; } /* no double focus outlines */

and a less input-centric modality approach like:

@media (modality: hover-intent) { /* :hover rules meant only for genuine hover */ }

This could perhaps offer better tooling to deal with both touch simulated :hover(s) and :focus.


(note, that was actually @hexalys, not @yoavweiss, it’s just discourse quote weirdness I can’t seem to fix)


  1. I don’t feel like it’s actually “expensive”: our polyfill is all of 2.5k before minification or gzip… It’s all of 69 lines, and like 10 of those are whitespace :wink: If that’s expensive, it’s a pretty high bar for calling something inexpensive. The CSS WG is also putting together custom MQs, which would make it smaller still if that beats us to being in widespread deployment.
  2. Currently, if you don’t include the prollyfill, it works as it always has. If you do, it’s specifically because you want to handle things better. Note that the prollyfill does a :not() on keyboard here, not just a blind rule, just like in the article. If natively implemented, I would expect UA sheets to add the appropriate MQs to do the right things by default (they already do this, it’s just not via MQ), and then the point seems to be moot.
  3. Any author styles can already do (and frequently do) :focus { outline: none; }. At least now they can use an MQ to say “keyboard” or “not keyboard”, and old browsers would ignore them.

In other words, I don’t really think this is an issue - it feels like a red herring.


I agree @patrick_h_lauke had interesting thoughts about whether we could abstract it away from being keyboard, I’m hoping he shares them here. I’m not entirely sure that I agree or disagree - in fact - I kind of wish we had something like attribute selector partial matching here so we could go from less to more specific or something. I do think there is value in treating similar things similarly at some level, and I also think there is value in exposing as much as we can so that the community can actually help figure out how to adapt as new things arise.

Specifically with regard to assisted-focus and hover-intent I definitely don’t understand their meaning/proposal… Explain?


I agree on point 1. Sorry if I wasn’t specific enough. What I mean by “expensive” is a true polyfill (along the lines of an ajax call re-parsing the CSS). I wasn’t applying that argument to the prollyfill. Perhaps because I am not particularly in love with the approach and the use of a custom non-valid attribute…

I guess I am just arguing in favor of a principle that leaves the default focus ring alone, and addresses the cases where I need to remove it on a case-by-case basis. Because once you do that for modality: keyboard, it becomes part of your fundamental CSS approach for everything. I don’t really like that. But it’s not a major concern. I can always come up with an alternative prollyfill reversing that approach.

PS: I’ll definitely explain the assisted-focus and hover-intent suggestion in depth when I get a chance, in a few days or within a week’s time. I feel I am on a good track with the idea, but it’s going to take a very long description to explain why it’s needed, what it solves and how it would work. I need to sit on it for a bit, and think through half a dozen use cases to see if the concept can hold up.


We did this with lower fidelity for a few specific reasons: 1) we’re not far enough along to seriously assume that what we have here could make the jump from prollyfill (speculative) to polyfill without feedback and changes; 2) we can illustrate the usefulness in a practical way with a very small amount of code and allow developers to actually use it in production apps, which is really important if we want participation and feedback; 3) upcoming developments in custom MQs would make a high-fidelity version of whatever design is ultimately settled on similarly small.

With regard to the non-valid attribute, would it help if we made it a data- attribute? I believe we actually did this at one point in the evolution, I’m not entirely sure why we didn’t at least dasherize it, which would at least serve the same purpose and virtually guarantee it was safe. That’s perfectly valid feedback for the prollyfill I think, though you and I appear to be some of the only ones actually worried about this - it’s a pretty common thing actually. If people think that is an issue we should change, I’m happy to adapt it - also, it’s a github project so you can open issues or send simple technically minded pulls about the prollyfill itself.


I’ve done a similar “experiment” called focus-source for pretty much the reasons outlined in the article - by which I’m agreeing that there is a real need for this.

If I understood this correctly, the modality (or input-modality or interaction-modality) MediaQuery is supposed to change immediately whenever I change my mode of input (say, type something with my keyboard “keyboard”, then click something with my mouse “pointer”). That means that a focus-style would not be “sticky” and that may lead to confusing UI.

In my focus-source experiment I identified 3 modalities (I called it “interaction-type”): keyboard, pointer and script. The reason for “pointer” was - as has been pointed out by hexalys - that :focus needs to remain the default style for backward compatibility reasons. My applications usually focus a container/wrapper to act as a sequential focus navigation starting point, so I added “script” to allow the CSS to identify “focus set by application”.
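To make the three interaction types and the “sticky” concern concrete, here is a minimal sketch. It is not the actual focus-source code: the class name, the method names and the 100 ms window used to decide whether an input event caused the focus change are all assumptions made up for illustration.

```typescript
// Hypothetical sketch, not taken from focus-source or the prollyfill.
type FocusSource = "keyboard" | "pointer" | "script";
type InputKind = "keyboard" | "pointer";

class FocusSourceTracker {
  private lastInput: InputKind | null = null;
  private lastInputAt: number = -Infinity;
  private current: FocusSource | null = null;
  private readonly windowMs: number;

  // windowMs is an assumed heuristic: input older than this is treated as
  // unrelated to the focus change, so the focus is attributed to script.
  constructor(windowMs: number = 100) {
    this.windowMs = windowMs;
  }

  // Call on every key press or pointer interaction.
  recordInput(kind: InputKind, timeMs: number): void {
    this.lastInput = kind;
    this.lastInputAt = timeMs;
  }

  // Call when an element gains focus; the source is decided once, here.
  focusGained(timeMs: number): FocusSource {
    const last = this.lastInput;
    const recent = timeMs - this.lastInputAt <= this.windowMs;
    const source: FocusSource = recent && last !== null ? last : "script";
    this.current = source;
    return source;
  }

  // Call when the element loses focus.
  focusLost(): void {
    this.current = null;
  }

  // The source stays fixed ("sticky") until focus is lost.
  currentSource(): FocusSource | null {
    return this.current;
  }
}

const tracker = new FocusSourceTracker();
tracker.recordInput("keyboard", 1000);           // Tab pressed
const viaTab = tracker.focusGained(1005);        // "keyboard"
tracker.recordInput("pointer", 2000);            // pointer used while focus stays put
const stillKeyboard = tracker.currentSource();   // still "keyboard"
tracker.focusLost();
const viaScript = tracker.focusGained(5000);     // no recent input: "script"
```

The point of deciding the source only in focusGained is that later pointer activity does not flip the style of an element that is already focused, which is the behaviour a media query that changes immediately would not give you.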

Hexalys hinted at assisted-focus to provide the information if some non-document-accessible styling was being applied by an AT. I think that could/should be what :-moz-focusring is about.

I don’t like the MediaQuery idea very much. I’d have preferred a simple pseudo-class :keyboard-focus (that does not change while an element has focus). But the MQ would play nicely with :focus-within, where a simple pseudo-class would not.

I’m not very fond of the idea to define “keyboard-relevant input elements” (as the polyfill shows) - considering custom elements. The supports-modality="keyboard" attribute is technically not necessary, as anyone could achieve the same effect with current CSS functionality. Also using an attribute like that is limiting to single values - what happens when this proposal is extended to support voice input? Do we want to foster multi-value attributes like supports-modality="keyboard voice"?
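For what a multi-value attribute would mean in practice, here is a small sketch. The helper name supportsModality is hypothetical, and treating the value as a whitespace-separated token list is an assumption (it mirrors how the CSS [attr~="value"] selector matches); nothing in the proposal specifies this.

```typescript
// Hypothetical helper: treats the attribute value as whitespace-separated
// tokens, the same way the CSS [attr~="value"] selector does.
function supportsModality(attributeValue: string | null, modality: string): boolean {
  if (attributeValue === null) {
    return false; // attribute not present on the element
  }
  const tokens = attributeValue.split(/\s+/).filter((t) => t.length > 0);
  return tokens.indexOf(modality) !== -1;
}

const singleValue = supportsModality("keyboard", "keyboard");       // true
const multiValue = supportsModality("keyboard voice", "voice");     // true
const notListed = supportsModality("keyboard voice", "pointer");    // false
const partialWord = supportsModality("keyboard", "key");            // false: whole tokens only
```

Every consumer of the attribute (CSS selectors and script alike) would have to agree on this token handling, which is part of why extending a single-value attribute later is awkward.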

I had trouble following the proposal and discussion because I was centered on altering :focus styles, rather than “generically dealing with input modes”. I think the former - influencing focus styles - is a problem with a simple enough solution. Dealing with “interaction modality” in terms of, say, an FPS (First Person Shooter), is quite a different beast. Is there only one mode, or can keyboard and pointer be used simultaneously? Is the mode constantly switching back and forth between keyboard and pointer if I use the mouse to aim, but the space key to shoot? Is this even relevant to 95% of web sites/applications? I have all sorts of problems wrapping my head around this.


I actually think the focus ring is the special case in a primarily pointer-driven UI - however, as others have pointed out, it’s dangerous to assume that it only applies to keyboards, since other devices (e.g. a D-pad) also need a concept of focus.


Have another look at the proposal and maybe the prollyfill, and play with the demo. Modality is determined algorithmically. Currently the proposal only spells out the algorithm for keyboard (and, logically, “not keyboard”), but effectively it is based on what just happened and what you are very, very likely to do next because of where it happened. For example, even if you click on an <input>, the modality becomes keyboard, because the only way to interact with it is by sending keys to it.
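A rough sketch of that decision, to show the shape of the algorithm rather than its real implementation. The type and function names are invented for illustration, “pointer” stands in for “not keyboard”, and the targetExpectsKeys flag is an assumed simplification of however the prollyfill decides that an element can only be operated by typing.

```typescript
// Hypothetical sketch: none of these names come from the proposal or the prollyfill.
type Modality = "keyboard" | "pointer"; // "pointer" stands in for "not keyboard"

interface InteractionEvent {
  kind: "keydown" | "pointerdown";
  // True when the target can only be operated by sending keys to it,
  // for example a text <input>.
  targetExpectsKeys: boolean;
}

// Decide the modality from what just happened and where it happened.
function nextModality(event: InteractionEvent): Modality {
  if (event.kind === "keydown") {
    return "keyboard";
  }
  // A click on a text field still yields "keyboard": typing is what comes next.
  return event.targetExpectsKeys ? "keyboard" : "pointer";
}

const clickOnButton = nextModality({ kind: "pointerdown", targetExpectsKeys: false });   // "pointer"
const clickOnTextInput = nextModality({ kind: "pointerdown", targetExpectsKeys: true }); // "keyboard"
const tabKey = nextModality({ kind: "keydown", targetExpectsKeys: false });              // "keyboard"
```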


Perhaps the modalities should be “pointer” and “focus” (with the contrast being that “pointer” modality does not use an element-centric “focus” model)? Or perhaps “focus: element” and “focus: ambient” (the latter referring to pointer-like interaction models where focus is not strongly tracked)?

I think this is actually a good point: this topic should primarily be about focus modality, since that’s the actual use case it’s trying to speak to - bigger ideas like general input modality are a confusion of concerns, and liable to cause problems through short-sighted assumptions. (Example: look at what happened when the iPhone came out, and every site that saw a “mobile” UA assumed “mobile” meant “low capability”, so the iPhone had to hide its mobile status as much as possible, outright ignoring styles that had been defined for “mobile” modalities.)

Also, this opens the door (at least in terms of discussion) for more granular notions of focus for input modalities that are better at gauging intent, like eye/hand tracking.

Additional use cases for InputDeviceCapabilities

Also, one reason leading me to the conclusion that the focus ring isn’t the special case is consistency: we can’t reliably reproduce the default focus ring. Even if the color is faithfully reproduced via -webkit-focus-ring-color or with browser-specific rules, that browser or user default may change.

A custom focus ring, possibly styled differently on every site, especially for non-input elements, sounds like a very inconsistent and possibly annoying experience for keyboard-only users. But I am open to being shown otherwise by an accessibility user study or more data on this…


I think this is an interesting direction, although potentially focus isn’t the best name since it’d be easy to confuse it with the existing meaning of :focus (imagine a discussion talking about the :focus pseudo-selector in the context of the focus modality - it all gets a little Who’s On First).

We actually discussed the notion of simply not matching :focus unless we gauged (via the mechanisms discussed in the article) that it was likely to be useful; however, we felt that this would unfortunately potentially break things. (This would honestly still be my ideal world scenario, I think.)


Felt, or saw? I honestly can’t think of anything I could point to that this would break. Maybe experiment with it, with something like a browser extension (akin to what @paul_irish is proposing in this lazyweb-requests issue for page caching)?


I don’t think treating focus styling as the special case negates this. In fact, one of the goals which led us to this proposal was to remove incentives for removing the default styling in the first place. The idea would be that the browser (via the UA stylesheet) would apply the focus ring only when it will be useful.


Could be an option. The case we were worried about breaking is where someone detects the focused element by using matches(':focus') or similar (although I can’t seem to make that work…)


Anything imperative that uses selectors here is prone to breakage, in fact. Authors may currently want to find element #x if it has focus, in an event handler or whatever, and take different code paths. Changing the results of :focus from whatever interoperable terms they mean today causes me concern.
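A sketch of the kind of breakage being described. Element.matches is a real DOM method, but everything else here is made up for illustration: the Matchable interface, the fakeElement helper and the focusOnlyForKeyboard switch simply model “:focus matches as it does today” versus “:focus matches only when the modality is keyboard”, so the example runs without a browser.

```typescript
// Minimal stand-in for the part of the DOM Element interface used below.
interface Matchable {
  matches(selector: string): boolean;
}

// Author code that branches on :focus, as it might in an event handler.
function handleActivate(el: Matchable): string {
  return el.matches(":focus") ? "act-on-focused" : "ignore";
}

// Hypothetical fake element. With focusOnlyForKeyboard = false it behaves like
// today (:focus matches whenever the element has focus); with true it models
// the idea of not matching :focus unless the modality is keyboard.
function fakeElement(
  hasFocus: boolean,
  modality: "keyboard" | "pointer",
  focusOnlyForKeyboard: boolean
): Matchable {
  return {
    matches(selector: string): boolean {
      if (selector !== ":focus") {
        return false;
      }
      return focusOnlyForKeyboard ? hasFocus && modality === "keyboard" : hasFocus;
    },
  };
}

// Same element state (focused via a click), same author code, different result.
const today = handleActivate(fakeElement(true, "pointer", false));   // "act-on-focused"
const changed = handleActivate(fakeElement(true, "pointer", true));  // "ignore"
```

Nothing in the author's code changed between the two calls, which is why altering what :focus matches is riskier than adding a new media query or pseudo-class alongside it.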