Media Controls API

Ok, that is really interesting. It actually works very well even when I switch the screen off and on (assuming I’ve hit the ‘play’ button within the iOS native remote controls interface to initially request persistent playback though).

The iOS implementation and what is proposed at [HTMLMEDIAFOCUS] are actually very similar. iOS applies media focus logic to all HTMLMediaElement objects implicitly. If another HTMLMediaElement object begins playing out, the currently playing HTMLMediaElement object is automatically paused.
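To make the implicit rule concrete, here is a minimal sketch of single-holder media focus. It is an illustration only, not how iOS implements it: `MediaFocusManager`, `Playable` and `FakeMedia` are hypothetical names, and `Playable` stands in for the small part of HTMLMediaElement the sketch needs so that it runs outside a browser.

```typescript
// Stand-in for the part of HTMLMediaElement used here (assumption, not a real DOM type).
interface Playable {
  paused: boolean;
  pause(): void;
}

// Hypothetical helper: remembers the single element that holds media focus.
class MediaFocusManager {
  private holder: Playable | null = null;

  // Called whenever an element starts playing out.
  requestFocus(el: Playable): void {
    if (this.holder !== null && this.holder !== el) {
      this.holder.pause(); // the previous holder is paused automatically
    }
    this.holder = el;
  }

  current(): Playable | null {
    return this.holder;
  }
}

// Fake media element, for demonstration only.
class FakeMedia implements Playable {
  paused = true;
  private manager: MediaFocusManager;

  constructor(manager: MediaFocusManager) {
    this.manager = manager;
  }

  play(): void {
    this.paused = false;
    this.manager.requestFocus(this);
  }

  pause(): void {
    this.paused = true;
  }
}

const focusManager = new MediaFocusManager();
const first = new FakeMedia(focusManager);
const second = new FakeMedia(focusManager);

first.play();
second.play(); // first is paused as a side effect; second now holds focus
```

Because focus is taken on every play, the page never has to ask for it, which is the trade-off discussed next: nothing can play alongside the focused element.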

That approach comes at the expense of allowing more than one audio or video element to play out at any given time, since only one object can hold media focus and be subject to remote control events at a time. There may be cases, though, where web app ‘pings’ (e.g. ‘You received a new incoming email!’) should not pause the media and take over the remote controls, hence the opt-in proposal in [HTMLMEDIAFOCUS].

Perhaps we want to enforce this iOS behavior everywhere (i.e. on Desktop browsers too). Then we don’t really need any new API surface. Right now in Desktop browsers, two or more media elements can play out at the same time, which makes it hard to direct remote control key events toward the correct place. That’s why we proposed the opt-in ‘remotecontrols’ HTMLMediaElement content attribute to let web pages adopt the exact same behavior that is currently enforced on iOS devices.
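A rough sketch of how the opt-in variant could behave, under the assumption that only elements carrying the proposed ‘remotecontrols’ attribute compete for media focus. `MediaLike`, `onPlay`, `fakeMedia` and `play` are hypothetical helpers written for this example; only the attribute name comes from the proposal.

```typescript
// Stand-in for the parts of HTMLMediaElement used here (assumption, not a real DOM type).
interface MediaLike {
  paused: boolean;
  hasAttribute(name: string): boolean;
  pause(): void;
}

let focusHolder: MediaLike | null = null;

function currentFocus(): MediaLike | null {
  return focusHolder;
}

// Hypothetical hook, run whenever an element starts playing out.
function onPlay(el: MediaLike): void {
  // Elements that did not opt in never take (or steal) media focus.
  if (!el.hasAttribute("remotecontrols")) return;
  if (focusHolder !== null && focusHolder !== el) focusHolder.pause();
  focusHolder = el;
}

// Fake element factory, for demonstration only.
function fakeMedia(attributes: string[]): MediaLike {
  return {
    paused: true,
    hasAttribute: (name: string) => attributes.indexOf(name) !== -1,
    pause() {
      this.paused = true;
    },
  };
}

function play(el: MediaLike): void {
  el.paused = false;
  onPlay(el);
}

const music = fakeMedia(["remotecontrols"]); // opted in
const ping = fakeMedia([]); // short notification sound, not opted in

play(music);
play(ping); // music keeps playing and keeps media focus
```

With enforcement everywhere, the `hasAttribute` check would simply disappear and the ping would pause the music.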

In [HTMLMEDIAFOCUS] we propose firing appropriate events toward a focused HTMLMediaElement object when ‘Previous’ and ‘Next’ buttons are pressed in remote control interfaces. What those events would be called is negotiable (e.g. ‘next’/‘previous’, ‘seekToEnd’/‘seekToStart’, etc.).
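As a sketch of what that could look like from a page's point of view, assume the events end up being called ‘next’ and ‘previous’. `FocusedMedia` and `onRemoteKey` are hypothetical stand-ins for the focused element and the platform's key routing, and the playlist file names are placeholders.

```typescript
// Event names are placeholders; the final naming is still open.
type RemoteKey = "previous" | "next";

// Stand-in for a media element that can receive remote control events.
class FocusedMedia {
  private listeners: { [K in RemoteKey]: Array<() => void> } = { previous: [], next: [] };

  addEventListener(type: RemoteKey, listener: () => void): void {
    this.listeners[type].push(listener);
  }

  dispatch(type: RemoteKey): void {
    this.listeners[type].forEach((listener) => listener());
  }
}

// Hypothetical platform side: forward a key press to whichever element holds focus.
function onRemoteKey(key: RemoteKey, focused: FocusedMedia | null): void {
  if (focused !== null) focused.dispatch(key);
}

// Page side: a playlist that reacts to the events.
const playlist = ["intro.mp3", "track-1.mp3", "track-2.mp3"];
const playback = { index: 0 };
const player = new FocusedMedia();

player.addEventListener("next", () => {
  playback.index = Math.min(playback.index + 1, playlist.length - 1);
});
player.addEventListener("previous", () => {
  playback.index = Math.max(playback.index - 1, 0);
});

onRemoteKey("next", player);
onRemoteKey("next", player);
onRemoteKey("previous", player);
```

Since only the focused element receives the events, the page decides what ‘next’ means (next track, next chapter, seek to end) without the platform needing to know about playlists.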

iOS keeps a document alive if one of its child HTMLMediaElement objects currently has media focus (and hence access to remote control events). Both the iOS approach and [HTMLMEDIAFOCUS] assume that only one media element plays out at any given time, so the overhead here would be keeping only that one active document alive.

Do you think that is exactly the set of problems worth solving? Notably, it doesn’t say anything about previous/next keys or custom notification / lock screen UI.

I meant in general. I’ll start formalizing the use cases + reqs for standardization. We should start migrating stuff to:

I’m assigned to work on this for probably the next year as it’s high-priority for us.

May I ask why W3C and not WHATWG? I like green stylesheets and CC0.

I’m easy. I also like CC0 and green stylesheets + I like custom green logos :).

Thumbs up! I want a WHATWG spec of my own so I can make a green logo too :slight_smile:

Have you considered how WebRTC fits into this picture? For example, when you’re waiting for a call in one tab but have another tab or another application open, Chrome currently doesn’t display the remote video until you focus the tab. However, your local video is sent as soon as the connection is established. This can lead to the remote end watching you before you are aware of them. You can test this with - it’s quite creepy. It would be good if media focus could be controlled in a hidden tab.