[Proposal] video.requestAnimationFrame()

https://github.com/dalecurtis/video-animation-frame/blob/master/explainer.md

Summary

Today <video> elements have no means by which to signal when a video frame has been presented for composition nor any means to provide metadata about that frame.

We propose a new HTMLVideoElement.requestAnimationFrame() method with an associated VideoFrameRequestCallback to allow web authors to identify when and which frame has been presented for composition.

Example

  let video = document.createElement('video');
  let canvas = document.createElement('canvas');
  let canvasContext = canvas.getContext('2d');

  let frameInfoCallback = (_, metadata) => {
    console.log(
      `Presented frame ${metadata.presentationTimestamp}s ` +
      `(${metadata.width}x${metadata.height}) at ` +
      `${metadata.presentationTime}ms for display at ` +
      `${metadata.expectedPresentationTime}ms`);

    canvasContext.drawImage(video, 0, 0, metadata.width, metadata.height);
    video.requestAnimationFrame(frameInfoCallback);
  };

  video.requestAnimationFrame(frameInfoCallback);
  video.src = 'foo.mp4';

Output:

Presented frame 0s (1280x720) at 1000ms for display at 1016ms.
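
As a rough illustration of how the two wall-clock fields in this output relate, the sketch below computes the gap between them. It assumes, based only on the log message above, that presentationTime and expectedPresentationTime are both in milliseconds on the same clock. displayDelayMs is a hypothetical helper name, not part of the proposal.

```javascript
// Hypothetical helper (not part of the proposal): how long after the frame
// was presented for composition it is expected to reach the display.
// Assumes both fields are milliseconds on the same clock.
function displayDelayMs(metadata) {
  return metadata.expectedPresentationTime - metadata.presentationTime;
}

// Values taken from the sample output above.
const delay = displayDelayMs({
  presentationTime: 1000,
  expectedPresentationTime: 1016,
});
console.log(`Frame expected on screen ${delay}ms after presentation`);
```

With the sample values this reports a 16ms gap, roughly one refresh interval on a 60Hz display.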

Would be great to have this functionality!

Any thoughts on generalizing the AnimationFrameProvider interface so it can support this usage?

https://html.spec.whatwg.org/multipage/imagebitmap-and-animations.html#animationframeprovider

Thanks Ken! I thought I had incorporated that feedback since you last mentioned it; hence the explainer says:

// Extends the AnimationFrameProvider mixin with the addition of a
// VideoFrameMetadata parameter on the FrameRequestCallback.

Can you elaborate on what you’re asking for? I must have misunderstood.

Ah, sorry, I hadn’t remembered to revisit the explainer.

We should get feedback from @fserb who added AnimationFrameProvider to the spec. Not all users of AnimationFrameProvider need the VideoFrameMetadata parameter, so maybe it makes sense for HTMLVideoElement’s requestAnimationFrame to be different from AnimationFrameProvider’s.

I think that this would be very useful to have.

Excellent proposal. I can see this becoming very handy for improving diagnostics such as glitch detection. I fully support it.

The intent makes perfect sense, and this would be very valuable! Why is this API different from ontimeupdate? Is there a reason why the DOM event system is not sufficient? I’d expect this to be named something like onframeupdate, firing on every frame painted. Is this a question of timing? Can this be addressed in the explainer?

Good questions. We probably want clarity on them before proceeding.

I’ve been using the Chrome prototype to measure/control the latency of an application using MSE and have found it very useful.

Will video.requestAnimationFrame(frameInfoCallback) guarantee that every frame of an input video file or MediaStream is captured?

Sorry for the delay; I’ve been out of the office. This isn’t an always-fired event because preserving the video frame for the callback is expensive. E.g., holding onto a frame from a hardware decoder slows down decoding, since the decoder can’t always issue more frames until one is returned. So the API is designed to function only when the page is actively interested in this capability.

What we could do is provide an event that does not provide the WebGL frame guarantees. I.e., you would not be able to upload the exact frame to canvas or WebGL based on the event. You would only be able to get the frame properties. This may be enough for most users.

No, you’ll only get the frames for which you’ve managed to call requestAnimationFrame in time. As mentioned in my other post, this is due to hardware decoder limitations.
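
Since callbacks can be missed, a page may want a rough idea of how many frames it skipped. The sketch below is one possible approach, not something the proposal provides. It assumes constant-frame-rate content and that metadata.presentationTimestamp is the media time in seconds, as the example’s log message suggests. createGapTracker and its frameDuration parameter are hypothetical names.

```javascript
// Hypothetical helper: returns a function that, fed each callback's metadata,
// estimates how many frames were skipped since the previous callback.
// Assumes constant-frame-rate content; variable-rate video would need more care.
function createGapTracker(frameDuration) {
  let previousTimestamp = null;
  return (metadata) => {
    const current = metadata.presentationTimestamp;
    let missed = 0;
    if (previousTimestamp !== null) {
      const elapsedFrames = Math.round((current - previousTimestamp) / frameDuration);
      missed = Math.max(0, elapsedFrames - 1);
    }
    previousTimestamp = current;
    return missed;
  };
}

// Simulated metadata for 30 fps content: frames 0 and 1 arrive, then frame 4.
const track = createGapTracker(1 / 30);
const results = [0, 1, 4].map((n) => track({ presentationTimestamp: n / 30 }));
console.log(results); // [0, 0, 2]
```

Inside a real callback, track(metadata) would be called before re-registering with video.requestAnimationFrame.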

This looks very cool!

Can the explainer elaborate on the execution model? Is video.rAF called once per:

  • Decoded frame?
  • Composited frame?
  • Or compositor cycle (vsync)?

Also the example suggests that the callback is invoked before video.play(). Is that intentional (i.e. can it be invoked before frames are presented)?

The video.rAF callback would be called once per composited frame. I.e., every unique composited frame will receive a callback. It is intentional that this can happen before play() if the first frame is composited before playback (in Chromium this always happens for preload=auto elements, and happens upon visibility for preload=metadata elements).
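
To make this execution model concrete, here is a small simulation. FakeVideo is a hypothetical stand-in, not the proposed API itself. It assumes callbacks are one-shot, like window.requestAnimationFrame, which the re-registration in the original example suggests. compositeFrame is an invented method that plays the role of the compositor presenting a new unique frame.

```javascript
// Hypothetical stand-in for a <video> element with one-shot frame callbacks.
class FakeVideo {
  constructor() {
    this.pendingCallbacks = [];
  }
  // Registers a callback for the next composited frame only.
  requestAnimationFrame(callback) {
    this.pendingCallbacks.push(callback);
  }
  // Invented for this sketch: simulates one unique frame being composited.
  compositeFrame(now, metadata) {
    const callbacks = this.pendingCallbacks;
    this.pendingCallbacks = [];
    callbacks.forEach((callback) => callback(now, metadata));
  }
}

const video = new FakeVideo();
const seenTimestamps = [];
let keepListening = true;
const onFrame = (now, metadata) => {
  seenTimestamps.push(metadata.presentationTimestamp);
  if (keepListening) video.requestAnimationFrame(onFrame);
};

video.requestAnimationFrame(onFrame);
// First frame composited before play(): the callback still runs.
video.compositeFrame(1000, { presentationTimestamp: 0 });
keepListening = false;
// Delivered, but the page does not re-register this time.
video.compositeFrame(1033, { presentationTimestamp: 1 });
// No callback pending, so this frame is never reported.
video.compositeFrame(1066, { presentationTimestamp: 2 });
console.log(seenTimestamps); // [0, 1]
```

The third frame goes unreported because nothing was registered when it was composited, which mirrors the "only the frames you asked for in time" behavior described earlier in the thread.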