[Proposal] video.requestAnimationFrame()



Today, <video> elements provide no way to signal when a video frame has been presented for composition, nor any way to provide metadata about that frame.

We propose a new HTMLVideoElement.requestAnimationFrame() method with an associated VideoFrameRequestCallback to allow web authors to identify when and which frame has been presented for composition.


  let video = document.createElement('video');
  let canvas = document.createElement('canvas');
  let canvasContext = canvas.getContext('2d');

  let frameInfoCallback = (_, metadata) => {
    console.log(
      `Presented frame ${metadata.presentationTimestamp}s ` +
      `(${metadata.width}x${metadata.height}) at ` +
      `${metadata.presentationTime}ms for display at ` +
      `${metadata.expectedPresentationTime}ms.`);

    canvasContext.drawImage(video, 0, 0, metadata.width, metadata.height);
    video.requestAnimationFrame(frameInfoCallback);
  };

  video.requestAnimationFrame(frameInfoCallback);
  video.src = 'foo.mp4';


Presented frame 0s (1280x720) at 1000ms for display at 1016ms.

Would be great to have this functionality!

Any thoughts on generalizing the AnimationFrameProvider interface so it can support this usage?


Thanks Ken! I thought I had incorporated that feedback since you last mentioned it; hence the explainer says:

// Extends the AnimationFrameProvider mixin with the addition of a
// VideoFrameMetadata parameter on the FrameRequestCallback.

Can you elaborate on what you’re asking for? I must have misunderstood.

Ah, sorry, I hadn’t remembered to revisit the explainer.

We should get feedback from @fserb who added AnimationFrameProvider to the spec. Not all users of AnimationFrameProvider need the VideoFrameMetadata parameter, so maybe it makes sense for HTMLVideoElement’s requestAnimationFrame to be different from AnimationFrameProvider’s.

I think that this would be very useful to have.

Excellent proposal. I can see this becoming very handy for improving diagnostics like glitch detection. I support it fully.

Intent makes perfect sense and would be very valuable! How is this API different from ontimeupdate? Is there a reason the DOM event system is not sufficient? I’d expect this to be named something like onframeupdate, firing on every frame painted. Is this a question of timing? Could the explainer address these questions?

Good questions… we probably want clarity on these before proceeding.

I’ve been using the Chrome prototype to measure/control the latency of an application using MSE and have found it very useful.

Will video.requestAnimationFrame(frameInfoCallback) guarantee that every frame of an input video file or MediaStream is captured?

Sorry for the delay, I’ve been out of office. This doesn’t use an event that would always be fired since it’s expensive to preserve the video frame for the callback. E.g., holding onto a frame from a hardware decoder will slow down decoding since it can’t always issue more frames until one is returned. So the API is designed to only function when the page is actively interested in this capability.

What we could do is provide an event that does not provide the WebGL frame guarantees. I.e., you would not be able to upload the exact frame to canvas or WebGL based on the event. You would only be able to get the frame properties. This may be enough for most users.

No, you’ll only get the frames for which you’ve managed to call requestAnimationFrame in time. As mentioned in my other post, this is due to hardware decoder limitations.
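Since the callback is one-shot, the usual pattern is to re-register inside the callback and detect missed frames after the fact. A minimal sketch, assuming the metadata exposes a monotonically increasing `presentedFrames` counter (check the spec for the exact field name; `watchFrames` and `droppedBetween` are hypothetical helper names):

```javascript
// Number of frames presented between two consecutive callbacks,
// excluding the two observed frames themselves.
function droppedBetween(prevPresentedFrames, currPresentedFrames) {
  return Math.max(0, currPresentedFrames - prevPresentedFrames - 1);
}

// Re-registration pattern: each callback must re-register to receive
// the next composited frame; gaps in presentedFrames reveal misses.
function watchFrames(video, onDrop) {
  let prev = null;
  const tick = (_, metadata) => {
    if (prev !== null) {
      const dropped = droppedBetween(prev, metadata.presentedFrames);
      if (dropped > 0) onDrop(dropped);
    }
    prev = metadata.presentedFrames;
    video.requestAnimationFrame(tick); // re-register for the next frame
  };
  video.requestAnimationFrame(tick);
}
```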

This looks very cool!

Can the explainer elaborate on the execution model? Is video.rAF called once per:

  • Decoded frame?
  • Composited frame?
  • Or compositor cycle (vsync)?

Also the example suggests that the callback is invoked before video.play(). Is that intentional (i.e. can it be invoked before frames are presented)?

The video.rAF callback would be called once per composited frame; i.e., every unique composited frame will receive a callback. It is intentional that this can happen before play(), if the first frame is composited before playback begins (in Chromium this always happens for preload=auto elements, and upon visibility for preload=metadata elements).

FYI, this is now available in Chrome Canary/Dev/Beta, after turning on the Experimental Web Platform Features flag (under chrome://flags, enable-experimental-web-platform-features).

Note: it doesn’t work yet with WebRTC, but that should be fixed soon.

WICG repo created: https://github.com/WICG/video-raf/.

FYI, the spec has been recently updated: https://wicg.github.io/video-raf/

FYI, the API has been renamed to video.requestVideoFrameCallback. The updated spec can be found here: https://wicg.github.io/video-rvfc/

Thanks to @Thomas_Guilbert this has landed in Chrome 84.0.4122.0+. Thanks for the discussion everyone!

We will definitely be testing this out at Rainway.