[Proposal] A frame-level event logging mechanism for WebRTC


Problem

Some projects using WebRTC functionality have indicated that they need to record data on “source to screen” performance - that is, how much time elapses between an event (typically frame capture) occurring in “real life” on the media-generating side and that same event being presented to the user on the media-consuming side.

Approach

I’ve sketched out an approach in this repo:

It consists of a mechanism to tell the media engine to note the times at which certain events happen to a frame, and a way to get these notes back to JS. It’s intended to be predictable and not too resource-intensive.

Comments are welcome, as is guidance on the WICG process - this is my first attempt to use this forum for an API proposal.


Some questions about this proposal.

Usually, an application has no clear way to determine which frame is the important one to track. The example had a remote click and measured the first frame after it occurred. However, these can be completely asynchronous processes (think of a slide transition), in which case determining the first frame is not really feasible. Is knowing the exact frame critical, or can you specify a time window and get information on all frames in that window? Would this be performant enough to request all frames?


The current spec says to track all frames while logging is on, and to use the frame identity (RTP timestamp) to figure out which local frames and remote frames correspond to each other. From there, the application can go on to figure out which frames are “important”. Exposing more info about frames (for instance the frame size and whether or not it’s a keyframe) could also be valuable.
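
To make the matching step concrete, here is a rough sketch of how an application might pair local and remote notes by RTP timestamp and then narrow the result to a time window. The record shape (`FrameEvent`), the event names `"capture"` and `"render"`, and both function names are assumptions made up for illustration; none of them come from the proposal. It also assumes both sides report times on a comparable clock and ignores RTP timestamp wraparound.

```typescript
// Hypothetical record shape; the proposal does not define these field names.
interface FrameEvent {
  rtpTimestamp: number; // frame identity shared by sender and receiver
  event: string;        // illustrative event names: "capture", "render"
  timeMs: number;       // when the event was noted (assumed comparable clocks)
}

interface FrameLatency {
  rtpTimestamp: number;
  captureMs: number;
  latencyMs: number;
}

// Pair each local "capture" note with the remote "render" note
// that carries the same RTP timestamp.
function matchFrames(local: FrameEvent[], remote: FrameEvent[]): FrameLatency[] {
  const captured = new Map<number, number>();
  for (const e of local) {
    if (e.event === "capture") captured.set(e.rtpTimestamp, e.timeMs);
  }
  const result: FrameLatency[] = [];
  for (const e of remote) {
    if (e.event !== "render") continue;
    const captureMs = captured.get(e.rtpTimestamp);
    if (captureMs !== undefined) {
      result.push({
        rtpTimestamp: e.rtpTimestamp,
        captureMs,
        latencyMs: e.timeMs - captureMs,
      });
    }
  }
  return result;
}

// Keep only frames captured inside [startMs, endMs], for cases where
// the application knows a time window but not the exact frame.
function framesInWindow(
  frames: FrameLatency[],
  startMs: number,
  endMs: number,
): FrameLatency[] {
  return frames.filter((f) => f.captureMs >= startMs && f.captureMs <= endMs);
}

// Made-up sample data: three captured frames, one of which never rendered.
const localLog: FrameEvent[] = [
  { rtpTimestamp: 90000, event: "capture", timeMs: 1000 },
  { rtpTimestamp: 93000, event: "capture", timeMs: 1033 },
  { rtpTimestamp: 96000, event: "capture", timeMs: 1066 },
];
const remoteLog: FrameEvent[] = [
  { rtpTimestamp: 90000, event: "render", timeMs: 1120 },
  { rtpTimestamp: 96000, event: "render", timeMs: 1180 },
];

const matched = matchFrames(localLog, remoteLog);
console.log(framesInWindow(matched, 1050, 1100));
```

Frames that were captured but never rendered (dropped, or not yet delivered) simply fall out of the matched list, so the application would have to track those separately if it cares about them.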

I’m still unsure which API is the right one for retrieving the collected information - firing events every 50 ms is probably not a good idea. So this part of the proposal is still a bit vague.