Media timed events API for MPEG DASH MPD and emsg events


#1

There is a need in the media industry for an API to support metadata events synchronized to audio or video media, specifically for both out-of-band event streams and in-band discrete events (e.g., MPD-carried and emsg events in MPEG DASH: http://standards.iso.org/ittf/PubliclyAvailableStandards/c065274_ISO_IEC_23009-1_2014.zip).

These media timed events can be used to support use cases such as ad insertion or presentation of supplemental content alongside the audio or video.

On resource-constrained devices such as smart TVs and streaming sticks, parsing media segments to extract event information carries a significant performance penalty, and it can delay UI rendering updates if the parsing is done on the UI thread. It can also reduce the battery life of mobile devices. Given that the user agent has to parse the media segments anyway, parsing them a second time in JavaScript is an expensive overhead that could be avoided.
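To illustrate the kind of work an application has to do in script today, here is a minimal sketch of pulling one version 0 emsg box out of a segment. The function names (parseEmsg, readCString) and the returned field names are my own, not from any spec or library. It assumes the box starts at byte 0 of the buffer and uses a normal 32-bit size, and it does not handle the extended size forms or any other box version.

```javascript
// Read a null-terminated UTF-8 string starting at `offset`.
// Returns the decoded text and the offset just past the terminator.
function readCString(bytes, offset) {
  let end = offset;
  while (end < bytes.length && bytes[end] !== 0) end++;
  const text = new TextDecoder("utf-8").decode(bytes.subarray(offset, end));
  return { text, next: end + 1 };
}

// Parse a single version 0 'emsg' box located at the start of `buffer`
// (an ArrayBuffer). Hypothetical helper for illustration only.
function parseEmsg(buffer) {
  const bytes = new Uint8Array(buffer);
  const view = new DataView(buffer);

  // Box header: 32-bit big-endian size, then the four-character type.
  const size = view.getUint32(0);
  const type = String.fromCharCode(bytes[4], bytes[5], bytes[6], bytes[7]);
  if (type !== "emsg") throw new Error("not an emsg box: " + type);

  // FullBox header: 8-bit version, then 24 bits of flags (ignored here).
  const version = view.getUint8(8);
  if (version !== 0) throw new Error("unsupported emsg version: " + version);

  // Version 0 body: two C strings, four 32-bit fields, then the payload.
  const scheme = readCString(bytes, 12);
  const value = readCString(bytes, scheme.next);
  const pos = value.next;

  return {
    schemeIdUri: scheme.text,
    value: value.text,
    timescale: view.getUint32(pos),
    presentationTimeDelta: view.getUint32(pos + 4),
    eventDuration: view.getUint32(pos + 8),
    id: view.getUint32(pos + 12),
    messageData: bytes.slice(pos + 16, size), // copy of the payload bytes
  };
}
```

In practice the application also has to locate the emsg boxes within each fetched segment before appending it to the media element, so every segment ends up being walked twice: once in script and once by the user agent.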

The DataCue API has been previously discussed as a means to deliver in-band event data to Web applications, but this is not implemented in mainstream browser engines and is therefore not reflected in the WHATWG living HTML specification. There is previous discussion here, and an earlier liaison statement from HbbTV here. Whether DataCue should be taken up again or another API should be developed, we believe there is a recognized need for extending the existing HTML specification to assist web applications in being able to properly process and render media-timed events.

The Media & Entertainment Interest Group has a draft use case and requirements document here. Please note that this document describes a broader range of topics, and we only propose to work on MPEG DASH MPD and emsg events at WICG at this stage.

We ask for your support in progressing this, and look forward to working with you.


#2

The DataCue API has been previously discussed as a means to deliver in-band event data to Web applications, but this is not implemented in mainstream browser engines and is therefore not reflected in the WHATWG living HTML specification.

WebKit supports DataCue.

The original interface was extended with two attributes to support non-text metadata, type and value:

interface DataCue : TextTrackCue {
    attribute ArrayBuffer data; // Always empty

    // Proposed extensions.
    attribute any value;
    readonly attribute DOMString type;
};

https://trac.webkit.org/browser/webkit/trunk/Source/WebCore/html/track/DataCue.idl

type: A string identifying the type of metadata:

"com.apple.quicktime.udta" - QuickTime User Data
"com.apple.quicktime.mdta" - QuickTime Metadata
"com.apple.itunes" - iTunes metadata
"org.mp4ra" - MPEG-4 metadata
"org.id3" - ID3 metadata

value: An object with the metadata item key, data, and optionally a locale:

value = {
    key: String
    data: String | Number | Array | ArrayBuffer | Object
    locale: String
}
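As a rough sketch of how a page might consume these cues, assuming the WebKit extensions described above (type and value) and the standard TextTrack events: describeDataCue is a made-up helper name, and the choice to filter on "org.id3" is just for illustration. The browser wiring is guarded so the snippet also loads outside a page.

```javascript
// Summarise one WebKit-style DataCue. Only the fields read here are
// required, so a plain object with the same shape works as well.
function describeDataCue(cue) {
  const { key, data, locale } = cue.value || {};
  let payload;
  if (data instanceof ArrayBuffer) {
    payload = "<" + data.byteLength + " bytes>"; // binary data: report size only
  } else if (typeof data === "object" && data !== null) {
    payload = JSON.stringify(data); // Array or Object
  } else {
    payload = String(data); // String or Number
  }
  return {
    type: cue.type,
    key: key,
    payload: payload,
    locale: locale || null, // locale is optional
    startTime: cue.startTime,
  };
}

// Browser-only wiring: watch for in-band metadata tracks on the first
// <video> element and log ID3 cues as they become active.
if (typeof document !== "undefined") {
  const video = document.querySelector("video");
  if (video) {
    video.textTracks.addEventListener("addtrack", (event) => {
      const track = event.track;
      if (track.kind !== "metadata") return;
      track.mode = "hidden"; // fire cue events without rendering anything
      track.addEventListener("cuechange", () => {
        const cues = track.activeCues;
        for (let i = 0; i < cues.length; i++) {
          if (cues[i].type === "org.id3") {
            console.log(describeDataCue(cues[i]));
          }
        }
      });
    });
  }
}
```

Since type and value are not part of the HTML specification, a page that also needs to run in other engines would have to feature-detect them before relying on this.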

This simple WebKit layout test loads various types of ID3 metadata from an HLS stream.

For more information, see this session from WWDC 2014.