Custom image/audio/video codec APIs

There has been a spate of image codec innovation lately, including new experimental efforts like:

These formats often show significant quality or compression improvements over prior formats. Excited developers then start petitioning browser vendors to support the latest one, but the vendors have to weigh the cost of effectively supporting the format forever, with all the security, maintenance, binary size and other overheads that entails. This makes them reluctant, and makes it very hard for new formats to gain cross-browser support - note how WebP is still only supported in Blink-based engines. The downside is that it is very difficult to take advantage of the benefits of these new technologies on the web.

However, I think this could be worked around if developers could write their own codecs. Today, polyfills are typically ugly and involve doing things like replacing every img tag on the page with a canvas, which affects the functionality of things like CSS selectors and available DOM APIs, and likely has a significant performance overhead too. This pretty much prevents them from being used in production. But what if developers could use an API to tie their codec directly into the browser’s pipeline? Consider:

navigator.registerImageDecoder("image/x-flif", arrayBuffer => {
    let imageData = new ImageData(...);
    /* code to decode a FLIF image from the given arrayBuffer... */
    return imageData;
});

Some notes about how this might work:

  • This could live on a Service Worker so decoding does not block the main thread of pages using these images.
  • Whenever the browser receives an image of type image/x-flif, it invokes the callback to get an ImageData representing the fully decoded image.
  • Only the registering page’s origin can use the format. This prevents any compatibility problems from having to support an image format for the entire web.
  • This would integrate with the browser’s own image pipeline, allowing FLIF images to be used wherever any other browser-native formats can be used. For example: HTMLImageElement, CSS images, picture element sources, decoding blobs with createImageBitmap, etc.
  • With WebAssembly and SIMD, it seems plausible that the JS decoder could approach the performance of a native one.
  • A real-world solution ought to add further callbacks: one for when metadata is received (so it can return the correct size ASAP), one for progress updates (for progressive or slowly-downloading images), and possibly others for features like animation.
  • If a format using this approach is successful, widely deployed and proven in practice, browsers could add native support for it knowing it is definitely useful, rather than speculating that it will be. An attempt to register a format that the browser already supports is simply ignored, preserving compatibility with existing pages while allowing them to take advantage of the benefits of the browser integration (e.g. maximum performance).
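
To make the notes above concrete, here is a minimal sketch of the registration rules as a plain in-memory model that runs outside a browser. Everything in it is an assumption for illustration: DecoderRegistry, NATIVE_TYPES, register() and decode() are made-up names, not any browser's API, and the stand-in decoder returns a fixed 1x1 pixel instead of parsing FLIF. It only shows two of the rules: decoders are scoped to the registering origin, and registering a natively supported type is ignored.

```javascript
// Hypothetical model of the proposed registry; not a real browser API.
// Formats this imaginary browser already decodes natively.
const NATIVE_TYPES = new Set(["image/png", "image/jpeg", "image/gif"]);

class DecoderRegistry {
  constructor() {
    // Map of origin -> Map of MIME type -> decoder callback.
    this.byOrigin = new Map();
  }

  // Returns true if the decoder was registered, false if the request was
  // ignored because the format is already supported natively.
  register(origin, mimeType, decoder) {
    if (NATIVE_TYPES.has(mimeType)) return false;
    if (!this.byOrigin.has(origin)) this.byOrigin.set(origin, new Map());
    this.byOrigin.get(origin).set(mimeType, decoder);
    return true;
  }

  // Runs the decoder registered by this origin for this type.
  // Returns null when that origin has registered no such decoder.
  decode(origin, mimeType, arrayBuffer) {
    const decoders = this.byOrigin.get(origin);
    const decoder = decoders && decoders.get(mimeType);
    return decoder ? decoder(arrayBuffer) : null;
  }
}

// Stand-in decoder: ignores its input and returns an ImageData-like
// object holding one opaque black pixel.
const fakeFlifDecoder = (arrayBuffer) => ({
  width: 1,
  height: 1,
  data: new Uint8ClampedArray([0, 0, 0, 255]),
});

const registry = new DecoderRegistry();
const registeredFlif = registry.register("https://a.example", "image/x-flif", fakeFlifDecoder);
const registeredPng = registry.register("https://a.example", "image/png", fakeFlifDecoder);
const sameOrigin = registry.decode("https://a.example", "image/x-flif", new ArrayBuffer(8));
const otherOrigin = registry.decode("https://b.example", "image/x-flif", new ArrayBuffer(8));
```

In this model the second origin gets null back, which is where a real browser would fall back to treating the image as an unsupported format.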

This approach could also be extended to encoding images (for canvas’ toDataURL()/toBlob()), and to audio and video codecs, so formats like Opus or VP9 could be used across browsers like native formats without having to wait for vendors to decide they’re worth implementing (which in some cases they may never do for political reasons, e.g. support for open codecs in Safari). For an MVP, decoding images is probably the simplest first case to focus on.
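
As a rough illustration of the encoding direction, an encoder callback would be the mirror image of the decoder: it takes pixel data and returns bytes. The sketch below assumes a made-up toy format (an 8-byte width/height header followed by raw RGBA) and hypothetical function names encodeToy() and decodeToy(); it is not a real codec and not part of any proposed API. It uses ImageData-like plain objects so it runs outside a browser.

```javascript
// Hypothetical toy format: 8-byte header (width and height as
// little-endian uint32) followed by raw RGBA bytes. Not a real codec.
function encodeToy(imageData) {
  const { width, height, data } = imageData;
  const buffer = new ArrayBuffer(8 + data.length);
  const view = new DataView(buffer);
  view.setUint32(0, width, true);
  view.setUint32(4, height, true);
  // Copy the pixel bytes in after the header.
  new Uint8Array(buffer, 8).set(data);
  return buffer;
}

function decodeToy(arrayBuffer) {
  const view = new DataView(arrayBuffer);
  const width = view.getUint32(0, true);
  const height = view.getUint32(4, true);
  // Copy the pixel bytes out into an ImageData-like object.
  const data = new Uint8ClampedArray(arrayBuffer.slice(8));
  return { width, height, data };
}

// A 2x1 image: one red pixel, one blue pixel.
const originalImage = {
  width: 2,
  height: 1,
  data: new Uint8ClampedArray([255, 0, 0, 255, 0, 0, 255, 255]),
};
const encodedBytes = encodeToy(originalImage);
const roundTripped = decodeToy(encodedBytes);
```

A pair like this is what a page would hand to the browser: the decoder for displaying images, the encoder for calls like toBlob().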

While this API itself would obviously need cross-browser support, once in place it would give web developers the freedom to experiment with the latest bleeding-edge codec technologies, and browser vendors could observe which really work in practice, without the browsers themselves having to take on the burden of permanent support for several new formats.



The same:

I say: good idea.