Proposal: Compression Streams standard

Compression APIs are common in other languages, but are not built in to JavaScript in the browser. This is odd, as all browsers need an implementation of the “gzip” and “deflate” compression algorithms in order to implement HTTP, PNG and many other web standards.

My proposal is to first expose these two ubiquitous algorithms to JavaScript authors, but design the API so that it can be extended to new algorithms in future.

Since compression and decompression are streaming processes, they are a natural match for WHATWG Streams. They match exactly the model of a transform stream. So this is what I have based my design around.

Simple usage example:

const compressedReadableStream = inputReadableStream.pipeThrough(new CompressionStream('gzip'));

There are more examples in the explainer.
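To illustrate the transform stream model, here is a minimal round-trip sketch. It assumes an environment that exposes `CompressionStream`, `DecompressionStream`, `Blob` and `Response` as globals; the helper names `gzipText` and `gunzipToText` are made up for this example and are not part of the proposal.

```javascript
// Sketch only: gzipText and gunzipToText are hypothetical helper names.

// Compress a string with gzip and collect the output into a Uint8Array.
async function gzipText(text) {
  const input = new Blob([text]).stream();
  const compressed = input.pipeThrough(new CompressionStream('gzip'));
  // Response is used here only as a convenient way to drain a stream.
  return new Uint8Array(await new Response(compressed).arrayBuffer());
}

// Decompress gzip bytes back into a string (decoded as UTF-8).
async function gunzipToText(bytes) {
  const input = new Blob([bytes]).stream();
  const decompressed = input.pipeThrough(new DecompressionStream('gzip'));
  return new Response(decompressed).text();
}
```

Buffering the whole result is only for brevity; in real use the compressed stream could be piped onward without collecting it in memory.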

There were enthusiastic responses to the Blink Intent to Implement thread and I have seen a lot of support this week at TPAC 2019.

The most pressing use case is compressed uploads, but there are many other exciting possibilities.
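As a rough sketch of the compressed upload case: the function below gzips a JSON payload and wraps it in a `Request`. The name `buildCompressedUpload`, the URL in the usage note, and the choice to label the body with `Content-Encoding: gzip` are all assumptions for illustration; the receiving server has to be set up to decompress request bodies, which many are not by default.

```javascript
// Sketch only: buildCompressedUpload is a hypothetical helper name.
// Assumes the server accepts gzip-encoded request bodies.
async function buildCompressedUpload(url, jsonText) {
  const compressed = new Blob([jsonText])
    .stream()
    .pipeThrough(new CompressionStream('gzip'));
  // Buffer the compressed bytes so this does not depend on
  // streaming request bodies being supported.
  const body = await new Response(compressed).arrayBuffer();
  return new Request(url, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Content-Encoding': 'gzip',
    },
    body,
  });
}
```

Usage would be along the lines of `fetch(await buildCompressedUpload('https://example.com/upload', JSON.stringify(data)))`, where the URL is a placeholder.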

Currently apps which have a keen need for compression, such as analytics, include a compression library such as pako with their code. Having the facility built-in to the platform would avoid the extra download.

I would like to explore incubating a standard document with the WICG.


This was discussed at the WebPerfWG F2F meeting at TPAC, with many folks enthusiastic about the use cases that this will enable! Would be good to capture those use-cases in the explainer (and would also be good if folks here would chime in stating use-cases we missed)


As described in the original post, and in this Stack Overflow question:

Is it possible, in javascript, to have multiple download urls sent into one zip file and that zip file can be downloaded. So pretty much, on my web page, there is one button, that when clicked downloads a zip file of all the files from the download urls compressed into the zip?

I believe I’d need to use jszip or some tool like that. Is this at all possible and is there any advice on where to start?

Questions such as “Multiple download links to one zip file before download javascript” are reasons to standardize the procedure.

At least one other use case is compressing text, audio, images (media) to stream compressed data from the browser capable of being decompressed and streamed (read, read/write) as media segments, plain text, captions, subtitles, animations, speech at a different browsing context and browser.

Thank you for your response.

I believe I’d need to use jszip or some tool like that. Is this at all possible and is there any advice on where to start?

Yes, currently using jszip or a similar library is the right way to do it. Once CompressionStream has had support for “deflate-raw” added (see issue #8 on the explainer), it will be possible to create zip files with a much smaller library.
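Until then, a page could check at runtime whether a given format is available and fall back to a library otherwise. This is a sketch, relying on the constructor throwing for formats it does not support; `supportsCompressionFormat` is a made-up helper name.

```javascript
// Sketch only: supportsCompressionFormat is a hypothetical helper name.
// Returns true if the environment can construct a CompressionStream
// for the given format string, false otherwise.
function supportsCompressionFormat(format) {
  if (typeof CompressionStream === 'undefined') return false;
  try {
    new CompressionStream(format);
    return true;
  } catch (error) {
    // Unsupported formats are rejected by the constructor.
    return false;
  }
}
```

For example, `supportsCompressionFormat('deflate-raw')` could decide between a small built-in zip writer and loading jszip.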

The top-voted response to that Stack Overflow post (https://stackoverflow.com/a/37176830/2523224) seems like a good place to start. As noted in the comments, that won’t work for cross-origin requests where the server hasn’t permitted access, due to the same-origin policy. In that case, a server-side solution would be the only answer.

At least one other use case is compressing text, audio, images (media) to stream compressed data from the browser capable of being decompressed and streamed (read, read/write) as media segments, plain text, captions, subtitles, animations, speech at a different browsing context and browser.

Thank you for suggesting use cases! I plan to add more use cases to the explainer soon.

There is now a draft spec at https://ricea.github.io/compression. Please take a look and tell me what you think.

I updated the explainer with some use cases. Discourse won’t let me link to them, so you’ll need to follow the link in my original post. The list is incomplete so I would appreciate contributions.

It would be good to keep track of requirements for each use case. For example,

  • decoding zip files requires “deflate-raw” support
  • in-memory databases would benefit from a high-performance algorithm such as LZ4.

But I don’t know what would be a good place for tracking requirements.

Is there any reason https://github.com/ricea/compressstream-explainer/blob/master/README.md is not linked to from https://github.com/ricea/compression-streams/?

No good reason. I have slightly improved the README.md at https://github.com/ricea/compression-streams/. Please take a look.

I’m also excited about standardizing this. Compression is an important primitive for myriad features of the web platform, and it seems very redundant to force developers to polyfill this unexposed platform capability using JavaScript at a performance cost (both download and runtime operation).

I’m particularly interested in raw-deflate support (many file formats are based around ZIP files) and in brotli support (for performance/size/polyfill-availability reasons).

My personal use case for raw-DEFLATE would be in building a web-based viewer for Fiddler’s SAZ (Session Archive ZIP) traffic capture files… it would be great to be able to let users view the contents of these files without requiring a native client application, and especially useful on platforms (ChromeOS, macOS) where no good client exists.

I would like to transfer the standard repository at https://github.com/ricea/compression-streams/ to WICG.

The repo now lives on https://github.com/WICG/compression

Yay!!


I believe exposing data compression to JS authors is a great idea!

But I’m trying to understand why the compression methods are hard-coded?

There is already interest in more flexible compression listed on GitHub: deflate-raw, parameters and flushing, custom dictionaries, and Brotli. I also do not see any browsers that are not Brotli compatible that will implement Compression Streams.

Yes, it will take slightly longer to implement, but please either add a hard statement that this is only a draft implementation, and that there will be a more flexible API before v1.0 (e.g. get browsers to add a beta behind a flag), or just add the flexibility to the standard now.

But I’m trying to understand why the compression methods are hard-coded?

Do you mean why is it not extensible via JavaScript? That will be added at some point, see https://github.com/WICG/compression/issues/9.

Or do you mean why is the list of built-in methods specified in the standard? This is to ensure predictability, i.e. that authors can rely on the same set of methods regardless of the browser environment.

I do not see any browsers that are not Brotli compatible that will implement Compression Streams.

Chrome includes the dictionary for Brotli decompression, but not for compression. The dictionary for compression is 190KB and it’s not clear that it would get sufficient usage to be worth putting that on everyone’s phones.

Yes, it will take slightly longer to implement, but please either add a hard statement that this is only a draft implementation

As long as it is in WICG it will say “UNOFFICIAL DRAFT” across the top.

there will be a more flexible API before v1.0

This is a living standard. You can expect more features to be added in future. However, it is perfectly implementable as-is.