Related discussion: crbug.com/1018022
Images account for ~50% of transferred bytes and remain one of the biggest optimization problems and opportunities on the web. The reasons why this problem persists are numerous and complex. An incomplete list, in no particular order:
- Most users who upload media are not tech-savvy enough to know about formats, file sizes, etc. (nor should they be!). The natural flow is to pick the file from a media picker and hit upload.
- The torso and long tail of the web are not set up to bear the cost of image optimization: the economics of the long tail are to host the largest number of sites and assets at the lowest cost; format optimization, resizing, etc., are costly in CPU and storage, and so are omitted.
- Image optimization is hard™ and setting up CDNs or open source services requires awareness, technical know-how, and often a credit card.
We (the webperf community and web developers at large) have been beating our heads against this problem for over a decade but the “solution at scale” remains an unsolved challenge.
I will suggest that the only way to make progress in this space is to ensure that media leaving the device is optimized against criteria specified by the upload target. That is, the optimization should happen on the user's device, before the media is uploaded, and according to criteria specified by the service or site that is initiating the upload. Common criteria include: accepted file type, file size, aspect ratio or max width/height, and duration for video and animated images.
Prompts like the above are common on the web and are a terrible user experience. As a user faced with such a dialog: how do I resize the image to fit, how do I reduce the file size, how do I crop?
The browser can fix both the technical and cost problems faced by site owners, as well as significantly improve the experience (latency, cost) for the user who is initiating the upload.
A smarter (image) file input ~ MVP
Note: all names below are examples, subject to naming bikeshed, etc.
<input type="file" accept="image/*">
We already support some control over the inputs (via `accept`) to file upload. What is missing are the output controls, which the site owner could specify to instruct the browser on how it should assist the user. Such outputs could be…
Output filetype
<input type="file"
accept="image/*"
output="image/jpeg">
Transcoding files on the server incurs the cost of a potentially unoptimized upload for the user, as well as CPU transcoding costs for the server. In many cases the site knows the exact format it needs to receive, and the browser should be able to transcode the file before it is uploaded.
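To make the intended split between `accept` (what the user may pick) and `output` (what the site receives) concrete, here is a minimal sketch of the decision the browser would make per file. The function names and the three outcomes are hypothetical, the `output` attribute is only a proposal, and file-extension tokens in `accept` (e.g. ".jpg") are ignored for brevity.

```typescript
// Hypothetical semantics for the proposed `output` attribute; not a shipped API.
type Plan = "reject" | "passthrough" | "transcode";

// True if `mime` matches any comma-separated `accept` token such as
// "image/*" or "image/png". Extension tokens (".jpg") are not handled here.
function matchesAccept(mime: string, accept: string): boolean {
  const type = mime.trim().toLowerCase();
  return accept
    .split(",")
    .map((token) => token.trim().toLowerCase())
    .some((token) =>
      token.slice(-2) === "/*"
        ? type.indexOf(token.slice(0, -1)) === 0 // "image/*" -> prefix "image/"
        : token === type
    );
}

// Decide what to do with a picked file before it is uploaded.
function planUpload(mime: string, accept: string, output?: string): Plan {
  if (!matchesAccept(mime, accept)) return "reject";
  if (!output || output.trim().toLowerCase() === mime.trim().toLowerCase()) {
    return "passthrough"; // already in the format the site asked for
  }
  return "transcode"; // re-encode on the device, then upload
}
```

With `accept="image/*"` and `output="image/jpeg"`, a PNG would be transcoded, a JPEG would pass through untouched, and a PDF would be rejected at the picker.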
Maximum upload size
<input type="file"
accept="image/*"
output="image/jpeg"
outputMaxSize="1MB">
The browser should automatically re-encode the image on behalf of the user, with the best quality it can, against the specified limit. For example, if the user picks a 32MB image from their mobile gallery, the browser shouldn’t throw an error but should do the work on behalf of the user to meet the page-specified criteria.
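One way to read "best quality it can, against the specified limit" is a search over the encoder's quality setting. The sketch below is an illustration only: `parseSizeLimit` and `bestQualityUnderLimit` are made-up names, decimal units (1MB = 1,000,000 bytes) are an assumption the proposal does not settle, and the encoder is injected as a plain function so the search logic stays independent of any particular API. In a page today, that function could wrap something like `canvas.toBlob(callback, type, quality)`, which is asynchronous; it is shown synchronously here to keep the example short.

```typescript
// Parse a limit such as "1MB" or "500KB" into bytes.
// Assumption: decimal units; the proposal does not specify.
function parseSizeLimit(limit: string): number {
  const match = /^\s*(\d+(?:\.\d+)?)\s*(B|KB|MB|GB)\s*$/i.exec(limit);
  if (!match) throw new Error("Unrecognized size limit: " + limit);
  const units: { [unit: string]: number } = { B: 1, KB: 1e3, MB: 1e6, GB: 1e9 };
  return Math.round(parseFloat(match[1]) * units[match[2].toUpperCase()]);
}

// Binary-search the highest quality in [minQuality, 1] whose encoded size
// fits within maxBytes. `encodedSize` is caller-supplied and assumed to be
// non-decreasing in quality. Returns null if even minQuality is too large,
// in which case the caller would also need to downscale the image.
function bestQualityUnderLimit(
  encodedSize: (quality: number) => number,
  maxBytes: number,
  minQuality: number = 0.3,
  steps: number = 7
): number | null {
  if (encodedSize(1) <= maxBytes) return 1;
  if (encodedSize(minQuality) > maxBytes) return null;
  let lo = minQuality; // known to fit
  let hi = 1; // known not to fit
  for (let i = 0; i < steps; i++) {
    const mid = (lo + hi) / 2;
    if (encodedSize(mid) <= maxBytes) lo = mid;
    else hi = mid;
  }
  return lo;
}
```

The floor on quality is a design choice: below some point it is usually better to reduce dimensions than to keep lowering quality, which is why the function reports failure instead of returning an arbitrarily low setting.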
Dimension and aspect ratio constraints
<input type="file"
accept="image/*"
output="image/jpeg"
outputDimensions="1:1"> <!-- or "100x100px", "1024px"... -->
The browser should automatically resize the image against the specified width or height requirement on behalf of the user. If the aspect ratio of the input image does not match, the browser should offer a simple UI that lets the user choose what to crop. For bonus points, the browser can also provide smart crop previews to assist the user; apply and demonstrate some of that ML magic we keep hearing about!
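The geometry behind these constraints is simple, and a sketch may help pin down what a default (before any user adjustment) could look like. Everything here is an assumption for illustration: the function names are made up, the default crop is centered, and a single value like "1024px" is read as a cap on the longest side, which the proposal does not define.

```typescript
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Largest centered crop of a srcW x srcH image with aspect ratio ratioW:ratioH.
function centeredCrop(srcW: number, srcH: number, ratioW: number, ratioH: number): Rect {
  const target = ratioW / ratioH;
  if (srcW / srcH > target) {
    // Source is too wide: keep the full height, trim left and right.
    const width = Math.round(srcH * target);
    return { x: Math.floor((srcW - width) / 2), y: 0, width: width, height: srcH };
  }
  // Source is too tall (or already matches): keep the full width, trim top and bottom.
  const height = Math.round(srcW / target);
  return { x: 0, y: Math.floor((srcH - height) / 2), width: srcW, height: height };
}

// Scale so the longest side is at most maxSide; never upscale.
function fitWithin(width: number, height: number, maxSide: number): { width: number; height: number } {
  const scale = Math.min(1, maxSide / Math.max(width, height));
  return { width: Math.round(width * scale), height: Math.round(height * scale) };
}
```

For a 4000x3000 photo and a "1:1" constraint, the default crop would be the middle 3000x3000 square; a crop UI would then let the user move that rectangle. In a page today, a rectangle like this could be fed to the source-rectangle form of the canvas 2D `drawImage` call.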
A smarter (video) file input ~ beyond MVP
(Note: I would suggest we start by exploring image-oriented use cases first, but we should keep in mind that video has similar, and even more acute, challenges for sites and users.)
Many of our mobile devices are now shooting at 4K resolution, which clocks in at 10Mbps. Such uploads are costly for the user, prone to fail due to large file sizes, expensive to store, extremely expensive to manipulate on the server, and rarely what the site actually wants to serve. For these reasons, free video optimization CDNs (modulo YouTube) are not a thing on the web.
The browser is in a position to help both the user and site owner: it can downsample the video prior to upload, convert to an alternate format if necessary, and provide a UI for the user to trim prior to upload, e.g…
<input type="file"
accept="video/*"
output="video/mp4"
maxLength="10s"
outputMaxSize="10MB">
- Provide a UI to trim video to specified length prior to upload
- Enable the site to specify and enforce a maximum file size and an accepted format.
- …
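A size limit plus a length limit together imply a bitrate ceiling, which is the number an encoder actually needs. The arithmetic can be sketched as follows; the function name is made up, decimal megabytes are assumed, the 128 kbps audio reservation is an arbitrary illustrative default, and container overhead is ignored.

```typescript
// Video bitrate (bits per second) that keeps a clip of `seconds` within
// `maxBytes`, after reserving `audioBitsPerSecond` for the audio track.
// Assumptions: container overhead is ignored; the audio default is illustrative.
function videoBitrateBudget(
  maxBytes: number,
  seconds: number,
  audioBitsPerSecond: number = 128000
): number {
  if (seconds <= 0) throw new Error("Duration must be positive");
  const total = Math.floor((maxBytes * 8) / seconds); // bytes -> bits, per second
  return Math.max(0, total - audioBitsPerSecond);
}
```

For the example above (10MB, 10s) the total budget is 8 Mbps, so a 10 second clip recorded at the 10Mbps mentioned earlier (12.5MB) would not fit as-is and would need to be re-encoded at a lower bitrate, downsampled, or trimmed.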
FAQ
Doesn’t canvas already allow me to resize, re-encode? Also, WASM?
In theory, yes, some of this is possible today (squoosh.app is an example) and proposed capabilities like WebCodecs might unlock even more powerful use cases in the future. However, the fact that something is possible today, or may become possible in the future, does not automatically mean that it will be adopted at scale and by all sites.
In practice, resizing, size optimization, cropping, etc., are hard technical problems to implement. Case in point: it’s been possible to resize images via canvas for a long time, but doing so is not a common or widely used practice. The browser can and needs to own this problem if we want to see change at scale. At the same time, for those who want to pop the hood and implement their own variant: great, we’ve got APIs for you!
How have others solved it today?
This problem space is hard. In fact, entire companies have been built around it: UploadCare, FileStack, etc. What’s described in this doc is a (small) subset of the services they offer, but a critical subset that should not require a credit card or technical knowledge to integrate.
There are free image CDNs, doesn’t that solve the problem?
No, CDNs do not solve all the problems.
At upload time, the important criterion is that the media is optimized locally, before it is sent to the upload (potentially CDN) server: the user must be able to perform basic editing operations like crop, video trim, rotate, etc., locally and with low latency; the file must be (re)encoded prior to upload to maximize the likelihood of upload success and to minimize data cost for the user, as well as processing and storage costs for the server. Further, a CDN should not be a requirement for delivering a reasonable user experience on the web: perhaps a recommended one, but not required…
At serving time, yes, CDNs are and will remain an important best practice, as they can perform further device-specific optimizations (e.g. re-encode and serve different formats for various browsers), apply further customizations and transforms, provide media management, etc.