Proposal for an API that lets a page create an image from an HTML element.
[ImageData img] = window.screenshot([HTMLElement element])
The API should only be able to create images of elements owned by the document (perhaps enforced through the same-origin policy) and, if necessary, only after asking the user for a screenshot permission.
This API would solve the problem of exporting part of a document as an image, as the user sees it in the browser. As a developer I have been asked to do this many times (e.g. export a chart, a table, etc.), and the only current solutions are to create a specific print CSS (but I don’t want to print anything, I want to download/save an image, and printing also confuses the user), or to use solutions like html2canvas (with all their pitfalls).
As chaals noted in the GitHub issue, this raises some privacy and security issues, so this definitely needs to be guarded by security measures.
Regarding this, what should happen with iframes within the document that don’t fall under the same-origin policy? And what if the user wants to black out parts of the screenshot (for privacy reasons) before the data is made available through the API?
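To make the two questions above concrete, here is a rough sketch of one possible answer: black out every cross-origin iframe, plus any regions the user picked, before the pixels are handed to the page. Everything here is hypothetical (the CaptureRect, EmbeddedFrame and PixelBuffer shapes and the redact function are made up for illustration), and it ignores sandboxed and srcdoc frames.

```typescript
// Hypothetical shapes for illustration; none of these names come from a spec.
interface CaptureRect { x: number; y: number; width: number; height: number; }
interface EmbeddedFrame { src: string; rect: CaptureRect; } // iframe position in screenshot pixels
interface PixelBuffer { width: number; height: number; data: Uint8ClampedArray; } // RGBA, laid out like ImageData

// True when `url` (resolved against the document) has the document's origin.
function isSameOrigin(url: string, documentUrl: string): boolean {
  return new URL(url, documentUrl).origin === new URL(documentUrl).origin;
}

// Paint `rect` opaque black, clipping it to the image bounds.
function blackOut(img: PixelBuffer, rect: CaptureRect): void {
  const x0 = Math.max(0, Math.floor(rect.x));
  const y0 = Math.max(0, Math.floor(rect.y));
  const x1 = Math.min(img.width, Math.ceil(rect.x + rect.width));
  const y1 = Math.min(img.height, Math.ceil(rect.y + rect.height));
  for (let y = y0; y < y1; y++) {
    for (let x = x0; x < x1; x++) {
      const i = (y * img.width + x) * 4;
      img.data[i] = 0;       // R
      img.data[i + 1] = 0;   // G
      img.data[i + 2] = 0;   // B
      img.data[i + 3] = 255; // A
    }
  }
}

// Return a copy with cross-origin frames and user-selected regions redacted.
function redact(
  img: PixelBuffer,
  documentUrl: string,
  frames: EmbeddedFrame[],
  userRects: CaptureRect[] = [],
): PixelBuffer {
  const copy: PixelBuffer = { ...img, data: new Uint8ClampedArray(img.data) };
  for (const frame of frames) {
    if (!isSameOrigin(frame.src, documentUrl)) blackOut(copy, frame.rect);
  }
  for (const rect of userRects) blackOut(copy, rect);
  return copy;
}
```

The point of doing this inside the browser, before the API returns, is that the page never sees the original pixels of the redacted areas.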
Besides the security implications, I think the currently suggested API is pretty restricted in what it can do. In my opinion, it should also allow taking a screenshot of an arbitrary rect of the document (making full-page screenshots possible) and provide options to define the output format and other options similar to the
So, I’d imagine something like this:
ImageData = window.screenshot(HTMLElement element or DOMRect rect, optional DOMString type, optional any quality)
Or two separate functions for elements and rects. The toBlob() method of the HTMLCanvasElement interface could be moved to the HTMLElement interface, and another toBlob() method could be added that takes screenshots of rects:
void window.toBlob(BlobCallback callback, DOMRect rect, optional DOMString type, optional any quality)
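As a sketch of how the type, quality and rect arguments in these signatures might be handled, the snippet below borrows the convention of the existing canvas toBlob(): PNG by default, and a quality value that is only honoured for lossy types when it is a number between 0 and 1. The helper names (normalizeEncodeOptions, clipToDocument) and the list of lossy types are my own assumptions, not part of any proposal.

```typescript
// Hypothetical helpers for the proposed (target, type, quality) arguments.
interface CaptureRect { x: number; y: number; width: number; height: number; }
interface EncodingChoice { type: string; quality?: number; }

// Assumption: these are the types for which a quality argument is meaningful.
const LOSSY_TYPES = new Set(["image/jpeg", "image/webp"]);

// Default to PNG; keep `quality` only for lossy types and only when it is a
// number in [0, 1]. Anything else is dropped so the encoder default applies.
function normalizeEncodeOptions(type?: string, quality?: unknown): EncodingChoice {
  const chosen = (type ?? "image/png").toLowerCase();
  if (!LOSSY_TYPES.has(chosen)) return { type: chosen };
  if (typeof quality === "number" && quality >= 0 && quality <= 1) {
    return { type: chosen, quality };
  }
  return { type: chosen };
}

// Clip a requested rect to the document; null means nothing is left to capture.
function clipToDocument(rect: CaptureRect, docWidth: number, docHeight: number): CaptureRect | null {
  const x0 = Math.max(0, rect.x);
  const y0 = Math.max(0, rect.y);
  const x1 = Math.min(docWidth, rect.x + rect.width);
  const y1 = Math.min(docHeight, rect.y + rect.height);
  return x1 > x0 && y1 > y0
    ? { x: x0, y: y0, width: x1 - x0, height: y1 - y0 }
    : null;
}
```

With helpers like these, a full-page screenshot is just the rect (0, 0, document width, document height), and an element screenshot is the element's bounding rect passed through the same path.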
I think this would be a good feature (if the privacy details can be worked out), but I think it would make sense to spec a video version at the same time (i.e. a MediaStream). E.g. getting a video of the current tab to send over WebRTC, recording activity on a page with MediaRecorder, etc.
In fact, one way of doing this would be to only spec a video feature, and rely on ImageCapture.grabFrame() for still images: https://developer.mozilla.org/en-US/docs/Web/API/ImageCapture
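Roughly, the "video only" approach would look like this for a still image: open a video track of the page, grab a single frame, then stop the track. The sketch uses small structural interfaces instead of the DOM types so it can run anywhere; in a browser, a MediaStreamTrack satisfies Stoppable and new ImageCapture(track) satisfies FrameGrabber. How the track for the current tab is obtained is exactly the part that would need to be specified, so it is left to the caller here, and captureStill is a made-up name.

```typescript
// Structural stand-ins for the browser types (assumptions for illustration):
// a MediaStreamTrack has stop(), and ImageCapture has grabFrame().
interface Stoppable { stop(): void; }
interface FrameGrabber<F> { grabFrame(): Promise<F>; }

// Take one still from a video track, then release the track.
async function captureStill<T extends Stoppable, F>(
  track: T,
  makeGrabber: (track: T) => FrameGrabber<F>,
): Promise<F> {
  try {
    return await makeGrabber(track).grabFrame();
  } finally {
    track.stop(); // always end the capture, even if grabFrame() rejects
  }
}

// In a browser this would be called along the lines of:
//   const bitmap = await captureStill(track, (t) => new ImageCapture(t));
// where `track` is a video track of the page obtained by whatever the
// video feature ends up being.
```

The downside compared with a dedicated screenshot call is that a capture session has to be started and torn down for every still, which is why stopping the track in a finally block matters.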
Could you clarify the use case? Why would the user need a screenshot of a chart or table? Isn’t a text format preferable?
Text formats e.g. CSV will ignore table styles, merged cells, etc.
But an HTML document with inlined CSS should be capable of achieving any design. I guess my question is, what does the user intend to do with the screenshot? Send it via email? Store it locally? Embed it in documents?
My intuition tells me that a website should encourage text formats (HTML, SVG, etc.) over bitmap images, because the latter don’t provide any form of accessibility.
My use case was to embed a visually rendered version in my word processor.
Otherwise, Google uses in-browser screenshots (taken by a library) for sending feedback.
One good use case is making documentation, guides, tutorials, etc. where you want to have screenshots, possibly with additional content drawn on top, like circles and arrows to emphasize particular areas. As @SaschaNaz suggests, these can be pasted into other documents (even HTML pages) to assemble documentation. Being able to do this directly from the page makes it easier to provide a feature for this, or to automate it.
JFYI, a similar discussion in terms of privacy and security is happening now for the WEBGL_texture_source_iframe extension for WebGL, which will allow using a rendered iframe as a texture in a WebGL app. Given that in WebGL a developer can read data back from the GPU, it will also allow obtaining image data for an iframe.
Here’s the PR and related issue in Khronos GH:
IIUC, the discussion boils down to a safe attribute that, if set on an iframe, will properly restrict what’s allowed in the iframe (i.e., turn off the :visited CSS pseudo-class, forbid non-same-origin images, etc.).
This sort of primitive is very useful on other platforms for creating efficient “visual” clones of views/elements.
One important use case would be transitioning to a blurred version of an element: you would create a snapshot/screenshot of the element, blur the screenshot, and then just animate the opacity of the blurred screenshot.
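To illustrate the snapshot, blur, fade sequence, here is a small sketch that works on a raw RGBA buffer. It assumes the snapshot arrives in an ImageData-like layout, uses a naive box blur as a stand-in for whatever blur the platform would really apply (in a browser this would more likely be a CSS filter on the snapshot), and all names are made up.

```typescript
// Hypothetical RGBA buffer with the same layout as ImageData.
interface PixelBuffer { width: number; height: number; data: Uint8ClampedArray; }

// Naive box blur: each output pixel is the mean of the (2r+1) x (2r+1) window
// around it, with the window clipped at the image edges.
function boxBlur(src: PixelBuffer, radius: number): PixelBuffer {
  const { width, height, data } = src;
  const blurred = new Uint8ClampedArray(data.length);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const sum = [0, 0, 0, 0];
      let count = 0;
      for (let yy = Math.max(0, y - radius); yy <= Math.min(height - 1, y + radius); yy++) {
        for (let xx = Math.max(0, x - radius); xx <= Math.min(width - 1, x + radius); xx++) {
          const i = (yy * width + xx) * 4;
          for (let c = 0; c < 4; c++) sum[c] += data[i + c];
          count++;
        }
      }
      const o = (y * width + x) * 4;
      for (let c = 0; c < 4; c++) blurred[o + c] = sum[c] / count;
    }
  }
  return { width, height, data: blurred };
}

// Opacity of the blurred overlay `elapsed` ms into a linear fade of `duration` ms.
function overlayOpacity(elapsed: number, duration: number): number {
  return Math.min(1, Math.max(0, elapsed / duration));
}
```

The blur is computed once on the snapshot; during the transition only the overlay's opacity changes per frame, which is what makes this cheaper than blurring the live element.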
I would like to add another use case for this API. Exago BI lets users view dashboards and reports. A frequent request we get is to be able to generate a snapshot image from whatever the current view is. The current methods are to either:
- Use server-side rendering with a tool like Puppeteer (more architecture and development burden)
This kind of API would be super valuable for us to give users a quick, stable way of capturing a view of their data and sending it to a colleague or otherwise saving for later.
I would also like to expand the conversation with some further considerations:
Specifically on security:
- How do we control cross-origin scripts’ use of this API? (Our use case at Exago would likely need a script on a different origin to be able to screenshot the page.)
- Would need to work out how permission is granted and potential interoperability with the Permissions API
- I like the “bounding rect” API more than the “Element” API, and I think skipping the Element version may significantly reduce complexity in the design. We should probably state explicitly that no browser chrome (context menus, etc.) is included in the resultant image.
- Should there be an option to include or exclude the mouse cursor?
- We should probably define when in the frame lifecycle the image is collected. Is it a valid use case to use requestAnimationFrame with the screenshot function to capture video of user interaction?
- As was suggested above, the return type and options for image type and quality should be worked out
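To give the cross-origin script and permission questions in the list above something concrete to argue about, here is one possible decision rule, written as a sketch. It assumes the three permission states used by the Permissions API ("granted", "denied", "prompt") and invents an opt-in list of origins that the document itself declares as allowed to capture it; that opt-in list, and the names decideCapture and CaptureDecision, are purely hypothetical.

```typescript
// The three states mirror the Permissions API; everything else is hypothetical.
type ScreenshotPermission = "granted" | "denied" | "prompt";
type CaptureDecision = "allow" | "ask-user" | "block";

// Decide whether a script may capture the document.
// `allowedOrigins` is an invented opt-in list declared by the document for
// third-party scripts (e.g. an embedded vendor SDK served from another origin).
function decideCapture(
  scriptUrl: string,
  documentUrl: string,
  permission: ScreenshotPermission,
  allowedOrigins: string[] = [],
): CaptureDecision {
  if (permission === "denied") return "block";
  const scriptOrigin = new URL(scriptUrl).origin;
  const trusted =
    scriptOrigin === new URL(documentUrl).origin ||
    allowedOrigins.includes(scriptOrigin);
  if (!trusted) return "block";
  return permission === "granted" ? "allow" : "ask-user";
}
```

Under this rule a vendor script on another origin is blocked unless the embedding page opts in, and even then the user's permission state still applies.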
I’ve bootstrapped an explainer here. I’m happy to add collaborators to the repo if anyone wants to contribute!