Use <iframe> contents if srcdoc is present but empty


#1

srcdoc is a much-needed attribute, but specifying an entire HTML document inside an attribute is clumsy. On the other hand, I understand that just using the <iframe>’s contents was not feasible, because all the old markup uses <iframe> contents as a fallback for UAs that do not support iframes.

Being able to use srcdoc as a flag alleviates this compat problem, and enables us to use the <iframe> contents as the srcdoc.

I’m not sure whether any UAs do weird things when there’s an entire other document inside <iframe>, but since they do not process its contents, I suspect it’s probably OK. The exception is when the inner document itself contains iframes, in which case UAs will probably treat the inner </iframe> closing tag as the closing tag for the outer iframe. In those cases, or in cases where this might happen (e.g. dynamically generated content), authors can always put the document inside the srcdoc attribute instead.
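
To make the nested </iframe> concern concrete, here is a minimal sketch in Python of the check a generator of dynamic content might run before choosing between the two forms. The helper name `can_inline_in_iframe` is hypothetical, and the check is deliberately conservative: it rejects any document containing the text "</iframe" in any letter case, even in places where a real tokenizer would not treat it as an end tag.

```python
def can_inline_in_iframe(document_html: str) -> bool:
    """Return True if the document could sit between <iframe> and </iframe>.

    Hypothetical helper, not part of any spec or library. The contents of
    <iframe> are not parsed as markup, so the first "</iframe" sequence
    would end the outer element. This check is conservative: it also
    rejects harmless occurrences, e.g. inside a comment or a string.
    """
    return "</iframe" not in document_html.lower()


def choose_form(document_html: str) -> str:
    """Pick the proposed content form when safe, else the srcdoc attribute."""
    if can_inline_in_iframe(document_html):
        return "contents"
    return "attribute"
```

For example, `choose_form("<p>Hello</p>")` picks the contents form, while any document that embeds its own iframe falls back to the attribute.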


#2

I’m sorry if I’m missing something, but what’s the use case here, and why is it a much-needed attribute?


#3

Ok, I think I get your point now: use the content between the iframe tags, like this:

<iframe srcdoc=""><p>Hello how are you</p></iframe>

This would basically use the innerHTML of the iframe as though it were the value of srcdoc - correct?


#4

I like the idea, and it would make the srcdoc easier than having to put all of the HTML into a single attribute (it can get really big).

Just wondering though: how many browsers don’t support (or ignore) iframes, probably because they are really old? Iframes are typically used to host/isolate potentially more problematic content (e.g. using the sandbox attribute to limit what it can do), so in those browsers you would now have this potentially dangerous code running in the main DOM… I usually use srcdoc to contain the dangerous HTML, and the iframe content to put in some safe fallback for those older browsers.

I’m also wondering about the pre-scanner that browsers use to look for resources later in the document (e.g. while waiting for some JS to load, they look ahead for any other images/JS to fetch)… I’ve not looked at the implementation, but I suspect these are very quick/simple, so I’m not sure they will be clever enough to notice that the HTML tags are within an iframe, and they might start downloading those resources (even where the iframe has a CSP that would otherwise block them).

That said, I would like iframes to be easier to use - I think they make a great way to improve security on a website (it’s a shame the CSS working group, and the browsers themselves, still haven’t allowed the height of the iframe to change based on the content, without the use of messy JS).


#5

Iframes are often used just to isolate the styling of a page fragment from the rest of the webpage (e.g. in WYSIWYG editors in CMS admin panels, where there is no security danger at all).


#6

srcdoc already exists and is implemented everywhere except IE/Edge, that’s not what I’m proposing.


#7

Yes. This avoids the need for escaping quotes and brackets, and frankly, makes much more sense.


#8

I really don’t think there’s any browser in actual use today that doesn’t support <iframe>. It was introduced in HTML4, not exactly a cutting edge spec. Even IE4 supports it, lol.

Good point about the pre-scanner but I doubt it. If they don’t take nesting into account <template> and <script> would have the same problem (<script> is often used for templating as it has better support than <template>). It’s not the first time we’re gonna have inert markup as content of an element.


#9

Your point being?


#10

My comment was a reply to @craig.francis (for some reason, that fact is not always indicated here — probably due to some Discourse bug).

I like your idea, Lea. When the srcdoc attribute was originally introduced, one of my first thoughts was the same: why is it an attribute and not just the contents of the iframe (probably with a boolean attribute [like srcdoc="" in your proposal] instructing browsers to interpret the contents correspondingly)?


#11

Good point about the pre-scanner but I doubt it. If they don’t take nesting into account <template> and <script> would have the same problem (<script> is often used for templating as it has better support than <template>). It’s not the first time we’re gonna have inert markup as content of an element.

Yup. Not a problem for preloaders to get around that and simply ignore tokens inside iframes. (as you say, very similar to <template>)


#12

As far as I know attributes can contain newlines, etc. So, for example, this works:

<iframe srcdoc="<!DOCTYPE html>
<p>Hello, world!</p>
"></iframe>

So is the particular difficulty here not being able to use quotes in that content? Is there something else you find clumsy about this?

Maybe attributes should be able to be delimited by more complex markers like Python’s triple-quoted strings. That may be useful for other attributes with long values which mix syntax (for example inline JavaScript which contains strings of CSS or HTML.)


#13

One huge limitation of the attribute approach (beyond dev ergonomics with quotes) is that attributes are not parsed in a streaming fashion, leading to significant lag when downloading large iframes using srcdoc.

@leaverou’s suggestion would enable streaming HTML parsing for the iframe contents, resulting in better UX.


#14

Fwiw, syntax highlighting in code editors and IDEs does not work with HTML code placed in attributes. Also, contents of an attribute cannot be directly dealt with as a DOM subtree.


#15

Also, contents of an attribute cannot be directly dealt with as a DOM subtree.

To be fair, neither can inert subtrees, like the contents of <template>.


#16

Looks like a thing to fix. Do you know why it was decided to introduce such a limitation? Fwiw, this probably does not match reality for browsers that don’t support elements such as template and therefore treat their contents as regular DOM subtrees, unlike newer browsers.

Anyway, syntax highlighting in code editors is an important and useful feature that should work with the contents of such elements, unlike with attributes.


#17

Unfortunately, nope. This violates the security constraints that led to the development of srcdoc in the first place. Namely, correctly munging raw hostile text so that it can’t escape the sandbox is relatively difficult in a full HTML context, but is very easy for attributes - just escape the attribute quoting char you’re using. (And any ampersands if you want to prevent data corruption, but that’s not a security issue.) And on the JS side, using iframe.srcdoc is identical to iframe.innerHTML, but the former is safe with zero effort (no escaping required at all) while the latter is just as difficult to secure.
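
As a rough illustration of the "just escape the attribute quoting char" point, here is a minimal sketch in Python of a server-side emitter. The function name `to_srcdoc_iframe` is hypothetical, and adding the sandbox attribute is an assumption about the typical setup rather than something the escaping itself requires. It assumes the attribute is always delimited with double quotes.

```python
def to_srcdoc_iframe(untrusted_html: str) -> str:
    """Wrap untrusted HTML in a sandboxed iframe via the srcdoc attribute.

    Hypothetical helper. Only two replacements are needed for a
    double-quoted attribute value:
      * "&" -> "&amp;"  (avoids data corruption; must run first so the
                         entities produced below are not escaped again)
      * '"' -> "&quot;" (the security-relevant step: the value can no
                         longer terminate the attribute)
    Angle brackets need no escaping inside a quoted attribute value.
    """
    escaped = untrusted_html.replace("&", "&amp;").replace('"', "&quot;")
    return f'<iframe sandbox srcdoc="{escaped}"></iframe>'
```

On the JS side, no equivalent step is needed, since assigning to iframe.srcdoc sets the attribute value directly.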

A lot of thought went into common failure patterns in designing secure systems for handling hostile text, and this was the best thing we could come up with.

You’re not hand-authoring these, so what’s the difference to you between putting it in an attr and putting it in the contents? It’s just placing the emitting function a few characters one way or the other. If it’s later debugging with DevTools, that can be fixed on the DevTools side (probably similar to how Shadow DOM is exposed today).


#18

I’m not suggesting that srcdoc should go away, I’m suggesting that this should also be available. Whoever is affected by the concerns you describe can use srcdoc as an attribute. Using srcdoc for sandboxing is only one out of multiple use cases.

You’re not hand-authoring these, so what’s the difference to you between putting it in an attr and putting it in the contents?

That’s a fairly big (and false) assumption there. What makes you think that nobody ever hand-authors srcdoc?!


#19

I’m not suggesting that srcdoc should go away, I’m suggesting that this should also be available. Whoever is affected by the concerns you describe can use srcdoc as an attribute. Using srcdoc for sandboxing is only one out of multiple use cases.

That’s literally everyone. Give people a secure and insecure way of doing something and some people will choose the insecure one and get rekt. Particularly when the insecure one looks more “natural”.

That’s a fairly big (and false) assumption there. What makes you think that nobody ever hand-authors srcdoc?!

What are you doing that needs hand-authored iframe contents? The entire point of srcdoc is to help sandboxing; that’s why it was designed and implemented.

(Regardless, as @Dominic_Cooney pointed out, you can put newlines in attrs, so the only thing awkward about hand-authoring srcdoc as opposed to raw HTML is that you have to avoid or escape the quoting char you’re using.)


#20

So “hostile text” cannot be put into the attribute as is anyway, and there is no difference in terms of security between attribute and element contents.