XSS prevention in the Browser by the Browser

No, I mean that the developer marks a certain area inside the page where the browser behaves as if JS were disabled globally, for the whole browser. CSP does not give the web developer this kind of control.

I think that relying on just CSP to protect from XSS is inherently wrong.

Even a small change to the CSP necessitates re-testing the whole page to ensure that XSS is still impossible.

The cost of such a security review mounts.

I think that putting a security measure close to the place it is meant to protect makes that measure more reliable.

And more modular. Because CSP is enforced on the whole page, you need to make sure imported code still works if you took it from another project. But if you can see directly which protection mechanisms a particular snippet uses, and that protection doesn’t require changing the site’s overall protection, you can import the snippet without worrying that it might be broken by your CSP or other global XSS protection.

If you say that CSP does all I propose, I don’t oppose that notion. But if you try to convince me that CSP is the best way to prevent XSS in all cases, you’ll be wasting your time.

Thus even if privileges are not helpful today, locally disabling JS is.

“Locally disabling JS”

I want to explain a little bit more about how I imagine the “locally disabling JS” feature should work.

Use of this

I think that if the browser disables JS in those areas of a page that output user input, XSS wouldn’t execute even if an attacker exploited some parser differential in sanitizers and other XSS prevention tools.

How to use it (eventually)

I already made some proposals on Mozilla’s dev-security mailing list, but I’ll make a proposal here again: there should be a function called something like disableScripts. It would take a DOM element inside the document and disable JS within it:

<div id="output_user_input" 
     onload="disableScripts(document.getElementById('output_user_input')">
<noscript>
{{ user_input | safe}}
</noscript>
</div>

As you can see, I used Jinja syntax to insert user input directly, without sanitizing. If disableScripts() works, that shouldn’t be a vulnerability here. But it would still be possible to close the noscript and div tags, which would make it possible to inject XSS again. How would you solve this? With sanitizing, I think. But this sanitizing function would be much smaller:

function sanitize(html) {
    if (/<\/noscript/i.test(html)) {
        // A closing noscript tag would break out of the protected area, so treat it as XSS.
        // (Matching without the trailing ">" also catches variants like "</noscript >".)
        return "XSS-Attack detected";
    } else {
        return html;
    }
}
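For example (hypothetical inputs, runnable in a browser console), a harmless string passes through while a breakout attempt is flagged:

console.log(sanitize('Hello <b>world</b>'));                   // passes through unchanged
console.log(sanitize('</noscript><script>alert(1)</script>')); // "XSS-Attack detected"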

The advantage here is both the simplicity and that it is unambiguous: you never legitimately write a closing noscript tag when outputting user input. By the way, because JS is disabled, I think the content of a noscript tag should be displayed.

I also think that this could work very well together with Trusted Types.
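To sketch what I mean (this is just my own illustration, not an agreed API: the policy name is made up, and the CSP directive in the comment is an assumption on my part), the same noscript check could be wrapped in a Trusted Types policy so that innerHTML assignments have to pass through it:

if (window.trustedTypes && trustedTypes.createPolicy) {
  // Hypothetical policy name; together with the CSP
  // "require-trusted-types-for 'script'", raw strings could no longer be
  // assigned to innerHTML without going through this check.
  const noscriptGuard = trustedTypes.createPolicy('noscript-guard', {
    createHTML: function (input) {
      if (/<\/noscript/i.test(input)) {
        // Refuse anything that could break out of the script-disabled area.
        throw new TypeError('XSS-Attack detected');
      }
      return input;
    }
  });

  const userInput = 'Hello <b>world</b>'; // stands in for the template value
  document.getElementById('output_user_input').innerHTML =
    noscriptGuard.createHTML(userInput);
}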

What about a user extension? Something the user themselves has deemed allowed to operate on a page. Would this disabling then block those? If so, this is a no-go from the start. User needs overrule site requests.

It seems to me you are opening a vector that can only work in some cases but not in others. CSP works in the best interest of both sites and users, because user scripts can still run regardless of the CSP rules.

Your suggested method of “definitively” detecting an XSS attack is not so definitive given the user-scripts case. Therefore, it isn’t as worthwhile as you lead yourself to believe, nor would it be as worthwhile to developers at large in this context.

The user-script case? What is that? I haven’t stumbled upon that yet.

Do you think it would be possible to implement this if it is still possible for extensions to run JS inside those DOM elements where scripts were disabled by the site?

I think you had better look again at the purifier system @craig.francis referenced when this was first posted. That type of system has a much better chance of being included in browsers than this blunt-trauma approach.

Just thinking about this a bit more, while I don’t think there is a way to simply have “no JavaScript in this element” (as noted above), there is a way to say “there is no JavaScript after this point” (e.g. after the <body> starts).

So in my case, I’ve always had a simple CSP that limits where the JavaScript can be loaded from, and which also disables all inline JavaScript.

But after looking at this issue a bit more, and after a suggestion from Daniel Veditz, I’m now adding a second CSP via a <meta> tag, which will disable all <script> tags after that point; e.g.

<?php
  header("Content-Security-Policy: default-src 'none'; script-src https://example.com/js/");
?>
<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8" />
  <script src="https://example.com/js/responsive.js"></script>
  <meta http-equiv="Content-Security-Policy" content="script-src 'none'" />
  <title>XSS Example</title>
</head>
<body>
  <p>Hi <?= $_GET['name'] ?></p>
</body>
</html>

More background at: https://github.com/w3c/webappsec-csp/issues/395

If that even works, it’s most likely a bug in the engine and I wouldn’t rely on it. The meta tag should be combined with the header content, the proper merging done according to the rules, and the result applied to the entire operation of the page. A meta tag in the page shouldn’t mean “that point forward”, as far as I am aware spec-wise.

You’re possibly relying on undefined behavior, which is a problem. Don’t expect that to always work unless it is supported by the spec directly.

@Garbee, Section 3.3 in the CSP3 spec says: “Authors are strongly encouraged to place <meta> elements as early in the document as possible, because policies in <meta> elements are not applied to content which precedes them.”

And the only browser I could find that didn’t do this was Internet Explorer, because it ignores Content Security Policies in a <meta> element, which I’m fine with.

OK, as long as it is specified to work that way, then it’s acceptable. Although it seems kind of odd, since if you have a meta charset it has to re-parse what came before it, last I recall. Could be recalling that dive into the specs incorrectly, though.

I can’t work it out, but there is an issue with MS Edge 17.17134… as in, when I was testing, everything was working fine, but then I had a complaint that the JavaScript had stopped working for one customer. I could only replicate it by removing all non-essential headers, not using the refresh button (using a link back to the page instead), and removing all of the cookies for that site (that was fun).

That said, Edge 17 will be going soon - so I’m just going to remove this second Content-Security-Policy for it.


Back to your point about re-parsing: yes, <meta charset="UTF-8" /> can cause the browser to re-parse the content, which is why you should put it at the very beginning of the document (or in the Content-Type header), as that means less work if the browser needs to change the assumed character encoding.

But the HTML 5.2 spec (changing the encoding while parsing) implies this technique should be fine, as the browser should either follow step 5, and “changing the converter on the fly” (no re-parse); or follow step 6, and “navigate to the document again” (not exactly a re-parse).

And because I like to be sure, I used this code to test:

<?php
  header('Content-Type: text/html; charset=');
?>
<!DOCTYPE html>
<html>
<head>
  <title>Charset Test</title>
  <script type="text/javascript">
    console.log(document.inputEncoding);
  </script>
  <meta http-equiv="Content-Security-Policy" content="script-src 'none'" />
  <meta charset="UTF-8" />
  <script type="text/javascript">
    console.log('Blocked');
  </script>
</head>
<body>
  <p>Testing</p>
</body>
</html>

If you comment out the meta charset tag, the browser console should show something like “windows-1252” (Firefox complains about the missing character encoding), and all browsers block the second <script> tag.

Whereas keeping it in will cause the browser to re-parse (of sorts); the console will now show “UTF-8”, and browsers will still only block the second <script> tag.

Unless you’re running MS Edge, in which case I can’t quite work out what it’s doing.

The issue with MS Edge 17.17134…

If you’re using “Content-Type: application/xhtml+xml”, this two-CSP technique works (the first script tag loads, the second is blocked).

When you change to “Content-Type: text/html” (because it’s too risky to expect perfect XML on a live website), it will continue to work if you press the “Refresh” button, or press [Ctrl]+[F5].

But if you use a link to reload the page (while it’s still using “text/html”), then both script tags are blocked.

So the “Refresh” button keeps some information/state about the page in memory, and uses that when re-loading the page (concerning). In this case, it keeps using the XML parser even though the Content-Type has changed (you can verify by adding invalid XML).

And when checking, keep in mind that there is a different bug, where two requests can be sent to the server if you leave the F12 developer tools open.

<?php
  header('Content-Type: ' . (false ? 'application/xhtml+xml' : 'text/html') . '; charset=UTF-8');
?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>MS Edge Test</title>
  <script type="text/javascript">
    document.addEventListener('DOMContentLoaded', function() {
        document.getElementById('output').appendChild(document.createTextNode('1'));
      });
  </script>
  <meta http-equiv="Content-Security-Policy" content="script-src 'none'" />
  <script type="text/javascript">
    document.addEventListener('DOMContentLoaded', function() {
        document.getElementById('output').appendChild(document.createTextNode('2'));
      });
  </script>
</head>
<body>
  <!-- </div> -->
  <p><a href="./">Reload</a></p>
  <p id="output"></p>
</body>
</html>

Why is it OK to expect perfect JSON, perfect JavaScript and not perfect XML? Possibly because it’s harder to test?

It’s true that I’ve seen a lot of systems vulnerable to CDATA injection because a developer thought using CDATA sections meant they didn’t have to validate/sanitize user data, but similar problems exist with JSON.

Is it because of different user expectations when editing XML?

I’m probing this partly because it’s better to address an underlying problem than to work around it when possible.

Of course, XML is as vulnerable to XSS as HTML, or nearly so, when used for a Web page, since html:script is honoured by browsers once the html prefix is declared.
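To make the CDATA point concrete, a minimal sketch (the element name and payload are made up): the wrapper only protects until the user data itself contains “]]>”.

<!-- User data wrapped in CDATA, in the hope that this makes escaping unnecessary: -->
<comment><![CDATA[ USER_DATA_HERE ]]></comment>

<!-- If USER_DATA_HERE is:
       ]]><html:script xmlns:html="http://www.w3.org/1999/xhtml">alert(1)</html:script><![CDATA[
     the first "]]>" ends the CDATA section early, and the injected
     html:script element becomes real markup (the case mentioned above). -->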

Liam

I would prefer to go with perfect XML, @liamquin, but browsers do not put as much effort into their XML parsers (e.g. flash of unstyled content), and when content can come from a variety of locations (or even browser extensions), mistakes do happen (customers would rather have a half/mostly working page than a yellow screen of death).

That said, I still use XML mode on my local computer during development, and switch to text/html on my Demo and Live servers, as it ensures I don’t make any mistakes in my HTML output.

e.g. Ensuring all attributes are quoted:

<?php
  header('Content-Type: ' . (false ? 'application/xhtml+xml' : 'text/html') . '; charset=UTF-8');
  $user_supplied_value = 'https://example.com onerror=alert(document.title)';
?>
<!DOCTYPE html>
<html>
<head>
  <title>Bad XSS</title>
</head>
<body>
  <img src=<?= htmlentities($user_supplied_value) ?> alt="Profile Picture" />
</body>
</html>