A partial archive of discourse.wicg.io as of Saturday February 24, 2024.

A small issue I have with PWAs

OpenLocationServices
2018-06-05

Our project is Ads supported and so are many other things that are offered for free. On Android, once our app is downloaded, no other app can interfere with our app (at least that is what the Play Store rules say). In the browser, especially in the AdBlock browser, ads get blocked. One of the core reasons to get users to install native apps is that PWAs and websites can have their ads blocked, and other apps can interfere with how our products and services work.

Would it be possible to provide a nice and guaranteed way that our app is in fact downloaded so that our Ads don’t get blocked? The app on the Play Store is entirely a web view, and though PWAs reduce install friction, the growth of ad blockers is a problem too.

liamquin
2018-06-05

Our project is Ads supported and so are many other things that are offered for free.

[…]

Would it be possible to provide a nice and guaranteed way that our app is in fact downloaded so that our Ads don’t get blocked?

It’s true that ad blocking is hurting the ad-supported business model. Whether that’s good or bad depends on who you ask, in large part because of too many Web pages with too many ads demanding attention.

One approach you could consider is native advertising – if you have enough users, affiliate marketing and paid content can get past the blockers and if well-chosen can also be relevant and helpful to your users, the sort of ads that inform people about products & services related to what they’re doing.

My own view is that ad blocking add-ons are a symptom of a more fundamental problem on the Web that we (W3C) should address, which is why we have an Improving Web Advertising Business Group, although it has been difficult to make progress. Without advertising it’s hard for small companies to get discovered.

However, changing the Web so that ad blocking doesn’t work is a non-starter today for sure :slight_smile: It’d be better if the free ad blockers always let the first ad through, perhaps.

Liam

Garbee
2018-06-05

Ad blocking extensions in the browser are also hardly the primary issue. They are just a simpler method of doing the task. Ad blocking can also be set up in routing software, or by using a local software proxy on a host machine. The proxies do what router ad blocking does (which is the same thing browser blocking does): check outgoing requests against a list of known IPs and/or domains and reject the ones that match. The router/local proxy systems work regardless of web or native.
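To make the list lookup concrete, here is a minimal sketch of the domain check such a proxy or router might do. The blocklist entries and the `is_blocked` name are made up for illustration; real blockers use much larger lists and more elaborate matching rules.

```python
def is_blocked(host: str, blocklist: set[str]) -> bool:
    """Return True if the host or any of its parent domains is on the blocklist."""
    labels = host.lower().rstrip(".").split(".")
    # Check "a.b.example", then "b.example", then "example".
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


# Hypothetical blocklist entries, for illustration only.
BLOCKLIST = {"ads.example", "tracker.example"}

for host in ("cdn.ads.example", "news.example"):
    print(host, "-> reject" if is_blocked(host, BLOCKLIST) else "-> allow")
```

Because the check only looks at the destination host, it applies equally to requests from a browser tab, a PWA, or a native app.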

The primary issue is annoying and excessive ads. Users are simply fed up with it. As well as the tracking some of the ad companies do across the web. It honestly scares people what they are capable of with all the data.

It hurts that good sites doing ads right also get caught up in the blocking. But if there were some way to prevent blocking, well, everyone would just use it. Because every organization/company using it is good, right? There is a much deeper problem here, and attacking the surface area and telling users to “just deal with it” when it is something they feel is abusive isn’t a good thing to do. Instead, as @liamquin pointed out, there are some groups working to try and handle the root cause of the problem. That’s what should be focused on over attacking symptoms.

isiahmeadows
2018-06-05

It’s true that ad blocking is hurting the ad-supported business model. Whether that’s good or bad depends on who you ask, in large part because of too many Web pages with too many ads demanding attention.

One approach you could consider is native advertising – if you have enough users, affiliate marketing and paid content can get past the blockers and if well-chosen can also be relevant and helpful to your users, the sort of ads that inform people about products & services related to what they’re doing.

My own view is that ad blocking add-ons are a symptom of a more fundamental problem on the Web that we (W3C) should address, which is why we have an Improving Web Advertising Business Group, although it has been difficult to make progress. Without advertising it’s hard for small companies to get discovered.

Also, it’s not like browsers couldn’t allow extensions various escape hatches anyways, regardless of what the spec says or (dis-)allows. No matter how hard ad companies try evading ad blockers and spam filters, it’s just an uphill battle for them - this is only going to get worse for them as browsers start shipping with ad blockers built-in (FF + Chrome + Safari all three do this) and machine learning gets involved (Brave).

So in general, no matter what the W3C tries to do, implementors are more concerned about their users, and users aren’t fans of intrusive, invasive advertising, so there’s clearly a problem that needs to be addressed.

It’s not like users haven’t tried working towards a compromise, however:

  1. The “Acceptable Ads” campaign was started initially by a partnership between ABP and Google. It solved the intrusive content problem, but not the tracking problem.
  2. Brave has attempted to solve the invasive side indirectly with the “Basic Attention Token”, although it separately addresses the intrusive side using an advanced ad blocker of its own.

(This is what happens when you look at solutions instead of problems.)


Our project is Ads supported and so are many other things that are offered for free. On Android, once our app is downloaded, no other app can interfere with our app (atleast that is what the PlayStore rules are).

Technically, this is true, but if you have an app-level firewall, this is no longer the case (you can just filter the undesired domains away). :wink:

The main reason I block ads unconditionally is different from the average end user’s, however - I don’t trust 99.99% of advertisers (not even Google) to properly vet their ads to not contain malware, and advertisements, if you can run custom code with them, are a very convenient vector to pwn a lot of machines with very little effort. Now, if an agency allowed site admins to only show HTML/CSS-only ads with <iframe sandbox> (they can just route the links through their servers for tracking, like what Twitter does), and a site was open about that limitation with ads, I would have less of an issue enabling them on that site. But sadly, not even Google provides the ability to do this as an option.

(Intrusive and annoying ads are only part of it; the very fact that they aren’t even reliably safe to execute is my issue.)
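As a rough sketch of what that could look like, the snippet below wraps HTML/CSS-only ad markup in a sandboxed iframe and routes the click through a redirect. The `ads.example` endpoint and both function names are hypothetical; the sandbox tokens are standard HTML, and leaving out `allow-scripts` is what keeps any script in the ad from running.

```python
from html import escape
from urllib.parse import quote

# Hypothetical click-tracking redirect endpoint on the ad server.
REDIRECT_BASE = "https://ads.example/click?to="


def tracked_link(destination: str, label: str) -> str:
    """Build an ad link that passes through the redirect for click counting."""
    href = REDIRECT_BASE + quote(destination, safe="")
    return f'<a href="{escape(href)}" target="_blank" rel="noopener">{escape(label)}</a>'


def sandboxed_ad(ad_html: str, width: int = 320, height: int = 50) -> str:
    """Wrap ad markup in an iframe whose sandbox omits allow-scripts.

    allow-popups lets target="_blank" links open; the second token keeps
    the opened landing page from inheriting the sandbox restrictions.
    """
    return (
        '<iframe sandbox="allow-popups allow-popups-to-escape-sandbox" '
        f'width="{width}" height="{height}" '
        f'srcdoc="{escape(ad_html, quote=True)}"></iframe>'
    )


print(sandboxed_ad(tracked_link("https://shop.example/?a=1&b=2", "Visit the shop")))
```

The only tracking left in this sketch is the server-side redirect, which sees the click and nothing else.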

OpenLocationServices
2018-06-05

Absolutely agreed! But the point is that our app has only 1 banner. We are being punished for the mistakes of others. We only serve targeted ads, because they are more efficient and are relevant to the user. The median loading time for our ads is 600ms. The loading time for our page is 400ms. DOMContentLoaded is 100ms. This is good enough by any standard. We even make sure our partners do not serve any banner on mobile heavier than 10 KB. On first page load, the ad JavaScript is 100 KB. Then it is cached across every other website.

There are only 3 ways to remain profitable:

  1. Serve more ads
  2. Serve fewer but targeted ads
  3. Charge for the service

When you get a newspaper, the ads are based on your location. If browsers could tell us what area of the city you live in, without pinpointing your exact location, we would not even have to track you. We would only serve you ads from nearby local businesses. But then that is also a privacy concern for many.

Would you be fine if a non-profit data protection organization served ads? We have a problem with the ecosystem.

If everyone is happy with tracking locally on your computer, but your data not going to third parties, I have a solution to this problem. Will discuss it in a few minutes.

liamquin
2018-06-05

What if we want the benefits of a PWA (reduced install friction) with the benefits of a native app? […]

I mean Android’s policy is bizarre. They allow web browsers with adblockers, but do not allow any apps to block ads on another app.

Any OS that supports Web apps as if they were native applications would likely also support the necessary hooks for the use of adblock in those apps.

However, changing the Web so that ad blocking doesn’t work is a non-starter today for sure :slight_smile: It’d be better if the free ad blockers always let the first ad through, perhaps.

There is a slight problem with that approach. Why would anyone volunteer to disable something that gives them convenience unless forced to?

If Web sites knew they could generally only get one ad through per page, they’d want it to be a good one. And it’d motivate people to upgrade their ad blockers, so a better business model for them too. Today they are ransomware - Web site owners pay adblock for their ads to be allowed through on a whitelist.

At any rate our idea is to explore possible alternatives.

For your own issue, though,

Liam

isiahmeadows
2018-06-05

First of all, important note: my opinion isn’t the usual one, and it’s a lot more nuanced. I relate a lot more to Brave than uBlock’s philosophy.

Second, in response to this:

But the point is that our app has only 1 banner.

That’s not going to be very effective in grabbing people’s attention, trust me. Just a forewarning on that one nit.

Would you be fine if a non-profit data protection organization served ads? We have a problem with the ecosystem.

My concern isn’t who, but the safety and efficiency of the ads themselves. If ads were like newspaper ads, I have zero issue with them. Even a little CSS animation isn’t a major issue, as long as they remain isolated to their frame. The tracking, however, tends to get computationally intensive across various providers (even non-malicious ones, sometimes), and malicious actors love to take advantage of the economy of scale in hiding their slipping of malware into their ads.

I don’t have an issue with tracking clicked links (Twitter itself does this, as do most URL shorteners), but I’d rather see page-view tracking restricted to things like pixels, CSS media queries, and such, which are easily blocked by privacy-conscious users, but not always by the general user base. You can still gather data based on the content that’s being viewed, as well as screen resolution (via CSS media queries + background-image on a pixel), but you don’t get the vast amount of invasive data JS can provide.
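For anyone unfamiliar with the media query trick, here is a small sketch that generates the stylesheet. The pixel URL, the `.ad-pixel` selector, and the width buckets are all assumptions for illustration, and it uses viewport width as a stand-in for screen resolution. It relies on browsers generally fetching only the background image of the rule that actually applies.

```python
# Hypothetical pixel endpoint; the query string records which bucket matched.
PIXEL_URL = "https://ads.example/pixel.gif"

# Viewport width buckets in CSS pixels as (min, max); None means open-ended.
BUCKETS = [(None, 599), (600, 1199), (1200, None)]


def pixel_css(selector: str = ".ad-pixel") -> str:
    """Emit one @media rule per bucket, each pointing at a distinct pixel URL."""
    rules = []
    for lo, hi in BUCKETS:
        conds = []
        if lo is not None:
            conds.append(f"(min-width: {lo}px)")
        if hi is not None:
            conds.append(f"(max-width: {hi}px)")
        label = f"{lo or 0}-{hi or 'up'}"
        rules.append(
            f"@media {' and '.join(conds)} {{\n"
            f"  {selector} {{ background-image: url('{PIXEL_URL}?w={label}'); }}\n"
            f"}}"
        )
    return "\n".join(rules)


print(pixel_css())
```

The server learns one coarse bucket per page view from which URL was requested, with no script involved, and a user who blocks the pixel host opts out entirely.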

If everyone is happy with tracking locally on your computer, but your data not going to third parties, I have a solution to this problem. Will discuss it in a few minutes.

What normally gets users concerned is the fact most advertisers either send lots of data to third parties or they keep the vast data for targeted advertising that’s accurate enough to spook them. For the first reason, most major ad servers (like Google) don’t give advertisers much data on who can target what, but most of the industry has been reluctant to accept the second, despite losing human traffic to that independently of the ad blocking issue.

If an ad provider uses a script to place and monitor ads, that’s okay, too, as long as they’re open and explicit in what it does, and they limit the ads themselves to keep them HTML/CSS only. (My main issue with trackers is that the ad agencies themselves tend to install more tracker scripts than they need, when a few pixels with media queries is all they need.)

But the issue as it stands IMHO falls in the hands of two groups: ad agencies (those who develop the ads) and ad servers (those who serve the ads, like Google). The former is the one doing all the questionable behavior, and the latter is who’s enabling it. If the latter would start placing much greater restrictions on the format of various ads, many of the bad actors in the former would be instantly crippled (those creating all the malvertisements and ad-driven miners), while most everyone else would just need to adjust. And even on native mobile ads, they could do similar and prevent most of the ad-driven infections overnight.


As for why I recommend specifically disallowing JS in ads (short of the ad server itself), it’s mostly for security and performance reasons:

  1. The data some trackers and session replay scripts collect is not merely highly invasive and unethical, but blatantly illegal.
  2. Several trackers attempt to obscure their origin from both web masters and users, while collecting as much data from the page as they possibly can.
  3. Some trackers even attempt to collect passwords on other sites, whether due to unwitting site operator mistakes (who rarely bother to read the fine print until it bites them) or due to complete and utter disregard for a site’s security.
  4. Many adult sites, online marketplaces, social sites, and other sites where data is critical for user retention tend to be so flooded with third-party trackers (when they allow them) that they begin to bog down and take a crap ton of memory and CPU time just to compute everything and manage their frequent ping-backs. This isn’t really an issue with Google (who starkly limits third-party ad serving and more recently, third-party tracking), but it’s a mild issue with Facebook and Amazon, and a major issue on most free/freemium porn sites.

simevidas
2018-06-10

Could someone confirm that ad blocker extensions are active when the user opens a PWA in full-screen mode from the home screen?

OpenLocationServices
2018-06-11

Not on Android since the apps are installed as WebAPK. This is the correct behavior. On the desktop PWA this is not the case as of now.

isiahmeadows
2018-06-13

I was speaking of things like mobile firewalls when I said ads could be blocked on other apps. Altering the display of an app gets your app banned, but stopping network requests doesn’t.

For one example of a mobile firewall, there’s this.

It’s not like you couldn’t engineer similar for desktop; I think it’s mostly the fact nobody has really cared enough to write such a blocker, since ads on otherwise non-malicious desktop applications are typically 1. properly isolated (the OS doesn’t sandbox desktop apps, which forces app devs to care about this kind of thing), and 2. much less common. Also, desktop apps less frequently have ads to start, and are typically paid for up front if they’re not free/open source. If anything, ads on desktop are a bit jarring because of the wide presence of adware and other malware spread through malicious ads and programs in general, which leaves a pretty obviously bad taste in people’s mouths. (People can tolerate ads on web and, to a lesser extent, mobile, but desktop had a massive infestation that only started dying down once infosec people finally started getting through people’s heads that you can’t trust whatever you find on the Internet. Sadly, this hasn’t fully translated to mobile and Web.)

isiahmeadows
2018-06-13

Okay, in reply to the tracking question, I feel this particular proposal is probably the best attempt at addressing the tracking issue I’ve seen in quite a while, if not ever. If ads were obligated to use that, it’d end 99% of all tracking issues that exist today, and additionally, it’d avoid 99% of the need for JS in ads, making the few that want JS specifically kind of suspect. (Ad distributors could then transition to <iframe sandbox>ing all their ads, so they only have explicit, browser-aware tracking, which conveniently also makes GDPR auditing and compliance easier for site admins who do things in Europe.) I’d personally love it, since I am okay with tracking errors, but not behavior, and it’s hard to kill the second without killing the first. Giving users the opportunity to discriminate between these two would be fantastic, and it’d be even better if I can also enable ads without the tracking requirement.

My issue isn’t the ads themselves, but what people do with them. Give ad companies a means of easy built-in tracking support with forced transparency about who’s tracking what, block JS execution to prevent malware propagation, and then block the now-shady means like tracking pixels (if they can track openly, tracking secretly to dodge awareness is incredibly suspicious). This alone would cripple the problem actors overnight.

ChrisP
2018-06-13

@isiahmeadows

There is a reason ads use JS. 30% of the JS in ads is used for actual tracking, 70% is for fraud detection. Extra metrics are needed to detect fraud. There are bad people in the ecosystem, and having only one way to measure user behavior to filter out abnormalities means that online ad fraud becomes easier.

isiahmeadows
2018-06-13

Okay, fair enough. I do believe that there are potentially solutions for that, too, and those are things browsers could similarly assist with. I think the biggest complication is that you’ve got to figure out a way to enable browsers to try to detect whether their user is human. One advantage they have is that they could get much more intrusive than even a script - they could disable analytics for headless as well as discern a bit better whether a mouse is being controlled by a program, even without it being on the screen. They could also detect things like AutoHotKey and similar running on the system, thwarting it far more thoroughly.

I know this could become an issue analogous to DRM for user-vs-bot detection beyond the basics, as happened with video/movie streaming, but it might become necessary at least initially to get it moving, since I’d be shocked if patents aren’t involved at least somewhat. Unlike DRM, the closed-source nature is incentivized by actual good work OSS people actually want - nobody is going to object to built-in ad fraud detection, since it costs not only the ad companies, but the consumers themselves.

I’d still like the ability for browsers to enable ads again without the security issues they’ve been plagued with for so long, since it works out a compromise for ad companies and the users. The main reason driving people to disable ads is that they don’t trust the ad publishers to keep the ads clean. I know tracking doesn’t have to be super invasive (consider site analytics), and fraud detection shouldn’t have to process all of its data remotely to get the job done. But I think this can be solved without crippling ad companies in the process.


For precedent, we’re already starting to see a few ad blockers use some of the same techniques fraud detection does, since many legitimate advertisers are now sometimes seeking paid content (which is often more costly), while most malicious ones are just trying to play cat and mouse against ad blockers. Who says we can’t squash the primary playing field for malicious advertisements and scripts without allowing legitimate advertising to regain its optimal spot, using similar means to how we’re already starting to cripple the malvertisement industry?

ChrisP
2018-06-13

@isiahmeadows

Any such built-in fraud detection measures as you describe cannot be open source by nature. Fraud detection solutions rely on the fact that bots give us some hints that they are bots, which they themselves do not know about, such as how they move the cursor, how they scroll the page, and their public-facing IP address.

The same issue applies to reCAPTCHA. It uses similar means to separate bots from humans, including prompting for a CAPTCHA when the calculated risk is high enough. reCAPTCHA-like checks are not suitable on every page because there is a difference between how people sign up and how they interact with ads or the page.

Garbee
2018-06-13

This isn’t feasible. For all intents and purposes, headless is roughly equal to a headed browser. The browser has no idea why it is in headless. It very well could be that a developer built a program that uses headless and then relays the information from the page to the user in a more accessible way. So, if they disabled things in headless just because they don’t feel it matters, you can then end up hurting site developers more than helping in some contexts.

This is once again not feasible. Attackers can more easily than ever isolate the instances of browsers running from other programs. But, still have them interact with each other.

isiahmeadows
2018-06-13

This isn’t feasible. For all intents and purposes, headless is roughly equal to a headed browser. The browser has no idea why it is in headless. It very well could be that a developer built a program that uses headless and then relays the information from the page to the user in a more accessible way. So, if they disabled things in headless just because they don’t feel it matters, you can then end up hurting site developers more than helping in some contexts.

First, you can disable something without making it immediately visible to the script - “disabling” here implies making it a no-op, and that’s been the implication from the start. Second, browsers have this thing called command line flags. Just like they can use those to, say, disable GPU or modify JavaScript execution, they can also use those to alter DOM behavior.

Edit: Turning analytics tracking to a no-op is also helpful for testing, where analytics is pretty much useless and error reporting is your console output.
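To illustrate the "disable means no-op" idea in miniature: the sketch below swaps the tracker for a function with the same signature that does nothing, driven by a command line flag. The `--noop-analytics` flag name is made up for this example and is not a real browser switch; the point is only that calling code cannot tell which version it got.

```python
import argparse
from typing import Callable


def make_tracker(disabled: bool, send: Callable[[str], None]) -> Callable[[str], None]:
    """Return the real tracker, or a silent no-op with the same signature."""
    if disabled:
        return lambda event: None  # accepts the event and drops it
    return send


def parse_flags(argv: list[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    # Hypothetical flag name, not a real browser switch.
    parser.add_argument("--noop-analytics", action="store_true")
    return parser.parse_args(argv)


sent: list[str] = []
track = make_tracker(parse_flags(["--noop-analytics"]).noop_analytics, sent.append)
track("pageview")  # succeeds from the caller's point of view
print(sent)        # nothing was recorded
```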

This is once again not feasible. Attackers can more easily than ever isolate the instances of browsers running from other programs. But, still have them interact with each other.

If they’re smart enough to run it in a VM, with the browser inside and the script outside, or something like Qubes OS, with the browser and script in different qubes. But even that’s not possible very widely in botnets - most IoT devices don’t support hardware acceleration of virtualization at all, and support for virtualization without hardware acceleration, even in OSS programs, is practically non-existent except for ancient versions with known, critical process isolation bugs. But nothing here is in opposition to building user profiles and other traditional means to supplement fraud detection, only to aid it in unison.

I’m currently privately working on something that tries to more cohesively look at the real threat model, and I’m taking all these into account. I didn’t really seriously research that much before typing the first reply to this or the other related thread I linked to, so please keep that in mind. :slight_smile:

If that research comes up with something promising, I’ll see about posting back here at some point in a new topic.

Garbee
2018-06-13

But in context you’re saying there is never a reason for it to happen in headless, so the browser should always no-op it. In reality, there could be very good reasons for those things to still happen, since the user is still in control and getting the content.

Yes absolutely. These are opt-in controls to modify behavior. If a CLI flag exists to no-op things, that’s fine. But no-oping stuff shouldn’t be a default for headless.

Or Docker containers; Flatpaks and Snaps on Linux do process isolation, and other methods like Sandboxie could also possibly allow for enough isolation to keep the browser from looking outside of itself.

In any of these cases, simply running a UA in isolation doesn’t equate to being bad either. A user could just be more security conscious and operate everything in that manner. No, this isn’t meant to be the only detection flag, but it is a fairly poor quality one to use.

isiahmeadows
2018-06-13

Note my edit. I clarified it to narrow the use of when virtualization helps, and I dropped an edit to clarify already that headless doesn’t always equal bad. More to the point, dropping analytics can just generally be helpful for testing, and 99% of headless work is pretty useless to track apart from error reporting, which is still printed to the console. You can sometimes detect headless already by checking window size - it’s not fully hidden, nor was it intended to be. (The primary use cases are testing and crawling.)

isiahmeadows
2018-06-13

Also, I’m aware that running a UA in isolation doesn’t equate to bad - even running Tor isn’t inherently bad (Cloudflare has already had to deal with the even greater FUD concerning that). But I wouldn’t rate it as zero-confidence, since it indicates they are clearly trying to evade something, especially if there are other hints of automation. If anything, it just amplifies existing signals, rather than being a direct signal itself.
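A toy sketch of what I mean by amplifying rather than signaling, with every number and name invented for illustration: isolation multiplies whatever suspicion the other signals already produced, so on its own it contributes nothing.

```python
def fraud_score(base_signals: list[float], isolated: bool, boost: float = 1.25) -> float:
    """Combine independent bot signals (each in [0, 1]) into one score.

    Isolation (VM, container, Tor) adds nothing by itself; it only scales
    up the suspicion that the other signals have already produced.
    """
    # Probability-style union of the signals: 1 - product of (1 - s).
    clean = 1.0
    for s in base_signals:
        clean *= 1.0 - s
    score = 1.0 - clean
    if isolated:
        score = min(1.0, score * boost)  # hypothetical amplification factor
    return score


print(fraud_score([], isolated=True))           # isolation alone stays at zero
print(fraud_score([0.5, 0.5], isolated=False))
print(fraud_score([0.5, 0.5], isolated=True))
```

How large `boost` should be is exactly the open question; a value close to 1 keeps the false-positive risk for security-conscious users low.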

Garbee
2018-06-13

Yes, evade security exploits that could be used to attack their system or them personally. That’s one thing. It doesn’t however indicate that they are always trying to evade bot/fraud detection.

But the question is what exactly is being automated, and to what purpose, not just that some form of automation is occurring. Once again, an accessibility tool could be automating many things in an isolated browser instance. Your system shouldn’t throw a false positive which prevents users from accessing content they otherwise could have if they didn’t have an accessibility need.

I agree with this, but only to the most minor degree of amplification.