Advertising to Interest Groups without tracking


As I am a member of the W3C Advisory Board, are you asking for the AB to review this to provide an opinion? I am happy to ask them to do so, or any W3C Member can create an issue at https://github.com/w3c/ab-memberonly/issues. Feel free to email me, cwilso at google, if you’d like me to file it instead.

I should preface this with: I’m a member of the AB, but not speaking for them; I’m Google’s AC rep, but not (in this instance) speaking for Google in regards to the actual proposals, as Michael is better informed. I’m a co-chair and founder of the WICG, and do have some role in speaking there, along with Marcos, Yoav and Travis.

The point of incubations is to let multiple potential avenues be explored, in the open. There have been incubations that “compete” before (conversion metrics comes to mind), and there will be in the future; that’s by design. Incubations do not represent the consensus of the entire community. Neither Community Groups nor Business Groups are suited to representing such consensus, as they do not function under the W3C consensus Process. Any incubation developed here would be expected to move to a Working Group in order to represent the consensus of the W3C (which would represent a blend of user, advertiser, developer and vendor needs).

I would point out that Michael’s expressed goal in bringing this here was to bring this area to the attention of more than just the membership of the Web Advertising BG - not to remove it from the view or engagement of the WABG. This work should eventually move into an (as yet uncreated) WG at the W3C, I would expect, and would have to do so even if the WABG had reached a complete consensus among themselves on a precise proposal.

Thank you all. I’d like to wait a bit and give Criteo folks a chance to speak up here also — my summary is already out of date, since they updated SPARROW with a new Reporting section earlier today.

@marcosc and other chairs: If the SPARROW folks also want to advance the two proposals together, I guess the obvious mechanism is to move them both into WICG independently, and then pick one as the base to update with consensus decisions. Is there anything cleverer, or any reason to look for such?

If we’re to build a more trustworthy and sustainable web, which meets all the values and goals of the W3C, we need to define what success looks like. That includes questioning and validating the anti-tracking stances of browsers which might work against the goals of an open web for all.

The anti-tracking stances of browsers are up to those browsers, and of course anything may change. But at this point I think there is considerable evidence that lots of ads are going to be shown in environments where tracking isn’t an option, for whatever reason — browser efforts, regulatory environment, user consent response, etc.

The point of this proposal is to further the state of what’s possible even if tracking individuals isn’t available. @jwrosewell I would think that a worthwhile goal, no matter what your personal opinion on browser privacy policies.


Whilst some contributors may have the time and mandate from their employers to incubate many parallel experiments to address the same problem, most participants do not. Establishing a common view of “success” and then evaluating competing early stage proposals, eliminating those that are weaker, will help everyone focus and use their time efficiently. This will expedite progress.

There is already sufficient information available about multiple proposals to perform an assessment of the likely impact on multiple stakeholders. As an example, in the case of TURTLEDOVE and SPARROW, a major difference relates to the role of “gatekeepers” and client versus server side implementation. These proposals do not need to be developed further to identify and resolve these major differences.

I maintain that the established norms of governance require such a process to be undertaken. I’m struggling to understand why others think this established practice does not apply in this instance.

I would welcome @michaelkleber, @marcosc and others explaining the method they would use to evaluate competing proposals. Many other W3C participants have already provided a proposal in the form of the success criteria document being drafted within the W3C Improving Web Advertising Business Group. The feedback received at the most recent meeting is being incorporated into the next revision of the document for presentation next week. I hope it will better reflect the needs of browser vendors. If you can provide your methods here I will be able to consider incorporating them ahead of the next meeting.

@cwilso Thank you. This thread exposes a number of issues which I would like the W3C AB to consider holistically. I observe in the minutes of the May AB meetings that some of these topics were discussed but not resolved. I believe the AB are seeking the opinion and viewpoint of a diverse set of members. I will prepare a concise summary of the issues for the AB to consider, post a duplicate here, and otherwise progress as you suggest.

On a different matter I’ve observed four of my comments have been flagged. Having read the community guidelines I’m failing to find any cause and therefore have no remedy available to me. Could you advise on the explicit reason or remove the flag?

The flagging was automatic (new users that post a lot of similar links are flagged by the system). I believe it is now resolved.


Is there anything cleverer, or any reason to look for such?

Moving both sounds fine to me.

Hi,

SPARROW contributor here.

We’re happy to move forward with WICG as long as the criteria to reach consensus are explicitly shared and acknowledged by all parties and reflect the interests of the open web ecosystem and its users.


Curious meta question: is Brave involved with this? I feel they would definitely at least be interested, as this is basically their entire business model.

Brave’s Pete Snyder expressed interest during this past week’s meeting of the W3C Privacy Community Group, noting the similarity to some of their work (minutes). Also I’ve had some conversations with ex-Braver Tom Lowenthal about variations of the idea here.

But they haven’t had direct involvement with this particular proposal; Brave is not active in the Web Advertising Business Group, which this grew out of.


@ablanchard1138 The W3C’s process around consensus and dissent is described here: https://www.w3.org/2019/Process-20190301/#Consensus.

The WICG charter says a little more on the subject, here: https://wicg.github.io/admin/charter.html#decision. The charter goes so far as to consider producing two different proposals, if substantial disagreement remains. But since we’re coming in with two different ideas which we hope to unify, that’s obviously not the goal!


We’re happy to have SPARROW moved to the WICG, get feedback from and have discussions with the community.

Lionel for Criteo

@cwilso Thank you for your advice. Other W3C observers / commentators and I opted for email to raise the more general issues of governance and trust choices.

Since I first commented, the CMA have issued their final report, which contemplates common user IDs as a remedy to competition issues in the digital advertising marketplace. If the W3C were to embrace this recommendation then this proposal, and all associated work, would not be required. It would likely result in a focus on transparency and control proposals.

If a better method of filtering proposals at the conceptual stage existed, W3C members and stakeholders would collectively save a lot of time and effort.

At RTB House we find that both proposals address the needs of eCommerce advertising investment, which supports a recognizable fraction of the web.

We would like to introduce a product-level extension to both proposals, backed by an estimate of its high-level impact on CTR. A detailed description of “Product-level TURTLEDOVE” can be found in this repository: https://github.com/jonasz/product_level_turtledove

We invite all feedback, and we’re looking forward to further discussion!

We would also be happy to hear what the practice is for extending current proposals on WICG.

While I agree that the W3C should pay attention to the CMA report because anticompetitive issues have a history of harming the open Web, we should be cautious not to mix up remedies, and to consider who each component of the ecosystem works for.

The CMA’s report includes common user identifiers as a remedy to anticompetitive practices in electronic ad markets that trade in personal data. From that angle, their recommendation is very sensible. The manner in which electronic ad markets are currently operated by large players would be considered illegal in other electronic markets (notably finance) and I believe there is a very good case to be made that it should be illegal here. To the extent that information enters an electronic ad market, no party should be allowed to self-preference or use it for insider trading.

However, that is the perspective from competition and market policy. The question it answers is: “when information is traded, how should that trade be structured?” What it does not address, however, is the privacy side of the equation, and it does not purport to speak for users. Put differently, it does not address the question of whether a market in personal data should exist at all.

Users are, quite overwhelmingly, clear on what their preference is here. Just citing Eurobarometer, 89% of users expect their browser not to share data with third parties. It is the browser’s job (literally, it’s pretty much its one and only legitimate job) to be the user’s agent. That’s why it’s called a user agent: it could be argued that it has a fiduciary duty of agency with respect to the user. Web standards are equally held to this by the Priority of Constituencies, which puts users first.

A default that supports third-party tracking is a user-hostile default. A browser that enables third-party tracking by default (or that uses telemetry for purposes other than its own betterment or the Web’s) is a browser that is failing in its fiduciary duties.

To return to the CMA report: when there is a market it has to be fair, but that does not mean that a market should exist. To make a comparison, to the extent that there is a market in organs it should be structured to prevent a single player from using insider information to tilt it in its favour. But that doesn’t mean that the market should exist in the first place. By the same token, if the electronic ad markets switch to trading contextual signals, it would be unfair to use advantage from a proprietary aggregation format or browser telemetry to tilt that market.

In your AB letter you claim that users should be allowed to choose to be tracked across their digital lives. I personally doubt that they would be interested, but that’s just one person’s doubts. If that statement is true, then there is a way to prove it: why don’t you develop a browser extension that users can install voluntarily, that exposes an identifier to the page, that has terms making it clear that sites cannot force users to use it (otherwise it wouldn’t be consent), and release that to extension stores? Technically, it is not a very complex undertaking. If it sees substantial uptake, it would prove your point the way no appeal to argument could.


@robin Thank you for the good feedback, specifically on the important question of whether a market is needed.

I understand from an article published in May 2020 that the New York Times (NYT) believes a market in audience segments informed by personal data and identity should exist.

Your colleague Allison Murphy, Senior Vice President of Ad Innovation, is quoted in the article.

This can only work because we have 6 million subscribers and millions more registered users that we can identify and because we have a breadth of content.

This quote acknowledges that entering such a market requires a large volume of subscribers and registered users. The article suggests a strategy that is independent of TURTLEDOVE and other similar cohort-based proposals. As such, the NYT’s future profits and viability are not dependent on the outcome of the proposals that centralize interest-based marketing into the browser.

Smaller publishers will need to “band together” via the use of suppliers (aka “third parties”) to operate at the significant scale needed to compete in this market. If there were no value to a publisher, large or small, in such a market then why would the NYT publicly endorse it?

I hope you would agree if small publishers and new entrants are excluded from such a market they will be disadvantaged. The likely outcome is that the variety of information and services available to people will diminish. Access to media will become ever more restricted. This is a key observation of the CMA report which impacts people and society.

The W3C’s purpose specifically prohibits the creation of standards that would lead to such an outcome. The W3C’s governance model needs to be modified to address this problem. The first step is for everyone to agree there is a problem. It is with this in mind the AB letter was sent.

@robin In relation to people’s trust choices and the role of the user’s agent.

The legal framework I’m most familiar with in relation to privacy is GDPR. GDPR does not seek to prohibit choice. A browser vendor that builds a user’s agent must ensure their product and service complies with the law.

Microsoft, among many others, supports such choices in its products. The following is an example of the user interface available to all Microsoft subscribers.

Any change or new feature that impacts so many stakeholders is a policy decision for the proposer’s company and as such should be subject to a great deal of scrutiny.

Fortunately, the impacts of these proposals can be evaluated prior to implementation. A number of other stakeholders and I have produced a set of success criteria and a self-questionnaire in the same format as other W3C documents to support such a review process.

When a dominant market player progresses a proposal to implementation and trial without acknowledging the impact on stakeholders and identifying appropriate mitigations to these impacts, they are asking many other companies to invest significant amounts of time, which many smaller companies cannot afford.

Unilateral decision making also sends a signal to the market that such a dominant market player has a preference for their own solution and that other proposals are unlikely to receive consideration. This is a problem for the consensus structure of web standards governed by the W3C and should be acknowledged.

These are all examples of the issues I’m seeking to make explicitly visible, so we can collectively ensure changes are improving “one web” for everyone, rather than fragmenting it or moving its control into the hands of fewer organizations.

Hi @jwrosewell,

just answering on a few relatively disparate points:

There is an important difference between trading in personal data and trading in data derived from personal data. There are a few companies out there working on enabling this for small publishers in the same way that larger publishers can build for themselves. If I worked in adtech (and given everything exciting that’s happening these days, it’s certainly an interesting industry!) I think I would focus on that kind of innovation instead of trying to prolong the status quo of mostly doing the same thing as Google but at smaller scale.

This is not a big vs small publisher issue — all sizes of publishers are dying under the current régime, only a few are keeping their head out of the water. Change is needed. Thanks to the current evolution of the data economy we are finally seeing innovation in adtech that is bringing it out of the old unsafe, ungoverned, anything-goes model under which publishers lost control over their core advertising asset — access to their audience. I’m very excited about some of the options I’ve seen being developed by small innovative startups.

I am also familiar with the GDPR. One important part of the GDPR is Article 25: Data protection by design and by default. This does not preclude choice and neither are browsers currently preventing choice. They are simply going with the privacy by design and by default that aligns with their users’ expectations. Note that when the browser vendor makes the decision to process data in a manner that is not essential to support the user’s request and that makes it so that the browser is determining the means and purpose, it is arguably a data controller.

The open programmatic ecosystem carries well-known data protection risks since it essentially broadcasts data to a large number of participants with no purpose limitation. I have no objection if users choose to enter their personal data in such a market, but they should do so in full deliberation. This means that the manner in which they decide to participate should be commensurate and well balanced with the risks to their data protection. Things like the TCF and CMP dialogs fall very short of the mark there. But as I suggested above, nothing prevents a company or a group of companies from implementing an extension that users could choose to install in order to return to being tracked across their entire digital lives if that’s what they want. That would make it possible to provide greater notice, and would give a clear way for them to exercise their rights — something which the previous ecosystem failed at.

I don’t disagree that some browser vendors can be unilateral and inconsiderate of others in the ecosystem (you know who you are folks ;). However, what browsers are doing with cookies is in line with existing standards and has been for a very long time. For instance, if we look back to RFC 2965 §3.3.6, from October 2000, it states very clearly:

   When it makes an unverifiable transaction, a user agent MUST disable
   all cookie processing (i.e., MUST NOT send cookies, and MUST NOT
   accept any received cookies) if the transaction is to a third-party
   host.

   This restriction prevents a malicious service author from using
   unverifiable transactions to induce a user agent to start or continue
   a session with a server in a different domain.  The starting or
   continuation of such sessions could be contrary to the privacy
   expectations of the user, and could also be a security problem.

   User agents MAY offer configurable options that allow the user agent,
   or any autonomous programs that the user agent executes, to ignore
   the above rule, so long as these override options default to "off".
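
For readers who find code easier to parse than RFC prose, the rule quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `CookiePolicy`, `is_third_party` and `allow_third_party_override` are hypothetical, and "third-party" is simplified here to an exact host comparison, which is cruder than the RFC's own definition.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Return the lower-cased host of a URL, or an empty string if absent."""
    return (urlsplit(url).hostname or "").lower()


def is_third_party(request_url: str, origin_url: str) -> bool:
    # Assumption: exact host comparison. The RFC's own notion of a
    # third-party host is more nuanced than this simplification.
    return host_of(request_url) != host_of(origin_url)


@dataclass
class CookiePolicy:
    # Hypothetical user-configurable override; per the quoted text it
    # must default to "off".
    allow_third_party_override: bool = False

    def cookies_enabled(self, request_url: str, origin_url: str,
                        verifiable: bool) -> bool:
        """Decide whether cookies may be sent or accepted for a request."""
        if verifiable:
            # Transactions the user could review are not restricted here.
            return True
        if not is_third_party(request_url, origin_url):
            # Unverifiable but same host as the origin transaction.
            return True
        # Unverifiable transaction to a third-party host: cookie
        # processing is disabled unless the user opted in.
        return self.allow_third_party_override
```

With the default policy, an embedded request from a page on `news.example` to `ads.example` gets no cookie processing, while the same request is allowed only if the user has explicitly switched the override on.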

Browser vendors made the unilateral decision, against the standards community, to support third-party tracking by default back then. This decision put all publishers at a disadvantage compared to intermediaries and was a direct contributor to today’s crisis.

If you prefer to look at the more recent RFC 6265 §7.1, it had to accept the reality of third-party tracking but still stated:

   Particularly worrisome are so-called "third-party" cookies.  In
   rendering an HTML document, a user agent often requests resources
   from other servers (such as advertising networks).  These third-party
   servers can use cookies to track the user even if the user never
   visits the server directly.  For example, if a user visits a site
   that contains content from a third party and then later visits
   another site that contains content from the same third party, the
   third party can track the user between the two sites.

   Some user agents restrict how third-party cookies behave.  For
   example, some of these user agents refuse to send the Cookie header
   in third-party requests.  Others refuse to process the Set-Cookie
   header in responses to third-party requests.  User agents vary widely
   in their third-party cookie policies.  This document grants user
   agents wide latitude to experiment with third-party cookie policies
   that balance the privacy and compatibility needs of their users.
   However, this document does not endorse any particular third-party
   cookie policy.

   Third-party cookie blocking policies are often ineffective at
   achieving their privacy goals if servers attempt to work around their
   restrictions to track users.  In particular, two collaborating
   servers can often track users without using cookies at all by
   injecting identifying information into dynamic URLs.

As you can see, what browsers are doing today is exactly what the open standards community has been expecting them to do for twenty years. Everything from ITP to eliminating 3P cookies isn’t just what users want, it’s what the standards actually say should happen. They took a unilateral detour experimenting with third-party tracking. It contributed to the world of excessive concentration, dying publishers, and vanished online privacy that we know.

I for one welcome them back into the fold. Innovation is much better when it is aligned with users than when it is hostile to them, and we’re already starting to see these changes bear fruit.


@robin - To summarize where we have aligned, a market for data derived from personal data should exist, and is valued by marketers, which indirectly helps improve publisher revenues. Hence your own company’s investments. Access to personal data is needed to trade in data derived from personal data, which is a prerequisite to operate in that market.

I’m unaware of any companies that are proposing a solution that would enable publishers who lack the scale of the NYT to be able to enter such a market, especially if their access to the input data is eliminated. Could you point out the solutions under development which would support this?

I note your colleague quoted in the article states such solutions are impossible. See the following quote.

“While a differentiator and I’m thrilled about it, this isn’t a path available for every publishers, especially not local who don’t have the scale of resources for building from scratch,” says Allison Murphy, Senior Vice President of Ad Innovation, [New York Times].

It’s been a long time since I read those RFCs :blush:. They do highlight how well-written IETF documents tend to be, including the clear document history. RFC 2965 took nearly 4 years to become an original technical standard. During that time browser vendors were shipping solutions that became de facto standards. Business models were created around the de facto standard implementation rather than the RFC standard as it was eventually ratified. Web professionals did not have the time to notice the important difference, let alone modify their solutions or business model retrospectively to comply with the standard as ratified.

In seeking to alter implementations to meet the documented standard over 23 years later, a lot of disruption occurs. I observe the NYT web site makes extensive use of the de facto standard as implemented concerning cookies in unverifiable transactions. You, your colleagues and suppliers will have to expend effort altering or removing these features. The same will be true of every other publisher, large and small. I’m therefore unsure how the de facto implementation “put all publishers at a disadvantage”, as publishers like the NYT could have chosen not to utilise these techniques within their supply chain.

A governance model which prevents this disruption by operating linearly, as it does in the governance of other technologies used by more than 4,000,000,000 people, is now more important than ever. The example you provide highlights this need perfectly. This issue is at the heart of my original comment in relation to this proposal and a fantastic example of the need for the W3C to change.

I also agree with your comment that “when the browser vendor makes the decision to process data in a manner that is not essential to support the user’s request and that makes it so that the browser is determining the means and purpose, it is arguably a data controller.”

Therefore I think you’d agree it would be more convenient for people to make their choices within the web browser itself, setting defaults at install time, rather than requiring them to download a browser extension to communicate this. Such a model is well understood by people and, as I highlighted with the example from Microsoft, is used in comparable technologies.

The UK CMA recommend the introduction of a “common user ID” as an appropriate remedy to the dominance of market players. Such a remedy would be implemented as part of the browser.

Ultimately my original comment related to the need for proposers to justify their proposal before progressing with the engineering work to ensure that the concept is a net positive for the web. The bar to incubation needs to be far higher than two people from two organisations agreeing if we are going to address the root cause of the problem.

To help proposers and reviewers, many other contributors (named and unnamed) and I continue to iterate on a set of success criteria for improved web advertising. It is open for review. I suspect other concerns of the W3C will need to be similarly documented in time.

@robin – I’m still working through the latest set of comments. Let me know if you would have time for a one-to-one call to work through your remaining concerns efficiently. This could be co-ordinated with other reviewers.


Hi @jwrosewell,

I’d like to quickly get a few items out of the way to keep the discussion focused:

  • We are not aligned on the idea that trading on information derived from personal data requires access to personal data. The whole point of innovative systems in this domain is to enable this trade without the data being exfiltrated. Presenting this as an either-or is not correct.
  • I am barred from endorsing companies, and I wouldn’t do it in a standards forum anyway — but they’re not being shy about their offerings, I think with some digging anyone can find these innovative companies.
  • I also can’t speak for Allison but I believe that you’re reading way too much into her words so that they align with your expectations. A smaller publisher might not be able to use the exact same method we have been using for this subset of our offering. That is not saying that all targeting options are impossible, they just work differently.
  • I believe that your point that we: “could have chosen not to utilise these techniques within [our] supply chain” is an incorrect characterisation of market dynamics. Because of 3P tracking the ad market has become intermediary-dominated. Publishers are forced to participate on terms set by intermediaries. I have yet to meet a publisher who feels that this market has been built with them in mind. Publishers’ access to audiences is devalued by the removal of scarcity. I don’t think I need to rehash twenty years of publishers being ignored by the IAB and only becoming a convenient consideration when the need to lobby politicians or other standard organisations arises.
  • Indeed, improving the Web often requires work and change. Pushing to get HTTPS everywhere is a good example of precedent. That it is work should not prevent us from making progress. The WebKit team has been progressively refining ITP for three years. The Chrome team gave us two years to prepare for the end of third-party cookies. By the time it happens, the writing will have been on the wall for almost five years. These are responsible time scales.

Having gone through these, I would like to focus on two issues of substance: governance, and publishers.

You mention user choice a lot, as does the letter to the AB. With that in mind, I would like to ask a simple question: of the following companies that signed the letter to the AB, which ones respect the user-chosen DNT signal?

I am not asking to score a cheap point: this is directly relevant to the issue of governance. The W3C has its warts and it’s had its rocky patches, but over the years it has demonstrated pretty solid governance. In regulator circles, I’ve often heard it touted as an example. Let us consider what has happened in adtech governance:

  • The FTC made a deal with the adtech industry that involved adtech producing self-regulation. Twenty years on, nothing whatsoever has happened.
  • The W3C tried to find a workable consensus position with adtech through DNT. I think everyone involved at the time, on any side, can attest that this did not happen in good faith on the adtech side.
  • AdChoices

So I ask with sincere respect, but nevertheless firmly: what reasons would people here have to believe that adtech companies are acting in good faith this time around, what reasons would people here have to believe that there will be strong governance for universal IDs when it didn’t happen with previous identifiers, and is it reasonable to expect any institution to take governance lessons from an industry that has systematically shunned the very idea?

Again, I am not making rhetorical points here. I have spent much of my career seeking consensus and I would love to do it again on this difficult topic. But to put things frankly: the suggestion that we should rely on adtech companies for good governance and entrust them with universal identifiers has a serious fool-me-once problem.

Now to switch to the question of publishers. You keep trying to return to the idea that somehow privacy might be good for big publishers and bad for small ones, maybe hinting that us big publishers have it easy advocating this but we’re not thinking about the little guy.

I think there’s a very simple way to put this: if being irresponsible with personal data were in publishers’ interest we’d be rolling on mountains of cash. The idea that if publishers were only to violate privacy just a bit more with universal IDs, then this time it’ll work doesn’t feel all that credible in the middle of a journalistic mass extinction.

Changes to the data economy are difficult for publishers large and small. Our trade associations are working hard to make sure as many of us as possible make it to the next year, and hopefully more. But I don’t think there’s a case to be made that what we need is more of the same. The solution might not be TURTLEDOVE, it of course has to be dissected and threat modelled, and I’m sure that @michaelkleber expects no less, but we’ve tried third-party identifiers: they don’t work.
