[Proposal] Federated Learning of Cohorts (FLoC)

Is there more support for this proposal than Yahoo Japan?

I don’t think one other company could be considered industry support. Here is a piece I wrote in AdExchanger that more accurately summarises the situation.

Perhaps Yahoo Japan could explain how they’ve been able to mitigate the antitrust issues?


Speaking for myself and in my role as WICG chair, not speaking for my employer.

James, as previously mentioned, the bar for getting IN to incubation at the WICG is intentionally low - that is, there must be SOME industry support. Note that this is actually a HIGHER bar than for the W3C’s Community Group system as a whole, where one merely needs five people (even from the same Member) to start a whole new community group. Incubations are not intended to represent (to counter the point you have espoused repeatedly) a “done deal”; indeed, they are open exploration, so that multiple parties CAN participate in developing solutions.

The bar for starting a Working Group, Interest Group, or moving a specification from a WG to Recommendation status is, of course, much higher.

Stating things like “the overriding feeling among many is that Google has largely ignored the feedback provided by industry and business stakeholders” as if it were fact seems very much like saying “many people are saying”; without data, it’s impossible to objectively find solutions. You don’t like our solutions; great, please participate and try to actually solve the problem of providing marketing tools that do not need to violate users’ privacy. We are actively trying to seek common ground, while you appear to simply be against putting users in control of their own privacy, and attempting to block any attempt to incubate and find solutions that might address the raft of privacy concerns that permeate the Web today. As I have previously stated, blaming Google for the need to find user-privacy-centric solutions to marketing is a red herring; the market (and the other browsers, who have ALREADY made the changes you are attacking Google for) have already spoken on the need to find better user-centric solutions.

Your raising of the spectre of anti-trust seems misplaced, as the pointer you yourself provided in your article makes clear the need to have open collaboration - which your attempt to block discussing FLoC or other privacy-focused tools for marketing would, in fact, prevent.

Perhaps Yahoo Japan could explain how they’ve been able to mitigate the antitrust issues?

Mentioning antitrust at this early stage of incubation seems premature, and as such it is possible that the statement could be perceived as intimidating in nature.

Whether this was the intent or not, please remember that this CG operates under the W3C Code of Ethics and Professional Conduct.

Thanks.

You don’t like our solutions; great, please participate and try to actually solve the problem of providing marketing tools that do not need to violate users’ privacy.

See the work of Project Rearc and the IAB ‘Post Third-Party Cookie’ Taskforce, where I’m an active participant. At the W3C I edited success criteria for improved web advertising and a questionnaire to help proposers understand the wide impact of a proposal. These groups involve the Partnership for Responsible Addressable Media (PRAM) and many other organisations from many sectors, providing far greater diversity than I’ve seen in WICG.

We are actively trying to seek common ground, while you appear to simply be against putting users in control of their own privacy, and attempting to block any attempt to incubate and find solutions that might address the raft of privacy concerns that permeate the Web today.

The UK Competition and Markets Authority has recommended pseudonymous identifiers. The EPC and PRAM also favour pseudonymous identifiers. Connected TVs all have pseudonymous identifiers. I’m still reading the EU Digital Markets Act [1] published yesterday. I find footnote 1 interesting as it reads “Such tracking and profiling of end users online is as such not necessarily an issue, but it is important to ensure that this is done in a controlled and transparent manner, in respect of privacy, data protection and consumer protection.” I agree with this sentence. In the case of the UK I believe my elected parliamentarians should decide what is right for people and society. They have an opportunity to do that via the introduction of the Digital Markets Unit legislation. They must have the chance to make these decisions. That is why I started Marketers for an Open Web, to ensure they get this opportunity.

Perhaps the question should be: why does Google pursue solutions that place more personal information into the hands of a smaller number of US trillion-dollar oligopolies, and in doing so advance a direction that does not appear to be compatible with the position of regulators and a broad representation of web stakeholders?

In relation to antitrust, I have raised a number of urgent issues concerning the W3C Process and the Membership Agreement. Of particular significance is the fact that WICG (or any community or business group) cannot create standards. Therefore the antitrust protections afforded the standards process do not apply. Advancing a proposal in this forum that is anti-competitive comes with risks, and I’m interested to know how these are being mitigated.

[1] https://www.mlex.com/Attachments/2020-12-15_M89U43IH3191RFXG/201215 proposal-regulation-single-market-digital-services-digital-services-act_en.pdf


Please keep legal allegations out of technical discussion. It’s acceptable to ask how competitors can participate in the technology being discussed and to use these proposal discussions to improve interoperability for all, without accusing people of bad intent.


@RayAI thank you for highlighting this important development. I provide answers to the questions raised here in the context of that new information in W3C Process CG issue 469.

Hello, at Captify we’re considering partnering with some of the publishers we work with for this trial. A few questions just to clarify things:

  1. For the purpose of the trial, would the clustering (a) be based solely on pages within the origins participating in the trial? Or (b) would it be based on the browsing behaviour of users as a whole, as long as the conditions defined in the proposal are met? I’m asking because the answer would have implications for which publishers are a good fit for this trial. Worth noting that case (a) ties directly into the scenario described in this issue.

  2. For the scenario where users don’t fulfil all the qualifying conditions, what would happen during the trial? Would the API call simply return an empty string, or would it return a random ID?

  3. The proposal mentions the following

To further enhance user privacy, we will also experiment with adding noise to the output of the hash function, or with occasionally replacing the user’s true cohort with a random one

To set expectations, it would be helpful to know the extent of the noise being added (whether during hashing or when returning a random ID) to protect users’ privacy during the trial.

thanks in advance.

Why should matters related to compliance with laws that are relevant to the implementation of a technical proposal in practice not be discussed openly? I believe that such a discussion is not only essential but also required under many laws. I’m troubled that you continue to advise otherwise, or imply that in doing so I’m in breach of some rule.


As the EFF says, adding a new profile tag, as FLoC does, actually adds information being sent to advertisers that was not sent before; thus it is not improving but reducing privacy. Prior to FLoC, third-party cookies would only reveal browsing history on sites that allowed harvesting data using them. FLoC instead will be able to grab every site a browser visits. Got diabetes? That will be utilized. Political leaning? That will be utilized. Expectant mother? Visit dating sites? Looking for work? Contacted a lawyer? All will be utilized to build your profile. While it is completely understandable that the biggest data aggregator would love this approach, advertisers, I believe, need to choose this kind of enhanced profiling OR go back to what studies have shown is far more efficient for advertisers: contextual advertising. On a camera review site, here’s a camera ad. On a resort site? Here’s a travel ad. Yes, it would undoubtedly impact the giants, but if it’s better for privacy and better for advertisers, I believe it should be given serious consideration.


Hello @GeekZero, thanks for your interest.

Fortunately, we designed FLoC specifically to not have all the properties that you’re worried it might have.

Prior to FLoC, third party cookies would only reveal browsing history on sites that allowed harvesting data using them. FLoC instead will be able to grab every site a browser visits.

When fully launched, calculation of the FLoC will only be based on sites that explicitly opt to use the API. During testing, we’re using our best guess for which sites will adopt, and that is exactly the sites that already load ads-related resources today.

Got Diabetes? That will be utilized…

We filter the possible FLoC values and eliminate any that reveals information about sensitive browsing categories, which are listed here.

go back what studies have shown is so more efficient for advertisers: contextual advertising.

Unfortunately, study after study after study shows that most of the sites on the web would lose 50%–70% of their revenue with the model that you suggest. Certainly not all publishers would lose out; your examples of camera review sites and resort sites might indeed do just fine (if they still had visitors). But news sites, for example, get revenue from showing ads for cameras or travel even when people are reading stories about unrelated topics.

Has there been any documentation on how a site’s decision to use or opt out of FLoC will affect rankings in Google Search and Google News? (I realize that it’s a large company and there are a lot of places to look, and would appreciate any pointers.)

What? Why would these have anything to do with each other? I don’t see how the use or non-use of FLoC offers any information about the page.

I guess if there were something like FLoC in which the page itself declared what topics it was about, that might be of interest to people crawling/indexing the web? But that’s not the API that we’ve described or developed or incubated.

Will users be asked to opt-in to this tracking on the browser level?

Bit of an odd question; I hope this is the right forum. I can’t seem to get at the FLoC API in the origin trial.

I’ve been trying to play around with the FLoC origin trial, and I followed this blog: How to take part in the FLoC origin trial - Chrome Developers, to create a localhost token. When that didn’t work I tried the instructions for setting flags. I still cannot hit document.interestCohort(). I tried browsing the FLoC WICG GitHub for ideas; is there another source of information or instructions so that I can access and test the FLoC API?

Hi Michael

I followed this blog: How to take part in the FLoC origin trial - Chrome Developers, to create a localhost token. When that didn’t work …

Sorry to hear that didn’t work. What happened?

BTW — not the origin trial, but you can try floc.glitch.me just to see document.interestCohort() in action (but make sure to follow the flag instructions).
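To show what a call looks like once everything is in place, here is a minimal sketch; the wrapper name getCohort is my own, and the shape of the resolved value follows the FLoC explainer ({ id, version }). The call can reject (for example when the cohort is blocked), so it is wrapped defensively:

```javascript
// Hypothetical helper (my own name, not part of any spec): read the user's
// FLoC cohort, returning null wherever the API is unavailable or blocked.
async function getCohort() {
  // document.interestCohort() only exists in Chrome with FLoC enabled,
  // and only in a secure context. It rejects if the cohort cannot be
  // given out (e.g. third-party cookies disabled, Permissions-Policy opt-out).
  if (typeof document === 'undefined' || !('interestCohort' in document)) {
    return null; // API not exposed in this browser/context
  }
  try {
    const { id, version } = await document.interestCohort();
    return { id, version };
  } catch (e) {
    return null; // cohort unavailable for this user
  }
}
```

If this returns null on a page where you expect a cohort, the flag setup, secure context, or trial eligibility discussed below is the usual culprit.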

Sam

I hosted a local website, just a simple html page, and added the token for that local domain and port via the meta tag route in the head. However the floc api was not there.

To attempt flags: I am on Windows, and the Target line for the Chrome shortcut has the exe in quotes. I don’t know if this affects the flags, but it did allow me to add them at the end after the quotes. Still, I could not hit the API after relaunching Chrome with these flags.

floc.glitch.me seems to require Chrome Canary, though I thought the origin trial is meant to run on standard 89+? Happy to grab Canary if it will let me test; I just wanted to confirm this. My current Chrome browser is 89; is that not sufficient for the origin trial?

Thanks!

I hosted a local website, just a simple html page, and added the token for that local domain and port via the meta tag route in the head. However the floc api was not there.

When you say ‘local’, do you mean you’re serving a page from localhost? You’ll need to run FLoC from a secure context, so I think you would need to use HTTPS locally.

Also, note that the origin trial will only enable FLoC for a percentage of users, and only in certain regions. (This blog post has more details.)
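For reference, the meta-tag route looks like the sketch below. The token string is a placeholder; a real token comes from the Origin Trials registration page, and it is bound to the exact origin you registered (scheme, host, and port included), which is one reason a token minted for one local setup can silently fail under another:

```html
<!-- Origin-trial token for this exact origin. "TOKEN_GOES_HERE" is a
     placeholder; paste the token issued by the registration page. -->
<meta http-equiv="origin-trial" content="TOKEN_GOES_HERE">
```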

To attempt flags: I am on windows

Do the steps here work for you?

floc.glitch.me seems to require chrome canary

Oops — you’re right. I’ve updated the text on the demo page. You can use Chrome 89+.

Yes, those were the steps I attempted. Attempting them on regular Chrome I don’t see a change, and on Canary it tells me the flag is not supported. I should mention the Target field is slightly different, as it contains the entire path with quotes, unlike the flag instructions, which imply it is merely the exe name; I don’t know if this matters or if the instructions were written for a different version of Windows.

As for localhost: the Origin Trials page specifically says it can handle localhost with HTTP if you hover over the (?) next to the domain line, unless I misunderstood?

I’ve also tried floc.glitch.me, with flags and without, to no avail; it tells me my browser does not support it, but does not indicate a reason. Do you have any ideas?

Ah, sorry, one more question; I know I’m asking quite a few. I’m not sure I understand the blog post: does it mean that I, as a user, may not have FLoC active even if the browser or site does? If this is the case, how can I test the API?

As a user, in Chrome 89, FLoC will be blocked if you’ve disabled third-party cookies in Chrome settings: a site will not be able to access your cohort. From Chrome 90 (Stable release on Tuesday, 13 April) users can opt out of FLoC and other Privacy Sandbox proposals via chrome://settings/privacySandbox. (You can try this out now in Canary with the floc.glitch.me demo.)

A website can set a Permissions-Policy: interest-cohort=() response header so a page visit is not included in the calculation done by the browser to work out its cohort.