A partial archive of discourse.wicg.io as of Saturday February 24, 2024.

[Proposal] Federated Learning of Cohorts (FLoC)

jkarlin
2020-05-18

This is an API to enable interest-based advertising on the web without the need for third-party cookies. With FLoC, companies who today observe the browsing behavior of individuals instead observe the behavior of a cohort (or “flock”) of similar people. This increases user privacy while providing relevant ads to users.

See the explainer for more detailed information.

michaelkleber
2020-10-22

A team of engineers from Google Research and Google Ads have just published a whitepaper, “Evaluation of Cohort Algorithms for the FLoC API”, which evaluates several decentralized approaches to clustering that are compatible with the constraints in the FLoC proposal. They report encouraging results after applying the clustering techniques to two public datasets and to Google Ads proprietary data.

shigeki
2020-12-07

We are glad to learn from the white paper that the current SimHash technique in FLoC produces good cohorts.

We (Yahoo! JAPAN) provide targeted advertising services to customers and deliver ads based on interest categories. We are very interested in evaluating the cohort’s signals from FLoC against our audience data and in working together to make this FLoC technology safe for privacy.

It would be great to move FLoC into the WICG and to conduct trials.
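For readers unfamiliar with the SimHash technique mentioned above, here is a minimal sketch of the general idea: each visited domain is hashed to a bit string, every bit position collects +1/-1 votes, and the sign of each tally becomes one output bit, so similar domain sets tend to share many bits. The function names (`featureHash`, `simHash`), the choice of hash, and the use of a plain domain list as input are illustrative assumptions, not the algorithm evaluated in the whitepaper or shipped in Chrome.

```typescript
// Illustrative SimHash sketch. All names here are hypothetical; the hash
// below (FNV-1a plus a bit-mixing finalizer) is a stand-in, not FLoC's.

// Hash a string to an unsigned 32-bit integer.
function featureHash(text: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  // Finalizer so that every output bit depends on the whole input.
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return h >>> 0;
}

// Combine per-domain hashes into one `bits`-wide fingerprint (bits <= 32).
function simHash(domains: string[], bits: number = 8): number {
  const hashes = domains.map(featureHash);
  let result = 0;
  for (let bit = 0; bit < bits; bit++) {
    // Each domain votes +1 if its hash has this bit set, otherwise -1.
    let tally = 0;
    for (const h of hashes) {
      tally += ((h >>> bit) & 1) === 1 ? 1 : -1;
    }
    // The output bit is set only when the positive votes win.
    if (tally > 0) {
      result |= 1 << bit;
    }
  }
  return result >>> 0;
}
```

Because the tallies are sums, the result does not depend on the order in which domains were visited, only on which domains are present.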

michaelkleber
2020-12-07

Thank you Shigeki Ohtsu, we’re happy to hear of your interest. We’ll get FLoC moved over into WICG for further incubation.

Our first experimental version of FLoC is not yet code-complete in Chrome. So the earliest that an Origin Trial could begin would be in the forthcoming Chrome M89, if we get everything completed by its branch date in January.

yoavweiss
2020-12-07

Great to see this getting industry support! The repo now lives under https://github.com/WICG/floc

Happy incubation!!

jwrosewell
2020-12-14

Is there more support for this proposal than Yahoo Japan?

I don’t think one other company could be considered industry support. Here is a piece I wrote in AdExchanger that more accurately summarises the situation.

Perhaps Yahoo Japan could explain how they’ve been able to mitigate the anti trust issues?

cwilso
2020-12-16

Speaking for myself and in my role as WICG chair, not speaking for my employer.

James, as previously mentioned, the bar for getting IN to incubation at the WICG is intentionally low - that is, there must be SOME industry support. Note that this is actually a HIGHER bar than for the W3C’s Community Group system as a whole, where one merely needs five people (even from the same Member) to start a whole new community group. Incubations are not intended to represent (to counter the point you have espoused repeatedly) a “done deal”; indeed, they are open exploration, so that multiple parties CAN participate in developing solutions.

The bar for starting a Working Group, Interest Group, or moving a specification from a WG to Recommendation status is, of course, much higher.

Stating things like “the overriding feeling among many is that Google has largely ignored the feedback provided by industry and business stakeholders” as if it were fact seems very much like saying “many people are saying”; without data, it’s impossible to objectively find solutions. You don’t like our solutions; great, please participate and try to actually solve the problem of providing marketing tools that do not need to violate users’ privacy. We are actively trying to seek common ground, while you appear to simply be against putting users in control of their own privacy, and attempting to block any attempt at incubation and at finding solutions that might address the raft of privacy concerns that permeate the Web today. As I have previously stated, blaming Google for the need to find user-privacy-centric solutions to marketing is a red herring; the market (and the other browsers, who have ALREADY made the changes you are attacking Google for) have already spoken on the need to find better user-centered solutions.

Your raising of the spectre of anti-trust seems misplaced, as the pointer you yourself provided in your article makes clear the need to have open collaboration - which your attempt to block discussing FLoC or other privacy-focused tools for marketing would, in fact, prevent.

LJWatson
2020-12-16

Perhaps Yahoo Japan could explain how they’ve been able to mitigate the anti trust issues?

Mentioning anti trust at this early stage of incubation seems premature, and as such it is possible that the statement could be perceived as intimidating in nature.

Whether this was the intent or not, please remember that this CG operates under the W3C Code of Ethics and Professional Conduct.

Thanks.

jwrosewell
2020-12-16

You don’t like our solutions; great, please participate and try to actually solve the problem of providing marketing tools that do not need to violate users’ privacy.

See the work of Project Rearc and the IAB ‘Post Third-Party Cookie’ Taskforce where I’m an active participant. At the W3C I edited success criteria for improved web advertising and a questionnaire to help proposers understand the wide impact of a proposal. These groups involve the Partnership for Responsible Addressable Media (PRAM) and many other organisations from many sectors providing far greater diversity than I’ve seen in WICG.

We are actively trying to seek common ground, while you appear to simply be against putting users in control of their own privacy, and attempting to block any attempt at incubation and at finding solutions that might address the raft of privacy concerns that permeate the Web today.

The UK Competition and Markets Authority has recommended pseudonymous identifiers. The EPC and PRAM also favour pseudonymous identifiers. Connected TVs all have pseudonymous identifiers. I’m still reading the EU Digital Markets Act [1] published yesterday. I find footnote 1 interesting as it reads “Such tracking and profiling of end users online is as such not necessarily an issues, but it is important to ensure that this is done in a controlled and transparent manner, in respect of privacy, data protection and consumer protection.” I agree with this sentence. In the case of the UK I believe my elected parliamentarians should decide what is right for people and society. They have an opportunity to do that via the introduction of the Digital Markets Unit legislation. They must have the chance to make these decisions. That is why I started Marketers for an Open Web to ensure they get this opportunity.

Perhaps the question should be why does Google pursue solutions that place more personal information into the hands of a smaller number of US trillion dollar oligopolies and in doing so advance a direction that does not appear to be compatible with the position of regulators and a broad representation of web stakeholders?

In relation to Antitrust I have raised a number of urgent issues in relation to the W3C Process and the Membership Agreement. Of particular significance is the fact that WICG (or any community or business group) cannot create standards. Therefore the antitrust protections afforded the standards process do not apply. Advancing a proposal in this forum that is anti competitive comes with risks and I’m interested to know how these are being mitigated.

[1] https://www.mlex.com/Attachments/2020-12-15_M89U43IH3191RFXG/201215 proposal-regulation-single-market-digital-services-digital-services-act_en.pdf

wseltzer
2020-12-16

Please keep legal allegations out of technical discussion. It’s acceptable to ask how competitors can participate in the technology being discussed and to use these proposal discussions to improve interoperability for all, without accusing people of bad intent.

jwrosewell
2020-12-17

@RayAI thank you for highlighting this important development. I provide answers to the questions raised here in the context of that new information in W3C Process CG issue 469.

Martin_Gruau
2020-12-21

Hello, at Captify we’re considering partnering with some of the publishers we work with for this trial. A few questions just to clarify things:

  1. For the purpose of the trial, would the clustering (a) be based solely on pages within the origins participating in the trial? Or (b) would it be based on the browsing behaviour of users as a whole, as long as the conditions defined in the proposal are met? I’m asking because the answer would have implications for which publishers are a good fit for this trial. Worth noting that case (a) ties in directly with the scenario described in this issue.

  2. For the scenario where users don’t meet all the qualifying conditions, what would happen during the trial? Would the API call simply return an empty string? Or would it return a random ID?

  3. The proposal mentions the following

To further enhance user privacy, we will also experiment with adding noise to the output of the hash function, or with occasionally replacing the user’s true cohort with a random one

To set expectations it would be helpful to know the extent of noise being added (whether during hashing or when returning a random ID) to protect users’ privacy during the trial.

Thanks in advance.
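To make question 3 concrete, the "occasionally replacing the user’s true cohort with a random one" option quoted above can be sketched as follows. The function name `reportCohort` and its parameters `cohortCount` and `replaceProbability` are hypothetical; the thread does not say what noise level, if any, the trial uses.

```typescript
// Illustrative sketch only: with probability `replaceProbability`, report a
// uniformly random cohort instead of the true one. All names and parameters
// are assumptions; the proposal does not specify them.
function reportCohort(
  trueCohort: number,
  cohortCount: number,
  replaceProbability: number,
  random: () => number = Math.random
): number {
  if (random() < replaceProbability) {
    // Draw a replacement uniformly from [0, cohortCount).
    return Math.floor(random() * cohortCount);
  }
  return trueCohort;
}
```

Under this scheme an observer cannot be certain that any single reported value is the user’s real cohort, and a higher `replaceProbability` trades targeting accuracy for deniability, which is why the size of that parameter matters for setting expectations.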

jwrosewell
2020-12-23

Why should matters related to compliance with laws that are relevant to the implementation of a technical proposal in practice not be discussed openly? I believe that such a discussion is not only essential but also required under many laws. I’m troubled you continue to advise otherwise or imply in doing so I’m in breach of some rule.

GeekZero
2021-03-12

As the EFF says, adding a new profile tag, as FLoC does, actually adds information being sent to advertisers that was not sent before, so it reduces privacy rather than improving it. Prior to FLoC, third party cookies would only reveal browsing history on sites that allowed harvesting data using them. FLoC instead will be able to grab every site a browser visits. Got Diabetes? That will be utilized. Political leaning? That will be utilized. Expectant mother? Visit dating sites? Looking for work? Contacted a lawyer? All will be utilized to build your profile.

While it is completely understandable that the biggest data aggregator would love this approach, advertisers, I believe, need to choose this kind of enhanced profiling OR go back to what studies have shown is more efficient for advertisers: contextual advertising. On a camera review site, here’s a camera ad. On a resort site? Here’s a travel ad.

Yes, it would undoubtedly impact the giants, but if it’s better for privacy and better for advertisers, I believe it should be under serious consideration.

michaelkleber
2021-03-12

Hello @GeekZero, thanks for your interest.

Fortunately, we designed FLoC specifically to not have all the properties that you’re worried it might have.

Prior to FLoC, third party cookies would only reveal browsing history on sites that allowed harvesting data using them. FLoC instead will be able to grab every site a browser visits.

When fully launched, calculation of the FLoC will only be based on sites that explicitly opt to use the API. During testing, we’re using our best guess for which sites will adopt, and that is exactly the sites that already load ads-related resources today.

Got Diabetes? That will be utilized…

We filter the possible FLoC values and eliminate any that reveals information about sensitive browsing categories, which are listed here.

go back to what studies have shown is more efficient for advertisers: contextual advertising.

Unfortunately, study after study after study shows that most of the sites on the web would lose 50%-70% of their revenue with the model that you suggest. Certainly not all publishers would lose out; your examples of camera review sites and resort sites might indeed do just fine (if they still had visitors). But news sites, for example, get revenue from showing ads for cameras or travel, even when people are reading stories about unrelated topics.

dmarti
2021-03-17

Has there been any documentation on how a site’s decision to use or opt out of FLoC will affect rankings in Google Search and Google News? (I realize that it’s a large company and there are a lot of places to look, and would appreciate any pointers.)

michaelkleber
2021-03-19

What? Why would these have anything to do with each other? I don’t see how the use or non-use of FLoC offers any information about the page.

I guess if there were something like FLoC in which the page itself declared what topics it was about, that might be of interest to people crawling/indexing the web? But that’s not the API that we’ve described or developed or incubated.

awesomerobot
2021-04-06

Will users be asked to opt in to this tracking at the browser level?

MichaelLysak
2021-04-06

Bit of an odd question; I hope this is the right forum. I can’t seem to get at the FLoC API in the origin trial.

I’ve been trying to play around with the FLoC origin trial and I followed this blog: How to take part in the FLoC origin trial - Chrome Developers to create a localhost token. When that didn’t work I tried the instructions for setting flags. I still cannot hit document.interestCohort(). I tried browsing the FLoC WICG GitHub repo for ideas. Is there another source of information or instructions so that I can access and test the FLoC API?

Sam_Dutton
2021-04-06

Hi Michael

I followed this blog: How to take part in the FLoC origin trial - Chrome Developers to create a local host token. When that didn’t work …

Sorry to hear that didn’t work. What happened?

BTW — not the origin trial, but you can try floc.glitch.me just to see document.interestCohort() in action (but make sure to follow the flag instructions).

Sam
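When debugging a setup like the one described above, it can help to separate "the API is not exposed at all" from "the API is exposed but no cohort is available". The following is a minimal feature-detection sketch, assuming that `document.interestCohort()` returns a promise resolving to an object with `id` and `version` fields and that the promise may reject when no cohort can be computed. The `InterestCohort` and `CohortCapableDocument` types and the `readCohort` helper are hypothetical names introduced for illustration.

```typescript
// Hypothetical types describing the experimental API as assumed here.
interface InterestCohort {
  id: string;
  version: string;
}

interface CohortCapableDocument {
  interestCohort?: () => Promise<InterestCohort>;
}

// Returns the cohort, or null when the API is missing or the call fails.
async function readCohort(
  doc: CohortCapableDocument
): Promise<InterestCohort | null> {
  if (typeof doc.interestCohort !== "function") {
    // No origin trial token or flag in effect, or an unsupported browser.
    return null;
  }
  try {
    return await doc.interestCohort();
  } catch {
    // Assumed failure mode: the browser declined to provide a cohort.
    return null;
  }
}

// In a page, one might call: readCohort(document as CohortCapableDocument)
```

A `null` result from the first branch points to a token or flag problem, while a `null` from the `catch` branch suggests the API is reachable but the browser has no cohort to report.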