Privacy Sandbox for Web: The Changing Privacy Landscape and Impact to Your Sites
Chrome will be making privacy changes through the Privacy Sandbox initiative throughout 2023 while building new technology to keep user information private. Simultaneously, web publishers and brands are shifting their digital strategy to help preserve ad revenue and valuable marketing analytics that rely on third-party cookies.
This push toward expanded browser privacy is driving more demand for website personalization.
In this session, Google Developer Advocate Sam Dutton walks through the changes that are coming, shares the goals of the Privacy Sandbox initiative, and helps you better understand how you can pivot to ensure you have the data you need to help keep your business and sites moving forward.
SAM DUTTON: Hi, I’m Sam Dutton. I’m a Developer Advocate with the Chrome team based here in London. Thank you so much for joining me today. So three things I’ll do in the next 25 minutes. I’ll give you an overview of the Privacy Sandbox APIs. I’ll explain what you need to do now, and I’ll show you how you can become a tester and join discussion of the APIs and provide feedback.
So let me begin by explaining why we need the Privacy Sandbox. Many of you will know the back story all too well, but it’s worth quickly reiterating why we need this and how we got to where we are today. So the Privacy Sandbox is an initiative to help build a set of privacy preserving APIs to support business models that fund the open web for a future without tracking mechanisms like third party cookies.
Now you might have seen this example from Google I/O. It’s a typical site with components from different sources. And of course, composability is one of the web’s superpowers. You’ve got a map from one origin, some script from another and so on, and of course, advertising. And whether we like it or not, and whatever the future holds, advertising has become a crucial source of revenue and a driver for business on the web.
Now at this point in history, I think browsers and CMSs need to support advertising use cases. So what’s the problem? Well, ad selection, conversion measurement, fraud detection, device customization, lots of other use cases have relied on cross site identity using mechanisms that just weren’t built with privacy in mind.
Now, not just third party cookies, but fingerprinting is used to track user behavior across sites, or else sites request personal information, such as email addresses. In addition, third party ecosystems are really complex, especially for advertising. Not even developers, advertisers or publishers understand the supply chain for third party services.
So certainly when I visit a website, I’m not aware of all the third parties involved and what they’re doing with my data and it’s not just me, research shows that people really care about being in control of their data. Privacy concerns increasingly drive choices about what people do online and regulators around the world are stepping up privacy requirements, and this is happening really quickly.
So given how many businesses rely on effective advertising online and how many publishers rely on advertising to monetize their sites, and a whole bunch of other use cases, this is a problem for the whole web ecosystem and not just for tech companies and ad platforms. But of course, because the web is an open platform, proposals for change need buy in and feedback, and browsers such as Chrome cannot and don’t want to act unilaterally.
Browsers are not products for which browser vendors can make decisions in isolation, and the reality is that the web was not designed for many of the requirements that are core to the platform today: advertising, fraud detection, identity management and all these other use cases. So what we need is purpose built technologies for this privacy focused web, and that’s where the Privacy Sandbox comes in.
So Chrome has been working with the web community alongside industry stakeholders and regulators to develop new privacy preserving technologies that can support a healthy, sustainable ecosystem. Now, once these new purpose built APIs are available, we need to make sure companies have time to adopt them so that we can safely phase out support for third party cookies in Chrome and continue our work to mitigate other types of tracking.
Now, the core set of principles for this initiative is the potential privacy model for the web, and this has been developed by privacy experts and computer scientists at Google. This privacy model lays out a set of ground rules for designing technologies that meet the web platform use cases I’ve talked about, while also abiding by our changing privacy needs.
In particular, the proposal covers the difficult question of how to enable connections across sites without compromising privacy. Now, one of the major innovations of the Privacy Sandbox APIs is to enable the browser to act on the user’s behalf, in a sense getting back to the browser’s core role as what we call a user agent.
With current technologies, data is collected, aggregated and shared by third parties to track user browsing across sites. Privacy Sandbox APIs can allow ad auctions, conversion measurement and these other tasks to be fulfilled by the user’s browser on the user’s device.
So we need to rebuild ad platforms and the web with collaboration across browser vendors, platforms, advertisers, publishers, adtech, users, regulators and the privacy community, and not least developers like you working with CMS platforms.
So with all that in mind, I just want to give you a whistle stop tour of the Privacy Sandbox APIs themselves. So at Google, this is a shared initiative across the web and Android. Privacy Sandbox on Android is focused on introducing new more private advertising solutions without cross app identifiers.
The web and Android, of course, share the same principles and several of the web proposals are being developed for Android as well. However, of course, the web and Android mobile platforms rely on fundamentally different technologies.
So Privacy Sandbox on Android is a distinct initiative, but it’s one that those of you building Android apps as well as working on the web will want to keep an eye on. So Google has been testing the new APIs in collaboration with a range of partners globally.
Hundreds of companies are participating in public forums, such as the W3C, the explainer issues on GitHub and so on, publishing perspectives and analysis, joining industry round tables, sharing feedback with Chrome and Android and of course, participating in testing.
Now make no mistake, Privacy Sandbox has a lot of requirements to cover and it is going to be tough along the way. I mean, I think the good news is that at the end of all this we’ll have platforms that are safer and more private for users and better for advertisers, publishers, developers and of course, for platforms like WordPress.
So I’m not going to describe all the Privacy Sandbox APIs. Instead I would like to focus on the three main advertising APIs in the Privacy Sandbox. That’s Topics, FLEDGE, and Attribution Reporting. Topics and FLEDGE are known as the relevance APIs.
Now Topics provides high level signals of a user’s interest based on their recent browsing history. And Topics can be combined with contextual signals and first party data to select relevant ads.
And FLEDGE supports more granular remarketing and custom audience use cases where marketers want to reach audiences who’ve shown interest in specific websites or products, but of course, to make that possible in a privacy preserving way.
Lastly, Attribution Reporting is Chrome’s proposal for privacy preserving campaign measurement, providing anonymized performance reports when people view or click on an ad, and then later go on to complete a purchase or some other kind of conversion.
So these APIs have been through a period of testing in Android and in Chrome on desktop and mobile. If you are working with ad tech platforms, you need to make sure you understand those platforms’ plans to address the use cases that are being met by these APIs, for this future without third party cookies or other tracking mechanisms.
So we’ve had a period of technical testing with the APIs activated using Chrome flags, and then in an origin trial, initially activated for only a small percentage of Chrome users. So now we’re at this stage of utility testing: 50% of Chrome Canary, Dev and Beta users have the ads origin trial APIs activated on pages that provide a valid token, and 5% of Stable users.
Now that’s a small percentage of overall Chrome traffic of course, but it’s enough for limited testing of the APIs with real users. And we’re now moving towards launch in Chrome Stable where the APIs will be available for all users by default and I’ll come back to the timelines for that later.
So just to reiterate, for a single user you can activate the APIs using Chrome flags but for testing at scale, you need to take part in the Privacy Sandbox origin trial, and I’ll share links later for guidance on how to do all that.
So by the way, Chrome is also updating user privacy controls, like the UI for this, and the Privacy Sandbox controls are actually available as part of the ads APIs origin trial. People will be able to see and manage the interests associated with their browsing or turn off the APIs entirely.
So there are actually three other Privacy Sandbox technologies that I think you may also want to test or certainly flag up to any of your third party providers. Firstly, CHIPS. That’s Cookies Having Independent Partitioned State, which allows developers to opt in a cookie to partitioned storage with a separate cookie jar per top level site.
Then First-Party Sets allows related domain names owned and operated by the same entity to declare themselves as belonging to the same first party. And lastly, Private State Tokens. You might have heard of this under its initial name, Trust Tokens. This is an API to convey a limited amount of information from one browsing context to another, for example across sites, to help combat fraud but without using passive tracking techniques.
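To make the CHIPS opt-in a little more concrete, here is a minimal sketch of how a third-party service might build a Set-Cookie header value for a partitioned cookie. The Partitioned attribute is the opt-in defined by the CHIPS proposal. The helper name, the cookie name and the cookie value are made up for illustration and do not come from the talk.

```typescript
// Hypothetical helper: builds a Set-Cookie header value that opts a cookie in
// to partitioned storage (CHIPS). Partitioned cookies must also be Secure.
function buildPartitionedCookie(cookieName: string, cookieValue: string, maxAgeSeconds: number): string {
  return [
    `${cookieName}=${encodeURIComponent(cookieValue)}`,
    "Secure",
    "Path=/",
    "SameSite=None", // the cookie is sent in cross-site (embedded) contexts
    `Max-Age=${maxAgeSeconds}`,
    "Partitioned", // one cookie jar per top-level site
  ].join("; ");
}

// Illustrative cookie for an embedded widget.
const partitionedCookieHeader = buildPartitionedCookie("__Host-widget-session", "abc123", 3600);
console.log(`Set-Cookie: ${partitionedCookieHeader}`);
```

With a header like this, the embedded service would get a separate cookie for each top level site it is embedded on, rather than one cookie shared across all of them.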
So first up, let’s take a deeper look at the Topics API. The Topics API provides a mechanism to enable interest based advertising, but without allowing third parties to track user browsing activity. So the API in a sense has three major components, and first up interest based advertising needs a taxonomy of topics of interest.
The Topics API taxonomy looks like this. It’s a publicly maintained human readable list of topics that avoids sensitive topics. And now this is likely to change and develop over time in consultation with the web ecosystem and that means people like you, we need your feedback with this as well as everything else.
So the Topics API needs to infer interests for a user based on their browsing activity, but as I say, to do that in a way that preserves their privacy. So the top topics of interest are recorded for a user in their browser on their device based on their recent browsing activity again, by their browser on their device.
Now at present, Topics does that by using machine learning to map the host names of pages the user visits to Topics from the taxonomy. Now as with the Topics taxonomy itself, that approach will develop over time. But inferring interests from browsing activity needs to get the balance right.
If you have too much detail about user browsing well, that’s bad for privacy, but too little granularity means the API isn’t useful. I think in a sense the main thing to understand here is that topics of interest are just one signal for finding what’s relevant to users.
So now once topics of interest have been inferred by the browser for a user, Topics needs to provide API callers with access to the topics of interest that they’ve observed for the user.
So as the user navigates the web, there are two stages to the API. An API caller, which might be an adtech platform for example, calls the API on a page to signal that they want to observe topics for the current page and the current user.
Now later, the API caller can access the topics that they observed for the user. Now all of this must be done without revealing anything more about the user’s browsing activity other than the topics of interest that were observed.
The first way that a Topics API caller can signal to the browser that it has observed topics for a user is to call document.browsingTopics from an iframe embedded on sites the user visits.
Now later the API caller can call the same document.browsingTopics method to access topics that it has observed for the current user. And the reason this method needs an iframe by the way, is that the context for observing Topics must be the same as the context for accessing Topics.
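The iframe-based call described above could be sketched roughly as follows. This is an illustration under stated assumptions, not Chrome's reference code: the document object is passed in through a locally declared interface so the sketch also runs outside a browser, the topic field name follows the Topics explainer as I understand it, and the fake document and its topic IDs are invented for the example.

```typescript
// One entry returned by document.browsingTopics(). Only `topic` is used here;
// the version fields are optional in this sketch and may change over time.
interface BrowsingTopic {
  topic: number; // ID from the Topics taxonomy
  taxonomyVersion?: string;
  modelVersion?: string;
}

// Minimal view of `document`, declared locally so the sketch type-checks
// without browser typings for this API.
interface TopicsCapableDocument {
  browsingTopics?: () => Promise<BrowsingTopic[]>;
}

// Calls the API if it is available. As described in the talk, the same call
// signals observation and returns topics previously observed by this caller.
async function readObservedTopicIds(doc: TopicsCapableDocument): Promise<number[]> {
  if (typeof doc.browsingTopics !== "function") {
    return []; // API unavailable, or turned off by the user
  }
  try {
    const topics = await doc.browsingTopics();
    return topics.map((t) => t.topic);
  } catch {
    return []; // treat any failure as "no topics"
  }
}

// Stand-in for the real document (made-up topic IDs), for trying this locally.
const fakeTopicsDocument: TopicsCapableDocument = {
  browsingTopics: async () => [{ topic: 43 }, { topic: 289 }],
};
const observedTopicIds = readObservedTopicIds(fakeTopicsDocument);
observedTopicIds.then((ids) => console.log(ids));
```

In the caller's iframe you would pass the real document object instead of the fake one. Returning an empty list on failure means topics are treated as an optional signal, which fits the point that topics are just one input for ad selection.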
The other way to observe and access Topics is to use fetch() with request and response headers. First the API caller needs to make a fetch request to a URL on its origin, including {browsingTopics: true} in the options parameter.
And if the response to the fetch request includes an Observe-Browsing-Topics: ?1 header, well, that signals to the browser that the caller wants the browser to record that the caller has observed the topics of interest for the current user for the current page. I hope that makes sense.
Now topics observed for a user can be retrieved from a caller’s fetch request by accessing the Sec-Browsing-Topics request header. So here’s the whole process from start to finish. I’m conscious of time, so I won’t go through it now, but we will share this later so you can see how that works, the whole process and we’ll have that for each of the APIs.
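The header-based flow just described can be sketched from the API caller's side. The browsingTopics fetch option and the two header names come from the talk. The handler function, the plain header map type and the placeholder header value are assumptions for illustration, and the header value is deliberately left unparsed because its exact format is defined by the API and may change.

```typescript
// Options the caller passes to fetch(); `browsingTopics: true` asks the browser
// to attach the Sec-Browsing-Topics request header.
const topicsFetchOptions = { browsingTopics: true };

type HeaderMap = Record<string, string>;

// Case-insensitive header lookup over a plain object.
function getHeader(headers: HeaderMap, headerName: string): string | undefined {
  const wanted = headerName.toLowerCase();
  for (const key of Object.keys(headers)) {
    if (key.toLowerCase() === wanted) return headers[key];
  }
  return undefined;
}

// Hypothetical server-side handler on the caller's origin: reads the topics the
// browser sent and replies with the header that marks them as observed.
function handleTopicsRequest(requestHeaders: HeaderMap): { topicsHeader: string | undefined; responseHeaders: HeaderMap } {
  return {
    topicsHeader: getHeader(requestHeaders, "Sec-Browsing-Topics"), // raw, unparsed
    responseHeaders: { "Observe-Browsing-Topics": "?1" },
  };
}

// Placeholder header value; a real value is set by the browser.
const topicsHandlerResult = handleTopicsRequest({ "sec-browsing-topics": "<value set by the browser>" });
console.log(topicsFetchOptions, topicsHandlerResult.responseHeaders);
```

On the page, the caller would pass something like topicsFetchOptions as the second argument to fetch() for a URL on its own origin.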
You can also run the Topics colab to test topics inference using the Topics classifier model. Now three major open questions for you before I leave Topics. How could we do a better job of inferring topics of interest for a user based on their browsing activity? How can we improve the taxonomy content and structure to make it more useful while preserving user privacy? And how can we improve the overall architecture of the API?
I think one thing to bear in mind here is whether we have Topics or something different, we still need to meet its use cases. Next up, FLEDGE. So this is an API for on-device ad auctions to serve remarketing and custom audience use cases without the need for cross-site third party tracking.
I think there’s a tiny bit more code detail with FLEDGE because it has a more complicated job to do than Topics. So there are three parts to the FLEDGE process. First, the ad buyer adds users, or rather individual browsers really, to what are called interest groups. These are like custom audiences, but interest group membership is stored on the browser on the user’s device.
Now at some point when a user visits a site that displays ads such as a publisher site, an ad seller can initiate an ad auction to select an ad for them, and with FLEDGE this auction can be run on the user’s device.
To select an ad, the auction code runs bidding logic from buyers and auction logic from the seller. And lastly, the browser does post auction reporting to end points supplied by the sellers and buyers.
The configuration object for an interest group might look like this. In this example, the shoe store’s ad tech might have an interest group for remarketing that they’d like to add the user to, and they’ve called this group, well, Trail Running Shoes. And the shoe store’s adtech platform calls navigator.joinAdInterestGroup() to ask the user’s browser to join their Trail Running Shoes interest group using that configuration that I just showed you.
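The slide itself is not reproduced in this transcript, so the following is only a sketch of what an interest group configuration and the join call might look like. The field names follow the FLEDGE explainer as I understand it at the time of the talk and may have changed since. The origin, URLs and group name are hypothetical, and navigator is passed in through a local interface so the sketch can run outside a browser.

```typescript
// Assumed shape of an interest group; check current documentation for exact names.
interface InterestGroup {
  owner: string; // origin of the ad buyer's platform
  name: string;
  biddingLogicUrl: string; // script the browser runs to bid in later auctions
  ads: { renderUrl: string }[];
}

// Minimal view of `navigator` for the buyer side.
interface InterestGroupNavigator {
  joinAdInterestGroup?: (group: InterestGroup, durationSeconds: number) => Promise<void>;
}

// The explainer caps interest group membership at 30 days.
const MAX_MEMBERSHIP_SECONDS = 30 * 24 * 60 * 60;

function buildTrailRunningGroup(buyerOrigin: string): InterestGroup {
  return {
    owner: buyerOrigin,
    name: "trail-running-shoes",
    biddingLogicUrl: `${buyerOrigin}/bid.js`,
    ads: [{ renderUrl: `${buyerOrigin}/ads/trail-shoes.html` }],
  };
}

// Asks the browser to store the membership on the device; returns false when
// the API is not available.
async function joinGroup(nav: InterestGroupNavigator, group: InterestGroup, durationSeconds: number): Promise<boolean> {
  if (typeof nav.joinAdInterestGroup !== "function") return false;
  await nav.joinAdInterestGroup(group, Math.min(durationSeconds, MAX_MEMBERSHIP_SECONDS));
  return true;
}

const trailRunningGroup = buildTrailRunningGroup("https://shoe-adtech.example");
console.log(trailRunningGroup);
```

In a real page the adtech script would pass the browser's navigator object to joinGroup. Nothing about the membership is sent to the adtech server by this call; it is recorded by the browser.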
Now this auction selects the most appropriate ad given bids for each of the interest groups the user’s browser belongs to along with other factors from the seller and the browser itself.
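As a rough mental model of the on-device auction, the toy simulation below runs buyer-supplied bidding logic and seller-supplied scoring logic and keeps the highest-scoring ad. It is not the browser's actual algorithm. The function names generateBid and scoreAd echo the worklet functions in the FLEDGE explainer, but the types, signals and data here are invented for illustration.

```typescript
interface Bid {
  interestGroup: string;
  renderUrl: string;
  bid: number;
}

interface ToyAuctionSignals {
  pageCategory: string;
}

// Buyer-supplied logic: returns a bid, or null to sit this auction out.
type BiddingLogic = (signals: ToyAuctionSignals) => Bid | null;
// Seller-supplied logic: a score of zero or less means the ad cannot win.
type ScoringLogic = (bid: Bid) => number;

function runToyAuction(bidders: BiddingLogic[], scoreAd: ScoringLogic, signals: ToyAuctionSignals): Bid | null {
  let winner: Bid | null = null;
  let bestScore = 0;
  for (const generateBid of bidders) {
    const bid = generateBid(signals);
    if (bid === null) continue;
    const score = scoreAd(bid);
    if (score > bestScore) {
      bestScore = score;
      winner = bid;
    }
  }
  return winner;
}

// Two hypothetical buyers; the seller blocks one ad URL and otherwise ranks by bid.
const toyBidders: BiddingLogic[] = [
  () => ({ interestGroup: "trail-running-shoes", renderUrl: "https://shoe-adtech.example/ad.html", bid: 2 }),
  () => ({ interestGroup: "other-group", renderUrl: "https://blocked.example/ad.html", bid: 5 }),
];
const toySellerScore: ScoringLogic = (bid) => (bid.renderUrl.startsWith("https://blocked.example") ? 0 : bid.bid);
const toyWinner = runToyAuction(toyBidders, toySellerScore, { pageCategory: "sports" });
console.log(toyWinner);
```

The point of the sketch is that both sides contribute logic, and the higher bid does not automatically win because the seller's scoring has the final say.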
Now looking at the code, the publisher or a platform selling ad space on the publisher site creates config data for the ad auction. The seller then asks the browser to run an ad auction to select an ad in the browser, and the value returned by navigator.runAdAuction() is passed to an element called a fenced frame so the site can display the winning ad.
Now a fenced frame can be used to display an ad but it cannot interact with the page around it. And then the seller and the winning buyer each have an opportunity to perform logging and reporting, and that’s done in the seller’s reportResult() function and the winning buyer’s reportWin() function.
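The seller-side steps described above might look something like this sketch. It assumes the API shape at the time of the talk, where runAdAuction() resolves to an opaque URN string, or null when there is no winner. The config field names follow the explainer as I understand it, and the origins and URLs are hypothetical. Both navigator and the frame are passed in through local interfaces so the sketch can run outside a browser.

```typescript
// Assumed shape of the seller's auction configuration.
interface AuctionConfig {
  seller: string;
  decisionLogicUrl: string; // the seller's scoring script
  interestGroupBuyers: string[]; // buyers allowed to bid
  auctionSignals?: Record<string, unknown>;
}

// Minimal view of `navigator` for the seller side.
interface AuctionNavigator {
  runAdAuction?: (config: AuctionConfig) => Promise<string | null>;
}

// Anything with a `src`, such as a fenced frame element.
interface FrameLike {
  src: string;
}

// Runs the auction and, if there is a winner, hands the opaque result to the
// frame. The page never sees which ad won.
async function showWinningAd(nav: AuctionNavigator, frame: FrameLike, config: AuctionConfig): Promise<boolean> {
  if (typeof nav.runAdAuction !== "function") return false;
  const adUrn = await nav.runAdAuction(config);
  if (adUrn === null) return false; // no winner: fall back to another ad source
  frame.src = adUrn;
  return true;
}

const exampleAuctionConfig: AuctionConfig = {
  seller: "https://ssp.example",
  decisionLogicUrl: "https://ssp.example/decision-logic.js",
  interestGroupBuyers: ["https://shoe-adtech.example"],
};
console.log(exampleAuctionConfig);
```

Returning false on a missing API or an empty auction leaves room for the publisher to show a contextual ad instead.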
Finally, the user, if all goes well, taps or clicks on the ad and now the Attribution Reporting API takes over. And again, we have a diagram showing the whole process from start to finish, which we’ll share with you after the keynote.
Now, lastly I’d like to tell you a little about the Privacy Sandbox API for ad measurement, which is Attribution Reporting. Attribution Reporting is used to measure when an ad click or an ad impression leads to a conversion. For example, when a view of an ad on a news site leads to a purchase at an online shoe store.
Now as with Topics and FLEDGE, this API is designed to avoid cross-site tracking. So the API allows two types of measurement results, event level reports and summary reports. So let me describe briefly how that works.
First let’s take a look at event level reports. So ad links can be configured with attributes that are specific to the Attribution Reporting API and this makes it possible to tally views and clicks with a request on the conversion side.
Now when a user clicks an ad or sees an ad and then converts, the browser generates a report, and in that report the advertising company or ad tech includes two pieces of data. One, any data they want about the ad click or impression and this can be very detailed for example a creative ID, information about the publisher, the timestamp and so on. And second, a small piece of data about the ad conversion.
Now, to protect user privacy, this cannot be too detailed. Later the browser sends that report, with the data I just explained, to the ad tech or the advertiser, and that includes a delay to help avoid user tracking.
The report contains two pieces of data, detailed data about the ad click or impression, the event and high level data about the conversion. So this is an event level report. Now let’s take a look at summary reports.
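The asymmetry in an event-level report, detailed on the ad side and coarse on the conversion side, can be modelled with a small sketch. This is a toy model of what the browser does, not its implementation. The report field names follow the Attribution Reporting documentation as I understand it. The limits used here (8 possible values for clicks, 2 for views) and the modulo reduction reflect my understanding of the proposal and should be treated as assumptions. The IDs and sites are made up.

```typescript
// "navigation" is a click, "event" is a view.
type SourceType = "navigation" | "event";

interface AdSource {
  sourceEventId: string; // detailed ad-side ID chosen by the adtech
  sourceType: SourceType;
  destination: string; // site where the conversion is expected
}

interface EventLevelReport {
  source_event_id: string;
  source_type: SourceType;
  attribution_destination: string;
  trigger_data: string; // coarse conversion-side data
}

// Assumed limits: 3 bits of conversion data for clicks, 1 bit for views.
function allowedTriggerValues(sourceType: SourceType): number {
  return sourceType === "navigation" ? 8 : 2;
}

// Keeps the ad-side ID intact and reduces the conversion data to the small
// range allowed for this source type.
function buildEventLevelReport(source: AdSource, triggerData: number): EventLevelReport {
  const coarse = Math.abs(Math.trunc(triggerData)) % allowedTriggerValues(source.sourceType);
  return {
    source_event_id: source.sourceEventId,
    source_type: source.sourceType,
    attribution_destination: source.destination,
    trigger_data: String(coarse),
  };
}

const clickReport = buildEventLevelReport(
  { sourceEventId: "412444888111012", sourceType: "navigation", destination: "https://shoes.example" },
  10,
);
console.log(clickReport);
```

The real browser also delays sending the report, which this sketch does not model.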
Now the browser API to generate a summary report is similar, but the results and the mechanism are a little different. So again, when a user clicks an ad or sees an ad and then later converts, the browser generates a report, and in that report the advertising company or ad tech can include any data they want about the ad click or impression and any data they want about the ad conversion, but this report is encrypted.
And this is a privacy protection because this report contains detailed data about the conversion and the impression. So the report could be used for cross-site tracking if it wasn’t encrypted. Then later, the browser will send this encrypted report again, with a small delay.
And in this way, an adtech platform will collect many reports from many users and then send all the reports to an aggregation service, as it is called. This service will decrypt all these reports, aggregate them, add some noise to protect user privacy and then return the final result. The final result is called a summary report, and it contains measurement data for many users.
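The aggregate-then-add-noise idea can be illustrated with a toy model. This is not the real aggregation service: encryption and decryption are left out entirely, the Laplace noise and its scale are assumptions made for the example, and the bucket names and values are invented.

```typescript
// One contribution inside a (decrypted) report.
interface Contribution {
  bucket: string; // e.g. a campaign or purchase category
  value: number; // e.g. purchase value
}

// Draws Laplace-distributed noise with the given scale from a uniform sample.
function laplaceNoise(scale: number, rng: () => number): number {
  // Keep the sample away from 0 and 1 so the logarithm stays finite.
  const u = Math.min(Math.max(rng(), 1e-12), 1 - 1e-12) - 0.5;
  return -scale * Math.sign(u) * Math.log(1 - 2 * Math.abs(u));
}

// Sums contributions per bucket across many reports, then adds noise to each
// total so no single user's data can be read back out of the summary.
function summarize(reports: Contribution[][], noiseScale: number, rng: () => number = Math.random): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const report of reports) {
    for (const contribution of report) {
      totals[contribution.bucket] = (totals[contribution.bucket] ?? 0) + contribution.value;
    }
  }
  const summary: Record<string, number> = {};
  for (const bucket of Object.keys(totals)) {
    summary[bucket] = (totals[bucket] ?? 0) + laplaceNoise(noiseScale, rng);
  }
  return summary;
}

// Reports from three hypothetical users.
const exampleReports: Contribution[][] = [
  [{ bucket: "campaign-1", value: 100 }],
  [{ bucket: "campaign-1", value: 50 }, { bucket: "campaign-2", value: 20 }],
  [{ bucket: "campaign-2", value: 30 }],
];
console.log(summarize(exampleReports, 10));
```

Passing the random source in as a parameter makes the sketch easy to check: with a fixed sample of 0.5 the noise is zero and the summary equals the exact totals.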
So that’s attribution measurement. I hope that makes sense. I’ll link to lots more resources to help you understand and test the Privacy Sandbox APIs in more detail. But one last thing I’d like to mention, Privacy Sandcastle.
This is a demo that combines all the main Privacy Sandbox APIs. It’s been built by our team in Tokyo. It’s still very new. But you can get the code from GitHub and run it locally, and it’s designed to help you understand how all these APIs fit together.
Before I finish, I’d just like to recap and look at the timeline for Privacy Sandbox. As you can see, we are approaching the quarter where we will begin shipping the APIs, meaning they will be available by default in Chrome Stable and ready for testing at production scale. Now that’s only a short amount of time on the calendar, and I can see I’m close to time here.
So a few things that I think you need to do right now. Firstly, understand the timelines for web and Android. Make sure you and your third party providers are prepared for the changes that are now imminent. Secondly, audit your sites to understand where they rely on third party cookies and other mechanisms that are being deprecated. We will share links to tools and instructions for how to do that after the event today.
Next, ask your third party providers, such as adtech platforms and so on how they are preparing to meet their core use cases in the absence of third party cookies or other cross-site tracking mechanisms and lastly, test the Privacy Sandbox APIs and provide feedback and ask your third party providers to do the same.
And if they’re not, well, ask them why not and let us know what the answer to that question is. So privacysandbox.com provides timelines, FAQs, more information about cross-platform efforts. I’ll share URLs after this event, but you can find a lot of the content I’ve referred to here from the Privacy Sandbox section on developer.chrome.com.
In particular, this has resources that explain how to ask us questions and to provide feedback, and you can find out more about origin trials on developer.chrome.com. We’ve also created a series of short videos and articles to help explain Chrome concepts like origin trials, Chrome flags, Blink intents, all that stuff.
So thanks for listening. That’s it from me. Like I say, if you do need support, please go to those resources or you can just direct message me, @sw12 on Twitter. Thanks so much.