There are two different fraud problems. One problem is about the website selling the ads engaging in fraud while the other is about third parties engaging in fraud.
When it comes to preventing the website selling the ads from engaging in fraud, there are various tools available that involve having trusted third parties. There’s no necessity for solving the trust problem by giving the company that buys the ads any access to consumer data.
This could involve third-party audits or, for websites that run on a cloud service, a trusted module of that cloud service that’s outside of the control of the website.
If Google and co argue that they are so scummy that no reasonable advertiser is going to trust them even if they pay third-party audits to get more trustworthy, I don’t think their European lobbyists will have much success by arguing that they are so scummy that they need to allow advertisers access to the customer data.
Auditing publisher logs helps with deliberate falsification, but it’s easy to have fraud that’s plausibly deniable and maybe even not on purpose.
Let’s say you’re a publisher and you want more people to come to your site. You look around and you find someone who says they run a newsletter and would be willing to include links to your stories for a small fee. When you multiply out the cost per visitor this looks like a pretty good deal; you say yes. This traffic turns out to be entirely bots, but you can’t tell because we got rid of ad fraud detection.
There are two different issues here:
1. Is the newsletter provider allowed to have ads that request something from the website of the advertiser so that the advertiser essentially gets a list of all people who read the newsletter?
2. For those people that click on the link, what is the advertiser to do to investigate their identities?
The first one is about trusting the person who publishes the newsletter.
The second one is about distinguishing bots from non-bots on your own website. Google reCAPTCHA is a technology that you can use for that. Cloudflare seems to do something similar.
It’s debatable whether Google reCAPTCHA itself violates the GDPR, and maybe it does, but I could easily see that technology being allowed while 1) is not.
When it comes to privacy and data located in the US, it’s worth mentioning that the US could decide to pass laws that protect the data of EU citizens the same way it does protect US citizen data. As long as the US has a legal framework that’s abusive to EU citizens it makes sense that there’s a cost to pay for that abusive legal framework. Tech companies that complain about paying that cost are free to lobby the US government to pass laws to be less abusive to EU citizens.
Sorry, I think your 1 and 2 are responding to a different scenario than I was trying to describe. Which is partly my fault for introducing something confusing in a relatively terse way. Let me take a step back and describe the scenario in more detail:
You are a publisher: you run a website like cnn.com.
You negotiate with advertisers for space on your site. Perhaps you make a deal with Coke where, based on the amount of traffic you claim to have, they are going to pay you $10,000 for some placement in the month of January.
Traffic comes to your website from a range of places: search engines, social media sharing, etc. Some of this is “paid” traffic, in that you have placed ads or otherwise compensated people for sending traffic your way, and there is also “organic” traffic, where no money changes hands. Even though you don’t directly make money from your visitors, as long as you make more money from them seeing your ads while reading articles on your site than it costs for you to bring the visitors in, you come out ahead.
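To make the break-even reasoning concrete, here is a minimal sketch of the arithmetic. The function name, the per-visitor prices, and the `bot_fraction` parameter are all made up for illustration, and the sketch assumes that views from bots earn you nothing (for example, because the advertiser's fraud detection discounts them).

```python
def paid_traffic_margin(visitors, cost_per_visitor, ad_revenue_per_visitor, bot_fraction=0.0):
    """Net result of buying traffic (hypothetical model).

    Assumes views from bots generate no ad revenue for the publisher,
    so only the human share of the purchased visitors counts as income.
    """
    cost = visitors * cost_per_visitor
    revenue = visitors * (1.0 - bot_fraction) * ad_revenue_per_visitor
    return revenue - cost


# Illustrative numbers only: 10,000 bought visitors at $0.02 each,
# each human visitor worth $0.05 in ad revenue.
all_human = paid_traffic_margin(10_000, 0.02, 0.05)
all_bots = paid_traffic_margin(10_000, 0.02, 0.05, bot_fraction=1.0)
```

With these made-up numbers the deal is profitable when the visitors are real and a pure loss when they are all bots, which is why the quality of the bought traffic matters as much as its price.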
In this scenario, you are primarily a publisher, but you are also an advertiser in that you pay for some of your traffic. My newsletter example was describing a kind of paid traffic you might buy, and what you thought you were buying was a simple “they link to my articles in the newsletter, disclosing that they’re sponsored”.
Today, the incentives in this market are reasonably balanced. If the newsletter is shady they want to inflate their numbers by including some amount of automated traffic, but if they do very much of this the automated traffic will trigger ad fraud detection, run on behalf of your advertisers (Coke, etc.). As a publisher, you really want to avoid this; it’s very painful to get dropped by networks or lose direct deal customers over fraud even if it was only negligence on your part and not intentional abuse.
In a world where there’s no fraud detection system protecting the ultimate advertiser (Coke), however, this feedback loop is broken. The newsletter can get more and more bold with their inclusion of automated traffic, and Coke doesn’t know if they’re buying real traffic or not.
Running reCAPTCHA or other bot detection technology on your website is what I’m talking about here, and is what I thought was more likely prohibited by the GDPR than not when writing the post. (Though in response to Kleber’s comment on Mastodon I’ve updated downward somewhat.)
The problem of inflated ads is currently very real for bigger players who rely on paid traffic. I’ve worked with a company that bought large quantities of it. They employed several people just to check and negotiate with the ad-publishers each month about the fraud rates, because the performance (measured by the chosen method, i.e. CPM, CPC, CPL, or CPA) was vastly different between the ad-publishers, and it didn’t make sense.
So there definitely was fraud involved, but it was extremely hard (and expensive) to weed fraudulent advertisers out.
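As a rough illustration of the kind of comparison those employees were doing by hand, here is a small sketch that flags traffic sources whose conversion rate is far below the median of all sources. The publisher names, the numbers, and the `ratio` threshold are hypothetical, and a low rate is only a reason to ask questions, not proof of fraud.

```python
from statistics import median


def flag_suspicious_sources(stats, ratio=0.25):
    """Return sources whose conversion rate is below `ratio` times the median.

    `stats` maps a source name to a (clicks, conversions) pair.
    Sources with zero clicks are skipped, since no rate can be computed.
    """
    rates = {
        name: conversions / clicks
        for name, (clicks, conversions) in stats.items()
        if clicks > 0
    }
    if not rates:
        return []
    baseline = median(rates.values())
    return sorted(name for name, rate in rates.items() if rate < ratio * baseline)


# Made-up monthly numbers for three ad-publishers: (clicks, conversions).
monthly_stats = {
    "pub_a": (10_000, 200),
    "pub_b": (8_000, 150),
    "pub_c": (12_000, 12),
}
flagged = flag_suspicious_sources(monthly_stats)
```

Here only `pub_c` is flagged, because its conversion rate is a small fraction of the other two. In practice the hard and expensive part is what comes after the flag: working out whether the gap is fraud, a bad audience fit, or a tracking problem.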
Your scenario of an email newsletter is a special case, because it’s virtually impossible to introduce any form of client-run code to check for fraud, and you can only start your fraud detection after the traffic has hit your website.
I think it is questionable whether it’s good for the ecosystem when CNN pays third parties for traffic. I would prefer it if CNN focused on publishing articles that people actually want to read so that CNN doesn’t need to rely on paid traffic.