Indeed, if you can’t tell spam from content, you may have identified the ‘correct’ definition of the quality you are trying to measure. I think one devious aspect of made-for-AdSense content is that it can’t be too informative, otherwise the visitors have no incentive to click on the ads. It has to be informative enough to get users to the page but not informative enough to satisfy them. Normal content is not usually like that. But figuring that out is like judging intent, a task difficult for humans, never mind machines. Would the true definition of quality need to catch even that type of abuse? Hmm.
My cynicism leads me to speculate that Google’s ownership of both the AdWords market and the search market means it may already have the data set it would need to notice people finding a page via search and then moving on to click on the ads because the content didn’t satisfy them.
The “metrics” from the two systems are probably very voluminous and may not be strongly bound to each other (for example, by shared per-session GUIDs, which would make things really easy), so it wouldn’t be trivial to correlate them in the necessary ways, but it doesn’t strike me as impossible. A simple estimate of the “ad bounce through” (the percentage of users who click on an ad at a site within N seconds of arriving there via search) could probably be developed and added to PageRank as a negative factor, if this is not already in the algorithm.
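To make the “ad bounce through” idea concrete, here is a minimal sketch in Python. It assumes the two logs have already been joined into one record per search arrival, which is exactly the hard part described above. The function name `ad_bounce_through`, the record layout, the site names, and the 10-second threshold are all hypothetical, chosen only for illustration; nothing here reflects how Google actually computes anything.

```python
from collections import defaultdict


def ad_bounce_through(visits, threshold_seconds=10.0):
    """Fraction of search arrivals per site that clicked an ad within the threshold.

    `visits` is an iterable of (site, seconds_to_ad_click) pairs, one per
    arrival from search. `seconds_to_ad_click` is None when the visitor
    never clicked an ad. This record layout is an assumption.
    """
    arrivals = defaultdict(int)
    bounces = defaultdict(int)
    for site, seconds_to_ad_click in visits:
        arrivals[site] += 1
        # Count the visit only if an ad click happened within N seconds.
        if seconds_to_ad_click is not None and seconds_to_ad_click <= threshold_seconds:
            bounces[site] += 1
    return {site: bounces[site] / arrivals[site] for site in arrivals}


# Made-up sample data: one site where most visitors leave via an ad almost
# immediately, and one where they mostly stay and read.
visits = [
    ("example-mfa.test", 4.0),
    ("example-mfa.test", 7.5),
    ("example-mfa.test", None),
    ("example-mfa.test", 3.0),
    ("example-blog.test", None),
    ("example-blog.test", 42.0),
    ("example-blog.test", None),
    ("example-blog.test", None),
]

rates = ad_bounce_through(visits, threshold_seconds=10.0)
for site, rate in sorted(rates.items()):
    print(f"{site}: {rate:.0%} of search arrivals clicked an ad within 10s")
```

A ratio like this says nothing by itself about intent. A legitimate page with one very relevant ad could score high too, so any cutoff for treating it as a negative signal would have to be tuned against known examples.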
However, despite access to the necessary data set, Google may not have the incentive to do this.
This is a very good thought I hadn’t considered. Thinking about it, on the one hand, I can imagine it being easy to circumvent by switching ad providers. On the other hand, this would drive many spammers to alternative ad providers, which would degrade those services, so it may be strategically good for Google. Or perhaps driving spammers and affiliate marketers onto a competitor would help that competitor achieve critical mass, something Google would like to avoid. Also, using some kind of ‘ad bounce through’ ratio may produce an unacceptably high false positive rate, again a bad outcome.
I hope this was not too much rambling. Thanks for the interesting perspective.