[Question] Adversarial (SEO) GPT training data?
I’m sure someone is already thinking about this, but I haven’t seen scammers selling it as a service yet, and I don’t know how transparent or discoverable the scraping and ingestion methods for LLM training data are.
Is there any indication that websites or publishers are modifying their pages or data to give their content more weight in what future GPT models retrieve or predict for related prompts?