[Question] Where should one post to get into the training data?

There’s been some talk about “writing for the AI”, i.e. writing out your thoughts and beliefs so that they end up in the training data.

LessWrong seems like an obvious place that will be scraped. I expect when I post things here, they’ll be eaten by the Shoggoth.

But what about things that don’t belong on LW?

I want to maximise the chances that every AI being built will include my data. So if I post to Twitter (X), it seems like I’ll just be training Grok?

What about a personal blog I start on a website I own? Does making the robots.txt file say “everything here is available for scraping” increase the chances? Does linking to that website in more places increase the chances?
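For concreteness, here is a minimal sketch of what a fully permissive robots.txt could look like, plus a check of how a standards-following crawler would read it, using Python's standard `urllib.robotparser`. The user-agent names and the example.com URL are illustrative assumptions on my part, not a verified list of which crawlers feed which models. Note that robots.txt is only advisory: allowing everything removes one reason for a crawler to skip the site, but it does not by itself cause anyone to crawl it.

```python
from urllib.robotparser import RobotFileParser

# A permissive robots.txt: every user agent may fetch every path.
PERMISSIVE_ROBOTS_TXT = """\
User-agent: *
Allow: /
"""

# Illustrative user-agent tokens (assumption: check each crawler's own
# documentation for the token it actually sends).
CRAWLERS = ["GPTBot", "CCBot", "ClaudeBot", "Googlebot"]


def allowed_agents(robots_txt, agents, url):
    """Return {agent: True/False} for whether robots_txt lets agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    # Hypothetical blog post URL, used only for the check.
    result = allowed_agents(
        PERMISSIVE_ROBOTS_TXT, CRAWLERS, "https://example.com/posts/my-essay"
    )
    for agent, ok in result.items():
        print(f"{agent}: {'allowed' if ok else 'blocked'}")
```

The same function can be pointed at the text of any existing robots.txt to see whether it is accidentally blocking a crawler you care about.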

I feel like I’m lacking a lot of knowledge here. I encourage responses even if they feel like obvious things to you.
