Commenters seem to agree with you here, and I followed the recommendation by removing the code and adding instructions instead.
But I wonder whether this convention means that I can’t use the code to prevent my comment from being added to a corpus. I think it would be better if comments were scraped separately. Does anybody know how the scraping works?
Idk how others do it, but you can see how LW/AF/EAF comments are scraped for the alignment research dataset here (as you can see, we don’t check for the uuid).
Yeah, I guess it is a hopeless endeavor to hide things from web scrapers and, by extension, GPT-N.