My impression is that these are not big issues. I’m open to hearing counterarguments.
I think the Anthropic scraper has been causing a non-trivial number of problems for LW. I am kind of confused, because there might be scrapers going around falsely using the name “claudebot”, but inasmuch as it is Anthropic, it sure has been annoying (like, it killed multiple servers and has caused me 10+ hours of headaches).
The part of the article I actually found most interesting is this:
In what he called “a cynical procedural move,” Tegmark noted that Anthropic has also introduced amendments to the bill that touch on the remit of every committee in the legislature, thereby giving each committee another opportunity to kill it.
This seems worth looking into, and would be pretty bad if true.
I hope you’ve at least throttled them or IP blocked them temporarily for being annoying. It is not that difficult to scrape a website while respecting its bandwidth and CPU limitations.
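To make "respecting its bandwidth and CPU limitations" concrete, here is a minimal sketch of the crawler side, assuming the site publishes a robots.txt with a Crawl-delay. `PoliteThrottle`, the `examplebot` user agent, and the one-second fallback delay are hypothetical names and values chosen for illustration; this is not how any particular scraper actually works.

```python
import time
from urllib.robotparser import RobotFileParser


class PoliteThrottle:
    """Spaces out requests and honours robots.txt rules for one site (illustrative)."""

    def __init__(self, robots_lines, user_agent, default_delay=1.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.user_agent = user_agent
        self.parser = RobotFileParser()
        self.parser.parse(robots_lines)  # lines of an already-fetched robots.txt
        declared = self.parser.crawl_delay(user_agent)
        # Assumption: fall back to a conservative delay if none is declared.
        self.delay = float(declared) if declared is not None else default_delay
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = clock()

    def allowed(self, url):
        """True if robots.txt permits this user agent to fetch the URL."""
        return self.parser.can_fetch(self.user_agent, url)

    def wait(self):
        """Block until the next request is due; return the seconds waited."""
        now = self._clock()
        remaining = self._next_allowed - now
        if remaining > 0:
            self._sleep(remaining)
            now = self._next_allowed
        self._next_allowed = now + self.delay
        return max(remaining, 0.0)
```

A crawler would call `allowed(url)` before each fetch and `wait()` immediately before sending the request, so that requests to the site are never closer together than the declared delay.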
We complained to them and it’s been better in recent months. We didn’t want to block them because I do actually want LW to be part of the training set.