This article makes some fine points, but also some misleading ones, and I think its thesis is wrong. Bottom line: Anthropic does lots of good things and is doing much better than being maximally selfish/ruthless. (And of course this is possible, contra the article — Anthropic is led by humans who have various beliefs which may entail that they should make tradeoffs in favor of safety. The space of AI companies is clearly not so perfectly competitive that anyone who makes tradeoffs in favor of safety goes bankrupt and becomes irrelevant.)
My impression is that these are not big issues. I’m open to hearing counterarguments. [Edit: the scraping is likely a substantial issue for many sites; see comment below. (It is not an x-safety issue, of course.)]
Here’s another tension at the heart of AI development: Companies need to hoover up reams and reams of high-quality text from books and websites in order to train their systems. But that text is created by human beings, and human beings generally do not like having their work used without their consent.
I agree this is not ideal-in-all-ways but I’m not aware of a better alternative.
Web publishers and content creators are angry. Matt Barrie, chief executive of Freelancer.com, a platform that connects freelancers with clients, said Anthropic is “the most aggressive scraper by far,” swarming the site even after being told to stop. “We had to block them because they don’t obey the rules of the internet. This is egregious scraping [that] makes the site slower for everyone operating on it and ultimately affects our revenue.”
This is surprising to me. I’m not familiar with the facts. Seems maybe bad.
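(For context: the “rules of the internet” Barrie is referring to are presumably the robots exclusion protocol. A site that wants a particular crawler gone serves a robots.txt like the sketch below; this assumes the crawler identifies itself with its published user agent (Anthropic’s is “ClaudeBot”) and actually honors the file, which is exactly what’s in dispute here.)

    # robots.txt at the site root. Only effective if the bot honors it.
    # Ask Anthropic's crawler to stay out entirely:
    User-agent: ClaudeBot
    Disallow: /

    # Ask all other compliant bots to pace themselves.
    # (Crawl-delay is non-standard; some crawlers ignore it.)
    User-agent: *
    Crawl-delay: 10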
Deals like these [investments from Amazon and Google] always come with risks. The tech giants want to see a quick return on their investments and maximize profit. To keep them happy, the AI companies may feel pressure to deploy an advanced AI model even if they’re not sure it’s safe.
Yes, there’s nonzero force to this phenomenon, but my impression is that Amazon and Google have almost no hard power over Anthropic and no guaranteed access to its models (unlike, e.g., OpenAI, which may have to share its models with Microsoft even if it thinks a model is unsafe). And I’m not aware of a better alternative.
[Edit: mostly I just think this stuff is not-what-you-should-focus-on if evaluating Anthropic on safety — there are much bigger questions.]
There are some things Anthropic should actually do better. There are some ways it’s kinda impure, like training on the internet and taking investments. Being kinda impure is unavoidable if you want to be a frontier AI company. Insofar as Anthropic is much better on safety than other frontier AI companies, I’m glad it exists.
[Edit: I’m slightly annoyed that the piece feels one-sided — it’s not trying to figure out whether Anthropic makes tradeoffs for safety or how it compares to other frontier AI companies, instead it’s collecting things that sound bad. Maybe this is fine since the article’s role is to contribute facts to the discourse, not be the final word.]
I think the Anthropic scraper has been causing a non-trivial amount of problems for LW. I am kind of confused, because there might be scrapers going around that falsely use the name “claudebot”, but insofar as it is Anthropic, it sure has been annoying (it’s killed multiple servers and caused me 10+ hours of headaches).
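(If you want to check whether the offending traffic is really Anthropic, one generic technique is a forward-confirmed reverse DNS lookup on the IPs claiming to be claudebot. This is only a sketch: it assumes the crawler’s IPs have meaningful, verifiable rDNS the way Googlebot’s do, and I haven’t verified that Anthropic’s do.)

    import socket

    def forward_confirmed_rdns(ip: str):
        """Reverse-resolve an IP, then forward-resolve the hostname.
        Returns the hostname only if it round-trips to the same IP."""
        try:
            hostname, _, _ = socket.gethostbyaddr(ip)           # reverse lookup
            forward_ips = socket.gethostbyname_ex(hostname)[2]  # forward lookup
        except OSError:  # covers herror/gaierror on unresolvable addresses
            return None
        return hostname if ip in forward_ips else None

    # Usage: pull IPs from access logs whose user agent contains "claudebot",
    # then eyeball whether the confirmed hostnames plausibly belong to Anthropic.
    # 203.0.113.7 is a documentation-range placeholder, not a real crawler IP.
    print(forward_confirmed_rdns("203.0.113.7"))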
The part of the article I actually found most interesting is this:
In what he called “a cynical procedural move,” Tegmark noted that Anthropic has also introduced amendments to the bill that touch on the remit of every committee in the legislature, thereby giving each committee another opportunity to kill it.
This seems worth looking into and would be pretty bad.
I hope you’ve at least throttled them or temporarily IP-blocked them for being annoying. It is not that difficult to scrape a website while respecting its bandwidth and CPU limits.
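To illustrate, here’s a minimal sketch of a polite scraper, using only the Python standard library plus requests: it checks robots.txt, identifies itself, honors the site’s declared Crawl-delay, and paces its requests. The user agent and delay values are placeholders, not anything Anthropic actually uses.

    import time
    import urllib.robotparser
    from urllib.parse import urlparse

    import requests

    USER_AGENT = "ExampleBot/1.0 (admin@example.com)"  # placeholder identity
    MIN_DELAY_SECONDS = 10  # placeholder; err on the generous side

    def polite_fetch(urls):
        """Fetch URLs from a single site, honoring robots.txt and pacing requests."""
        root = urlparse(urls[0])
        robots = urllib.robotparser.RobotFileParser()
        robots.set_url(f"{root.scheme}://{root.netloc}/robots.txt")
        robots.read()

        # Prefer the site's own declared Crawl-delay if it has one.
        delay = robots.crawl_delay(USER_AGENT) or MIN_DELAY_SECONDS

        for url in urls:
            if not robots.can_fetch(USER_AGENT, url):
                continue  # the site said no; respect that
            yield requests.get(url, headers={"User-Agent": USER_AGENT}, timeout=30)
            time.sleep(delay)  # never hammer the server

    # Usage:
    # for resp in polite_fetch(["https://example.com/a", "https://example.com/b"]):
    #     process(resp.text)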
We complained to them and it’s been better in recent months. We didn’t want to block them because I do actually want LW to be part of the training set.
(Meta: Me posting the article is not an endorsement of the article as a whole. I agree with Zach that lots of sections of it don’t seem fair/balanced and don’t seem to be critical from an extreme risk perspective.
I think the bullet points I listed above summarize the parts that I think are important/relevant.)
Yep, Anthropic’s policy advocacy seems bad.
+1 to lots of this.