Nuclear weapons have different game theory. If your adversary has one, you want one too, so you aren't wiped out; but once both of you have nukes, neither of you wants to use them.

Also, people were not aware of the real close calls until much later.

With AI, there are economic incentives to push capabilities further than other labs; but in doing so, you risk everyone's lives for money and fuel a race to the bottom in which everyone's lives are lost.
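To make the deterrence claim concrete, here's a minimal sketch with made-up payoff numbers (the values and the hold/launch framing are illustrative assumptions, not anything from this discussion): it enumerates the Nash equilibria of a symmetric two-player hold-or-launch game, in which mutual restraint comes out as the only stable outcome.

```python
# Toy sketch of the "once both have nukes, neither wants to use them" point.
# All payoff values are made-up, illustrative assumptions. Both sides already
# have the bomb and simultaneously choose "hold" or "launch"; the game is
# symmetric, and payoffs are indexed as (my move, their move).

PAYOFF = {
    ("hold",   "hold"):   -1,    # tense standoff, but everyone survives
    ("hold",   "launch"): -100,  # you're destroyed
    ("launch", "hold"):   -50,   # your first strike invites retaliation
    ("launch", "launch"): -101,  # mutual destruction, and you launched too
}
MOVES = ("hold", "launch")


def is_nash(mine: str, theirs: str) -> bool:
    """True if neither player gains by unilaterally deviating from this profile."""
    my_best = all(PAYOFF[(mine, theirs)] >= PAYOFF[(dev, theirs)] for dev in MOVES)
    their_best = all(PAYOFF[(theirs, mine)] >= PAYOFF[(dev, mine)] for dev in MOVES)
    return my_best and their_best


if __name__ == "__main__":
    equilibria = [(m, t) for m in MOVES for t in MOVES if is_nash(m, t)]
    print(equilibria)  # [('hold', 'hold')] -- mutual restraint is the only stable outcome
```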
I think you (or @Adam Scholl) need to argue why people wouldn't be angry at you for developing nuclear weapons, in a way that doesn't sound like "yes, what I built could have killed you, but it has an even higher chance of saving you!"

Otherwise, it's hard to criticize Anthropic for working on AI capabilities without considering whether their work is a net positive. It's hard to dismiss the net-positive arguments as an "idiosyncratic utilitarian BOTEC" when you accept "net positive" arguments for nuclear weapons.
Allegedly, people at Anthropic have compared themselves to Robert Oppenheimer. Maybe they know that one could argue they have blood on their hands, the same way one can argue that about Oppenheimer. But people aren’t “rioting” against Oppenheimer.
I feel it's more useful to debate whether Anthropic's work is a net positive, since that at least has a small chance of convincing Anthropic or its employees.

My argument isn't "nuclear weapons have a higher chance of saving you than killing you". People didn't know about Oppenheimer back when rioting against him could have helped. And they didn't watch The Day After until decades later. Nuclear weapons were built to not be used.
With AI, companies aren't building their nukes to not use them; they build larger and larger weapons, because if your latest nuclear explosion is the largest so far, the universe awards you with gold. The first explosion past some unknown threshold will ignite the atmosphere and kill everyone, but some hope that it will instead just award them with infinite gold.
Anthropic could've been a force for good. It's very easy, really: lobby for regulation instead of against it, so that no one uses the kind of nukes that might kill everyone.
In a world where Anthropic actually tries to be net-positive, they don’t lobby against regulation and instead try to increase the chance of a moratorium on generally smarter-than-human AI systems until alignment is solved.
We're not in that world, so I don't think it makes as much sense to talk about Anthropic's chances of aligning ASI on the first try.

(If regulation solves the problem, it doesn't matter how much it damaged your business interests, even if that reduced how much alignment research you were able to do. If you really care first and foremost about getting to aligned AGI, then regulation doesn't make the problem worse. If you're lobbying against it, you need a better justification than the entirely unrelated "if I get to the nuclear banana first, we're more likely to survive.")
Hi,
I've just read this post, and the arguments Anthropic made about how the US needs to stay ahead of China are disturbing.

I hadn't caught up on this news, and I think I now see where the anti-Anthropic sentiment is coming from.

I do think Anthropic only made those arguments in the context of GPU export controls, trying to convince the Trump administration to at least implement export controls if nothing else. It's still very concerning, and it could undermine their ability to argue for strong regulation in the future.
That said, I don’t agree with the nuclear weapon explanation.
Suppose Alice and Bob were each building a bomb. Alice’s bomb has a 10% chance of exploding and killing everyone, and a 90% chance of exploding into rainbows and lollipops and curing cancer. Bob’s bomb has a 10% chance of exploding and killing everyone, and a 90% chance of “never being used” and having a bunch of good effects via “game theory.”
I don't think people with ordinary moral views would be very angry at Alice yet forgive Bob just because "Bob's bomb was built to not be used."
(Dario’s post did not impact the sentiment of my shortform post.)
I don't believe that, from the point of view of the US government, the nuclear bomb was truly built not to be used. I think that was just a lie to manipulate scientists who might otherwise have been unwilling to help.

I don't think any of the AI builders are anywhere close to "building AI not to be used". This is even clearer than in the nuclear case, since AI has obvious, economically valuable peacetime uses.
Regulation does make things worse if you believe it will fail to work as intended for one reason or another. For example, I've argued that putting compute limits on training runs (temporarily or permanently) would hasten progress toward AGI by focusing research effort on efficiency and algorithmic improvements.