[This comment is just about the “notkilleverybodyism pitch” section.]
we doomers are also unhappy about AI killing all humans because we are human and we don’t want to get murdered by AIs.
I’d like to emphasize once again that the arguments for misaligned AIs killing literally all humans (if they succeed in takeover) are quite weak, and that literally all humans dying conditional on AI takeover is probably unlikely (<50%).
(To be clear, I think there is a substantial chance of at least 1 billion people dying.)
This is due to:
The potential for the AI to be at least a tiny bit “kind” (just as humans probably wouldn’t kill all aliens). [1]
Decision theory/trade reasons
This is discussed in more detail here and here. (There is also some discussion here.)
I don’t think anyone is being disingenuous here.
FWIW, I’m not totally sure here. It kind of feels to me like Soares and Eliezer are being somewhat disingenuous (or at least have been disingenuous historically). In particular, they often talk about literally all humans dying, or >50% of humans dying, while also admitting (if you press them) that it’s plausible AIs won’t kill everyone, due to the arguments I gave above.
(Here Eliezer says: “I sometimes mention the possibility of being stored and sold to aliens a billion years later, which seems to me to validly incorporate most all the hopes and fears and uncertainties that should properly be involved, without getting into any weirdness that I don’t expect Earthlings to think about validly.” and I’ve heard similar things from Soares. Here Soares notes: “I’m somewhat persuaded by the claim that failing to mention even the possibility of having your brainstate stored, and then run-and-warped by an AI or aliens or whatever later, or run in an alien zoo later, is potentially misleading.” (That said, I think it doesn’t cost that much more to just keep humans physically alive, so this should also be totally plausible IMO.))
Minimally, it seems like Soares just didn’t think that much about the question “But will the AI kill everyone? Exactly how will this go down?” prior to pushing the point “the AI will kill everyone” quite hard. This is a bit concerning because this isn’t at all a crux for Soares and Eliezer (who are longtermists) but could potentially be a crux for other people. (Whether it is a crux for others depends on the risk of death: some people think that human survival and good living conditions are very likely, while also thinking that in the long run full AI control (by AIs which aren’t corrigible or carefully appointed successors) is likely. (To be clear, I don’t particularly think this: an involuntary transition to AI control seems plausible and reasonably likely to be violent.))
It feels to me like it is very linguistically and narratively convenient to say “kill literally everyone”, and some accuracy is being abandoned in pursuit of this convenience.

[1] This includes the potential for the AI to generally have preferences that are morally valuable from a typical human perspective.
Thanks!

I changed “we doomers are unhappy about AI killing all humans” to “we doomers are unhappy about the possibility of AI killing all humans” for clarity.
If I understand you correctly:
You’re OK with “notkilleveryoneism is the problem we’re working on”
You’re at least willing to engage with claims like “there’s >>90% chance of x-risk” / “there’s >>90% chance of AI takeover” / “there’s >>90% chance of AI extinction or permanent human disempowerment” / etc., even if you disagree with those claims [I disagree with those claims too—“>>90%” is too high for me]
…But here you’re strongly disagreeing with people tying those two things together into “It’s important to work on the notkilleveryoneism problem, because the way things are going, there’s >>90% chance that this problem will happen”
If so, that seems fair enough. For my part, I don’t think I’ve said the third-bullet-point-type thing, but maybe I have; anyway, I’ll try to be careful not to do that in the future.
Hmm, I’d say my disagreements with the post are:

I think people should generally be careful about using the language “kill literally everyone” or “notkilleverybodyism” insofar as they aren’t confident that misaligned AI would kill literally everyone. (Or haven’t considered counterarguments to this.)
I’m not sure I agree with “I don’t think anyone is being disingenuous here.”
But here you’re strongly disagreeing with people tying those two things together into “It’s important to work on the notkilleveryoneism problem, because the way things are going, there’s >>90% chance that this problem will happen”
I don’t object to people saying “there is a >>90% chance that AIs will kill literally every person” or “conditional on AI takeover, I think killing literally every person is likely”. I just want people to really think about what they are saying here and at least seriously consider the counterarguments prior to saying this.
Currently, it seems to me like people do actually seriously consider counterarguments to AI takeover, but will just say things like “AI will kill literally everyone” without considering counterarguments to that claim. (Or without seriously meaning it, which also seems unfortunate.)
My core issue is that it seems misleading by default to say “notkilleverybodyism” if you think that killing literally everyone is a non-central outcome of misaligned AI takeover.
This is similar to how it would be misleading to say “I work on Putin not-kill-everybody-in-US-ism, in which I try to prevent Putin from killing everyone in the US.” A reasonable interlocutor might say “OK, but do you expect Putin to kill literally everyone in the US?” And the reasonable response here would be “No, I don’t expect this, though it is possible if Putin took over the world. Really, I mostly just work on preventing Putin from acquiring more power, because I think Putin having more power could lead to catastrophic conflict (perhaps killing >10 million people, though probably not killing literally everyone) and to bad people having power long term.” I think AI not-kill-everybody-ism is misleading in the same way as “Putin not-kill-everybody-ism”.
(Edit: I’m not claiming that the Putin concern is structurally analogous to the AI concern, just that there is a related communication problem.)

(Edit: amusingly, this comms objection is surprisingly relevant today.)
I’m not sure I agree with “I don’t think anyone is being disingenuous here.”
Yeah I added a parenthetical to that, linking to your comment above.
I think people should generally be careful about using the language “kill literally everyone” or “notkilleverybodyism” [sic] insofar as they aren’t confident that misaligned AI would kill literally everyone. (Or haven’t considered counterarguments to this.)
I don’t personally use the term “notkilleveryoneism”. I do talk about “extinction risk” sometimes. Your point is well taken that I should be considering whether my estimate of extinction risk is significantly lower than my estimate of x-risk / takeover risk / permanent disempowerment risk / whatever.
I quickly searched my writing and couldn’t immediately find anything that I wanted to change. It seems that when I use the magic word “extinction”, as opposed to “x-risk”, I’m almost always saying something pretty vague, like “there is a serious extinction risk and we should work to reduce it”, rather than giving a numerical probability.

Seems reasonable, sorry about picking on you in particular for no good reason.