In response to the question,
“[Y]ou’ve gestured at nuclear risk. … How many people are allowed to die to prevent AGI?”,
he wrote:
“There should be enough survivors on Earth in close contact to form a viable reproductive population, with room to spare, and they should have a sustainable food supply. So long as that’s true, there’s still a chance of reaching the stars someday.”
He later deleted that tweet because he worried it would be interpreted by some as advocating a nuclear first strike.
I’ve seen no evidence that he is advocating a nuclear first strike, but it does seem to me to be a fair reading of that tweet that he would trade nuclear devastation for preventing AGI.
Most nuclear powers are willing to trade nuclear devastation for preventing the other side’s victory. If you went by sheer number of surviving humans, your best reaction to seeing the ICBMs fly towards you would be to cross your arms, make your peace, and let them hit without lifting a finger: less chance of a nuclear winter and extinction that way. But deterrence prevents that from happening precisely through a pre-commitment to actually blow it all up if the other side ever tries something. That is hardly less insane than what EY suggests, but it makes a kind of sense in context (though from a God’s-eye view on humanity it is still insane; it is merely the best way we have found to solve our particular coordination problem).
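The ex-ante logic of that pre-commitment can be sketched with a toy payoff comparison. All numbers below are invented placeholders, not estimates of anything; the point is only the structure: a credible threat of retaliation removes the incentive to strike, even though carrying out the retaliation would be the worst outcome for everyone after the fact.

```python
# Toy deterrence sketch. Payoffs are the attacker's utility, with
# entirely made-up numbers chosen only to illustrate the mechanism.
no_retaliation = {"strike": 1.2, "hold": 1.0}    # striking first wins the rivalry
with_retaliation = {"strike": 0.2, "hold": 1.0}  # striking first invites ruin

def attacker_strikes(payoffs):
    """A rational attacker strikes iff striking beats holding back."""
    return payoffs["strike"] > payoffs["hold"]

print(attacker_strikes(no_retaliation))    # True: absent deterrence, striking pays
print(attacker_strikes(with_retaliation))  # False: the pre-commitment deters
```

The "insane" part is exactly that the pre-commitment only works if the retaliation would really be carried out, even though by then it no longer helps anyone.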
There’s a big difference between pre-committing to X so you have a credible threat against Y, vs. just outright preferring X over Y. In the quoted comment, Eliezer seems to have been doing the latter.
“Most humans die in a nuclear war, but human extinction doesn’t happen” is presumably preferable to “all biological life on Earth is eaten by nanotechnology made by an unaligned AI that has worthless goals”. It should go without saying that both are absolutely terrible outcomes, but one actually is significantly more terrible than the other.
Note that this is literally one of the examples in the OP—discussion of axiology in philosophy.
Right, but of course the absolute, certain implication from “AGI is created” to “all biological life on Earth is eaten by nanotechnology made by an unaligned AI that has worthless goals” requires justification, and the justification for that level of certainty is missing.
In general, such confident predictions about the technological future have a poor historical track record. There are multiple holes in the Eliezer/MIRI story, and there is no formal, canonical write-up of why they are so confident in their apparently secret knowledge. There is a lot of informal, non-canonical, non-technical material (List of Lethalities, the security-mindset posts, etc.) gesturing at the ideas, but it leaves too many holes and open objections to support their claimed level of confidence; MIRI has published nothing formal since 2021, and very little since 2017.
We need more than that if we’re going to confidently prefer nuclear devastation over AGI.
The trade-off you’re gesturing at is really risk of AGI vs. risk of nuclear devastation. So you don’t need absolute certainty on either side in order to be willing to make it.
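Framed as risk vs. risk, the comparison is an expected-value one, and it can go through without certainty on either side. A minimal sketch, with every probability and value invented purely to show the shape of the argument (none of these numbers come from the discussion):

```python
# All numbers are made-up placeholders, not anyone's actual estimates.
p_doom_given_agi = 0.5   # assumed P(extinction | AGI is built)
p_devastation = 0.3      # assumed P(nuclear devastation | enforcement policy)

v_extinction = 0.0       # no survivors, no future
v_devastation = 0.1      # most die, but recovery remains possible
v_status_quo = 1.0

ev_allow_agi = p_doom_given_agi * v_extinction + (1 - p_doom_given_agi) * v_status_quo
ev_enforce_ban = p_devastation * v_devastation + (1 - p_devastation) * v_status_quo

# Under these invented numbers the ban comes out ahead (0.73 vs 0.5);
# change the probabilities and the conclusion can flip with them.
print(ev_allow_agi, ev_enforce_ban)
```

The whole disagreement in this thread is, in effect, about which probabilities belong in a comparison like this, not about the arithmetic itself.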
Did you intend to say risk off, or risk of?
If the former, then I don’t understand your comment and maybe a rewording would help me.
If the latter, then I’ll just reiterate that I’m referring to Eliezer’s explicitly stated willingness to trade off the actuality of nuclear devastation (not just some risk of it) to prevent the creation of AGI (though again, to be clear, I am not claiming he advocated a nuclear first strike). The only real uncertainty in that trade-off is the consequences of AGI (though Eliezer has been clear that he thinks it means certain doom), and, I suppose, what follows after nuclear devastation.
And how credible would your pre-commitment be if you made it clear that you actually prefer Y, and you’re just saying you’d do X for game-theoretic reasons, and you’d do it, swear? These are the murky cognitive waters in which, sadly, your beliefs (or at least your performance of them) affect the outcome.
One’s credibility would be less of course, but Eliezer is not the one who would be implementing the hypothetical policy (that would be various governments), so it’s not his credibility that’s relevant here.
I don’t have much sense he’s holding back his real views on the matter.
But on the object level, if you do think that AGI means certain extinction, then that is indeed the right call. (Consider also that a single strike on a data centre might carry a risk of nuclear war, but that doesn’t make nuclear war a certainty. If one listened to Putin’s barking, every bit of help given to Ukraine is a risk of nuclear war; in practice Russia just swallows it and lets it go, because no one is actually eager to push that button, and they still have far too much to lose from it.)
The scenario in which Eliezer’s approach is simply wrong is the one where he is vastly overestimating the risk of an AGI extinction event or takeover. That might be the case now, or might become so in the future (imagine, for example, a society that still enforces the taboo out of habit even though alignment has advanced enough to make friendly AI feasible). His position isn’t perfect and isn’t necessarily always true, but it isn’t particularly scandalous either. Plenty of hawkish pundits during the Cold War said that nuclear annihilation would have been preferable to the worldwide victory of Communism, and that is a substantially more nonsensical view.
I agree that if you’re absolutely certain AGI means the death of everything, then nuclear devastation is preferable.
I think the absolute certainty that AGI does mean the death of everything is extremely far from called for, and is itself a bit scandalous.
(As to whether Eliezer’s policy proposal is likely to lead to nuclear devastation, my bottom-line view is that the proposal is too vague for me to have an opinion. But I think he should have consulted actual AI policy experts and developed a detailed proposal with them, which he could then point to, before writing up an emotional appeal, with vague references to air strikes and nuclear conflict, for millions of lay people to read in TIME Magazine.)
I think absolute certainty in general terms would not be warranted; absolute certainty conditional on AGI being developed recklessly is more reasonable. Compare someone researching smallpox in a BSL-4 lab with someone juggling smallpox vials in a crowded town square, and ask what probability each scenario makes you assign to an imminent smallpox pandemic. I still don’t think AGI would necessarily mean doom, simply because I don’t fully buy that its ability to scale up to ASI is guaranteed.
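The smallpox comparison is essentially just conditioning: the same pathogen warrants very different pandemic forecasts depending on how it is handled. A sketch with invented numbers (none of them real estimates):

```python
# Invented numbers, purely to illustrate conditioning on recklessness.
p_release = {"bsl4_lab": 0.001, "square_juggling": 0.5}  # P(release | handling regime)
p_pandemic_given_release = 0.8

p_pandemic = {regime: p * p_pandemic_given_release for regime, p in p_release.items()}
print(p_pandemic)  # the reckless regime is hundreds of times more alarming
```

The hypothesis ("smallpox exists here") is the same in both cases; only the handling regime moves the probability, which is the point of the analogy to reckless AGI development.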
However, I also think that in practice this would matter little, because states might still see even regular AGI as a major threat. Having effectively unlimited cognitive labour is such a broken tactic that it basically makes you Ruler of the World by default if you hold a monopoly on it. That alone might make it a source of tension.
We don’t know with confidence how hard alignment is, and whether something roughly like the current trajectory (even if reckless) leads to certain death if it reaches superintelligence.
There is a wide range of opinion on this subject from smart, well-informed people who have devoted themselves to studying it. We have a lot of blog posts and a small number of technical papers, all usually making important (and sometimes implicit and unexamined) theoretical assumptions which we don’t know are true, plus some empirical analysis of much weaker systems.
We do not have an established, well-tested scientific theory like we do with pathogens such as smallpox. We cannot say with confidence what is going to happen.
Yeah, at the very least his proposal amounts to calling for billions dead across the world, because once you take seriously what enforcing it would require, that is the only realistic outcome.
I don’t agree billions dead is the only realistic outcome of his proposal. Plausibly it could just result in actually stopping large training runs. But I think he’s too willing to risk billions dead to achieve that.