[Note to reader: I wrote this comment in response to an earlier and much shorter version of the parent comment.]
Thanks! I’m glad you found it interesting to read.
I considered that counterfactual, but didn’t think it was a good argument. I think there’s a world of difference between a team that has a mechanistic story for how they can prevent the doomsday device from killing everyone, and a team that is merely “looking for such a way” and “occasionally finding little nuggets of insight”. The latter team is still contributing its efforts, goodwill, and endorsement to a project set to end the world in exchange for riches and glory.
I think better consequences will obtain if humans follow the rule of never using safety as the reason for contributing to such an extinction/takeover-causing project unless the bar of “has a mechanistic plan for preventing the project from causing an extinction event” is met.
And, again, I think it’s important in that case to have personal eject-buttons, such as some friends who check in with you to ask things like “Do you still believe in this plan?” and if they think there’s not a good reason, you’ve agreed that any two of them can fire you.
(As a small note of nuance, barring Fermi and one or two others, almost nobody involved had thought of a plausible mechanism by which the atomic bomb would be an extinction risk, so I think there’s less reason to keep to that rule in that case.)
I disagree that the point about the scientists not realizing nuclear weapons would likely become an existential risk changes what I see as the correct choice to make. Suppose I were a scientist in the USA with the choice either to help the US government build nuclear weapons, and thus set the world up for a tense, potentially existential détente with the US’s enemies (Nazis and/or communists and/or others), or… not help. It still seems clearly correct to me to help, since I think a dangerous détente is a better option than only the Nazis or only Stalin having nuclear weapons.
In the current context, I do think there is important strategic game-theoretic overlap with those times, since it seems likely that AI (whether AGI or not) will potentially disrupt the long-standing nuclear détente in the next few years. I expect that whichever government controls the strongest AI five years from now, if not sooner, will also be nearly immune to long-range missile attacks, conventional military threats, and bioweapons, but able to deploy those things (or a wide range of other coercive technologies) at will against other nations.
Point of clarification: I didn’t mean that there should be a rule against helping one’s country race to develop nukes. The argument I’m making is that humans should have a rule against helping one’s country race to develop nukes that one expects by default to (say) ignite the atmosphere and kill everyone, and for which there is no known countermeasure.
Ah, yes. Well that certainly makes sense! Thanks for the clarification.
Sorry for editing and extending my comment not knowing that you were already in the process of writing a response to the original!
Thanks! A simple accident, I forgive you :-)