What are our outs to play to?
This is an unfiltered, unsorted collection of plausible (though not necessarily likely) scenarios in which humans are still alive in 200 years. I have left out “We solve alignment” since it is adequately discussed elsewhere on the website. These non-alignment scenarios make up a larger and larger share of the human utility pie as MIRI adds nines to the odds of an AGI takeoff being unaligned.
I’ve collected the conditions for these outcomes into two categories:
Epistemic conditions, where the scenario depends on some fact that is either true or false about the world but which we don’t yet know, and
Future events, where the scenario depends on things that haven’t happened yet.
We can influence future events. Epistemic conditions, by contrast, we can learn about but cannot influence: they are either true or they are not. However, if we ‘play to our outs,’ we still need to estimate the odds that everything turns out OK for reasons we had no influence over, in order to judge how much harm an organization playing to those outs should be willing to do.
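As a toy illustration of that trade-off, here is a minimal expected-value sketch. Every number and variable name in it is a hypothetical placeholder of mine, not an estimate from this post; the only point is that the harm an organization should tolerate shrinks as the odds of an uninfluenceable out grow.

```python
# Toy expected-value comparison for "playing to our outs".
# All probabilities and values below are hypothetical placeholders.

p_out = 0.03           # P(an out we cannot influence saves us regardless)
p_success = 0.50       # P(a drastic "play to the outs" action actually works)
value_if_saved = 1.0   # value of a future with surviving humans (arbitrary units)
harm_of_acting = 0.20  # harm done by the drastic action (same units)

# Doing nothing: we survive only via the uninfluenceable out.
ev_do_nothing = p_out * value_if_saved

# Acting: we pay the harm up front, and survive if either the action works
# or the uninfluenceable out comes through anyway.
p_saved_if_acting = 1 - (1 - p_out) * (1 - p_success)
ev_acting = p_saved_if_acting * value_if_saved - harm_of_acting

print(f"EV(do nothing) = {ev_do_nothing:.3f}")
print(f"EV(act)        = {ev_acting:.3f}")
```

With these made-up numbers acting wins, but if p_out were high enough, doing nothing would beat acting even for modest harms; that is the sense in which the estimate matters.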
Nuclear war:
Epistemic conditions: AGI research requires a significant number of GPU-years; it cannot be achieved by an extremely clever and lucky programmer with a generator and a 3090.
Future events: A significant fraction of all nuclear warheads are launched, and all chip fabs and supercomputers are hit. Humanity loses the ability to build computers at a 2022 level of performance, and cannot regain it because fossil fuels are depleted.
This scenario is not palatable. Intellectually, after reading Eliezer Yudkowsky’s “Death With Dignity”, this seems like by far the most likely scenario in which humanity recognizably survives. Emotionally I disagree with that conclusion, and I suspect that emotion is encoding some important heuristics.
Spontaneous morality:
Epistemic conditions: Inner goals are sampled randomly when an AGI takes off, and there’s a nonzero chance that “intrinsically wants to keep some number of humans around and happy” makes the cut.
Future events: Conceivably, creating a large number of AGIs at once increases the odds that at least one of them wants pets (the arithmetic is sketched below). In particular, we suspect that humans got morality from living in tribes, so AGIs that take off among many peers may be more likely to develop something analogous.
This scenario isn’t necessarily likely. However, it’s worth noting that in our N=1 example of an intelligence takeoff, humans spontaneously developed the inner goal of keeping cats, even though this is not needed for our (now discarded) outer goal of familial reproductive fitness.
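The “large number of AGIs” intuition is just the arithmetic of repeated draws. A minimal sketch, assuming each takeoff samples its inner goals independently with the same tiny per-AGI probability p of landing on “wants pets” (both the value of p and the independence assumption are made up for illustration):

```python
# P(at least one of n AGIs intrinsically wants to keep humans around),
# assuming independent draws with per-AGI probability p.
# Both p and the independence assumption are hypothetical.
p = 1e-4
for n in (1, 10, 1_000, 100_000):
    p_at_least_one = 1 - (1 - p) ** n
    print(f"n = {n:>7}: P(at least one) = {p_at_least_one:.4f}")
```

If the draws are correlated rather than independent, the benefit of many simultaneous takeoffs shrinks accordingly.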
AGI Impossibility
Epistemic conditions: Creating AGI is impossible for humans.
Future events: None (other than the implicit ‘We don’t go extinct through some other means’ condition of all these scenarios).
This scenario is unlikely, and we can’t influence its likelihood.
Human intelligence takeoff
Epistemic conditions: Human intelligence can be meaningfully augmented; modified neurons, or neurons plus silicon, are a fundamentally more efficient substrate for general-intelligence computation than silicon alone.
Future events: Research is done to augment human intelligence, or multiple AIs take off at once.
I’d put this among the more plausible sets of scenarios. In particular, if it turns out that neurons are a better substrate than silicon, then:
Instrumental AI might be achieved shortly before general AI; a human lab could use it to modify and improve its own scientists faster than rivals could improve their instrumental AI into AGI, and then execute a pivotal act itself.
In a multiple-takeoff scenario, the AI that builds a lab full of grotesquely modified human consultants might win against silicon AIs that try to improve themselves directly. Of course, this AI would be forced to solve a hilarious reverse-alignment problem.
However, if neurons are not better than silicon, then this scenario is implausible, unless smarter humans gain the capacity to solve alignment, or gain the capacity to coordinate on not producing AGI faster than they gain the capacity to produce it. In my opinion, the likelihood of this scenario depends much more on epistemic conditions than on future events: the Nash equilibrium will be whichever substrate is most powerful, and knowledge of how to make neurons is unlikely to be lost.
Did I miss any?