Another (very weird) counterpoint: you might not see the “swarm coming” because the annexing of our cosmic endowment might look way stranger than the best strategy human minds can come up with.
I remember a safety researcher once mentioning to me that they didn’t necessarily expect us to be killed, just contained, while a superintelligence takes over the universe. The argument was that it might want to preserve its history (i.e. us) to study it, instead of permanently destroying it. This is basically as bad as killing everyone too, because we’d still be imprisoned away from our largest possible impact. Similar to the serious component in “global poverty is just a rounding error”.
Now, if you add that our “imprisonment” might be made barely comfortable (which is quite unlikely, but maybe plausible in some almost-aligned-but-ultimately-terrible uncanny value scenarios), then it’s possible that there’s never a discontinuous horror that we would see striking us; instead we would quietly be blocked from our cosmic endowment without ever knowing it. Things would seem to be going essentially on track. But we never quite seem to get to the future we’ve been waiting for.
It would be a steganographic takeoff.
Here’s an (only slightly) more fleshed-out argument:
If
deception is something that “blocks induction on it” (e.g. you can’t run a million tests on deceptive optimizers and expect the pattern on those tests to continue), and if
all our “deductions” are really just assertions of induction at higher levels of abstraction (e.g. asserting that Logic will continue to hold),
then deception could look “steganographic” when it’s done at really meta levels, exploiting our more basic metaphysical mistakes.
Interesting stuff. And I agree: once you have a nanosystem or something of equivalent power, humans are no longer any threat. But we don’t yet know whether such a thing is physically possible. I know many here think so, but I still have my doubts.
Maybe it’s even more likely that some random narrow-AI failure will start big wars before anything fancier happens. Although with the scaling hypothesis in sight, AGI could indeed come suddenly.
“This is basically as bad as also killing everyone, because we’d still be imprisoned away from our largest possible impact.”
I quite disagree with this, though. I’m not a huge supporter of our largest possible impact. I guess it’s naive to attribute any net positive expectation to it when you look at history or at the present. In fact, such an outcome (things staying exactly the same forever) would probably be among the most positive ones in the event of non-aligned AI. As long as we could still take care of Earth, like ending factory farming and dictatorships, it really wouldn’t be that bad...