Why is the assumption of a unilateral AI unlikely? That’s a very important crux, big if true
This is a crux for me as well. I’ve seen a lot of stuff that assumes that the future looks like a single coherent entity which controls the light cone, but all of the arguments for the “single” part of that description seem to rely on the idea of an intelligence explosion (that is, that there exists some level of intelligence such that the first entity to reach that level will be able to improve its own speed and capability repeatedly such that it ends up much more capable than everything else combined in a very short period of time).
My impression is that the argument is something like the following:

1. John von Neumann was a real person who existed and ran on largely standard human hardware, meaning a brain which consumed somewhere in the ballpark of 20 watts.
2. If you can figure out how to run something as smart as von Neumann on 20 watts of power, you can run something like “a society of a million von Neumanns” for something on the order of $1,000/hour, which gives a lower bound on how much intelligence you can get out of a given amount of power (back-of-envelope arithmetic after this list).
3. The first AI that is able to significantly optimize its own operation will then be able to use its augmented intelligence to rapidly optimize itself further, until it hits the bounds of what’s possible, and (2) already establishes that “the bounds of what’s possible” far exceed what we think of as “normal” in human terms.
4. The cost to the AI of significantly improving its own intelligence will be orders of magnitude lower than the initial cost of training an AI of that level of intelligence from scratch (so with modern-day architectures, the loop looks more like “the AI inspects its own weights, figures out what it’s doing, and writes out a much more efficient implementation which does the same thing” and less like “the AI figures out a new architecture or better hyperparameters that make loss decrease 10% faster, trains up a new version of itself using that knowledge, and that new version does the same thing”); the toy sketch after this list illustrates why (3) and (4) together imply a fast takeoff.
5. An intelligence that self-amplifies like this will behave like a single coherent agent, rather than like a bunch of competing agents trying things and copying each other’s successful innovations.
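To spell out the arithmetic behind (2): the 20 W and million-brain figures come from the argument above, but the ~$0.05/kWh electricity price is my own assumption, so treat the exact dollar figure as illustrative.

```python
# Back-of-envelope for item 2: a million brains at ~20 W each is 20 MW.
# The ~$0.05/kWh electricity price is an assumed figure, not part of the argument.
watts_per_brain = 20
num_brains = 1_000_000
usd_per_kwh = 0.05  # assumed electricity price

total_kw = watts_per_brain * num_brains / 1_000   # 20,000 kW = 20 MW
cost_per_hour = total_kw * usd_per_kwh            # ~$1,000 per hour
print(f"{total_kw:,.0f} kW -> ${cost_per_hour:,.0f}/hour")
```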
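And here is a toy sketch of why (3) and (4) together are supposed to imply a fast takeoff. The specific numbers (doubling capability per round, each round costing half as much as the last) are my own illustrative assumptions; only the general shape of the loop comes from the argument above.

```python
# Toy model: if each self-improvement round multiplies capability by a fixed
# factor and costs half as much time as the previous round (both assumed),
# capability grows without bound while total time stays under twice the cost
# of the first round.
capability = 1.0       # arbitrary units
round_cost = 100.0     # hours for the first improvement round (assumed)
total_time = 0.0

for _ in range(30):
    capability *= 2    # each round doubles capability (assumed)
    total_time += round_cost
    round_cost /= 2    # a smarter system improves itself faster (assumed)

print(f"capability x{capability:,.0f} after ~{total_time:.0f} hours")
# -> capability x1,073,741,824 after ~200 hours (the geometric series sums to < 200)
```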
I’ve seen justification for (1) and (2), and (3) and (4) seem intuitively likely to me, though I don’t think I’ve seen them explicitly argued anywhere recently (and (4) in particular I could see being false if the bitter lesson holds).
But I would definitely appreciate a distillation of (5), because that’s the one that looks most different from the things I observe in the world we live in, and the strategy of “build a self-amplifying intelligence which bootstraps itself to far-superhuman (and far-super-everything-else-that-exists-at-the-time) capabilities, and then unilaterally does a pivotal act” seems to rely on (5) being true.