The picture you describe is coherent. But I don’t read you as claiming to have an argument or evidence that warrants the assumption of gradualism (“incrementally and predictably”) about the qualitative rate of capability gains from investment into AI systems, especially once the AIs are improving themselves. Because we don’t have any such theory of capability gains, it could well be that this picture is totally wrong and there will be great spikes. Uncertainty over the shape of the curve averages out into the expectation of a smooth curve, but our lack of knowledge about the shape is no argument for the true shape being smooth.
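To make the averaging point concrete, here is a toy sketch in Python. Everything in it is my own assumption for illustration, not a model of real AI progress: capability is either 0 or 1, each hypothetical world has exactly one sudden jump, and the 100 candidate jump times are equally likely. The names `step_curve`, `expected_curve`, and `largest_increment` are made up for this example.

```python
def step_curve(jump_time, t):
    """Capability at time t in one hypothetical world: flat, then one sudden jump."""
    return 1.0 if t >= jump_time else 0.0


def expected_curve(jump_times, t):
    """Expected capability at time t, averaging over equally likely jump times."""
    return sum(step_curve(j, t) for j in jump_times) / len(jump_times)


def largest_increment(values):
    """Biggest single-step increase along a sampled curve."""
    return max(b - a for a, b in zip(values, values[1:]))


# Assumption: 100 equally likely jump times spread evenly over (0, 1).
jump_times = [(k + 0.5) / 100 for k in range(100)]
# Time grid on which every curve is sampled.
grid = [i / 100 for i in range(101)]

# The curve you get by averaging over your uncertainty about the jump time.
mean_path = [expected_curve(jump_times, t) for t in grid]
# The curve that actually happens in one particular world.
one_world = [step_curve(jump_times[37], t) for t in grid]

print("largest step in the averaged curve:", largest_increment(mean_path))
print("largest step in a single world:   ", largest_increment(one_world))
```

In this toy setup the averaged curve never moves by more than about 0.01 per step, while every individual world moves by the full 1.0 in a single step. The smoothness lives in the forecaster's uncertainty, not in any world that could actually occur.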
Not that many domains of capability look especially smooth. For instance, if one counts the general domains of knowledge a model can discuss, my very rough picture is that the GPTs went from something like 10 to 1,000 to 1,100: at first they basically could not talk coherently and usefully about most subjects, then they could, and then they could do so a bit better, with marginal new domains added slowly. My guess is also that the models our civilization creates will go from “being able to automate very few jobs” to “can suddenly automate 100s of different jobs”: they will start out not being trustworthy or reliable in many key contexts, and then, with a single model or a few models in a row over a couple of years, they will be. The next 10x spike on either such graph is not approached “incrementally and predictably”.
The example Eliezer gives of an AI developing nanotechnology in our current world is one instance of a broader category: “ways that takeover is trivial given a sufficiently wide differential in capabilities/intelligence”. There are of course many ways an adversary with a wide differential in capabilities could have a decisive strategic advantage over humanity. Perhaps an AI will study human psychology and persuasion with far more data and statistical power than anything before it, and learn how to convince anyone to obey it the way a religious devotee obeys their prophet. Or perhaps a system will get access to a whole country’s Google Docs, personal computers, and security recording systems, and be able to think about all of this in parallel in a way no state actor can. It could then blackmail a whole string of relevant people to get control of a lot of explosives or nuclear weapons, and use those to blackmail a country into doing its bidding.
I repeat: the lack of a theory of capability gains with respect to investment (including AI-assisted investment) means that astronomical differentials may be on track to surprise us, far more than GPT-2 and GPT-3 surprised most people by being able to actually write at a human level. The nanotech example is an extreme case of how decisively that can play out.