The claim at hand, which we have both read Eliezer repeatedly make[1], is that there is a sufficient level of intelligence and a sufficient power of nanotechnology that, within days or weeks, a system could design and innocuously build a nanotechnology factory out of simple biological materials, which would go on to build either a disease or cellular-sized drones that would quickly cause an extinction event: perhaps a virus whose replication rate lets it spread globally before any symptoms are detected, or a series of diamond-based machines that can enter the bloodstream and explode on a coordinated signal. In such a situation no response from human civilization would be possible, and the argument that an AI ought to be worried about people with guns and bombs coming for its data centers has no relevance.
Sure, I have also read Eliezer repeatedly make that claim. On the meta level, I don’t think the fact that he has written about this specific scenario fully makes up for the vagueness in his object-level essay above. But I’m also happy to reply briefly on the object level to this particular narrow point:
In short, I interpret Eliezer to be making a mistake by assuming that the world will not adapt to anticipated developments in nanotechnology and AI, protecting itself against attacks we can easily see coming, before AIs become capable of accomplishing these incredible feats. By the time AIs can develop such advanced molecular nanotech, I think the world will already have been dramatically transformed by prior waves of technology, many of which could by themselves importantly change the gameboard, and change what it even means for humans to have defenses against advanced nanotech to begin with.
As a concrete example, I think it’s fairly plausible that, by the time artificial superintelligences can create fully functional nanobots on par with or better than biological machines, we will already have developed uploading technology that allows humans to become literally non-biological, meaning we can’t be killed by a virus in the first place. That would make using a virus to drive humanity extinct far less viable, and humanity correspondingly more robust.
As a more general point, and in contrast to Eliezer, I think nanotechnology will probably be developed incrementally and predictably, rather than suddenly upon the creation of a superintelligent AI, and that the technology will be diffused across civilization rather than existing solely in the hands of a small lab run by an AI. Eliezer also seems to me to be imagining that superintelligent AI will be created in a world that looks broadly similar to our current one, with defensive technologies only roughly as powerful as those that exist in 2024. I don’t think that will be the case.
Given an incremental and diffuse development trajectory, and transformative precursor technologies on the way to mature nanotech, I expect society will have time to prepare as the technology arrives, building defenses against such dramatic nanotech attacks alongside the offensive nanotechnologies that will also eventually be developed. It therefore seems unlikely to me that society will be caught completely by surprise, without any effective defenses, by fully developed molecular nanotechnology.
This picture you describe is coherent. But I don’t read you as claiming to have an argument or evidence that warrants the assumption of gradualism (“incrementally and predictably”) about the qualitative rate of capability gains from investment in AI systems, especially once the AIs are improving themselves. Because we don’t have any such theory of capability gains, it could well be that this picture is totally wrong and there will be great spikes. Uncertainty over the shape of the curve averages out into the expectation of a smooth curve, but our lack of knowledge about the shape is no argument that the true shape is smooth.
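To make that last point concrete with a toy model (the symbols here are my own, purely for illustration): suppose a capability jumps by an amount $\Delta$ at a single unknown time $T$, drawn uniformly over the coming years, so each possible trajectory is $C(t) = \Delta \cdot \mathbf{1}[T \le t]$. Averaging over our uncertainty about the timing gives

$$\mathbb{E}[C(t)] = \Delta \cdot \Pr(T \le t),$$

which rises smoothly in $t$, even though every individual trajectory consistent with it is a step function with a single discontinuity. The smoothness lives entirely in our ignorance about when the jump arrives, not in the shape of any actual curve.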
Not many domains of capability look especially smooth. For instance, if one counts the general domains of knowledge, my very rough picture is that the GPTs went from something like 10 to 1,000 to 1,100: they basically could not talk coherently and usefully about most subjects, then they could, and then they could do so a bit better, with marginal new domains added slowly. My guess is also that the models our civilization creates will go from “being able to automate very few jobs” to “suddenly able to automate hundreds of different jobs”, in that they will go from not being trustworthy or reliable in many key contexts to being so within a single model, or a few models in a row over a couple of years. The next 10x spike on either such graph is not approached “incrementally and predictably”.
The example Eliezer gives of an AI developing nanotechnology in our current world is one instance of a broader category: “ways that takeover is trivial given a sufficiently wide differential in capabilities/intelligence”. There are of course many possibilities for how an adversary with a wide capability differential could gain a decisive strategic advantage over humanity. Perhaps an AI will study human psychology and persuasion with far more data and statistical power than anything before, and learn how to convince anyone to obey it the way a religious devotee relates to their prophet. Or perhaps a system will gain access to a whole country’s Google Docs, personal computers, and security recording systems, be able to think about all of this in parallel in a way no state actor can, and go on to blackmail a whole string of relevant people in order to get control of a large stock of explosives or nuclear weapons, which it could then use to coerce a country into doing its bidding.
I repeat that the lack of a theory of capability gains with respect to investment (including AI-assisted investment) means that astronomical differentials may be on track to surprise us, far more than GPT-2 and GPT-3 surprised most people with their ability to actually write at a human level. The nanotech example is an extreme illustration of how decisively that can play out.