What are your timelines for ADI (artificial disempowering intelligence)?
Most AI researchers think we are still a long way off from AGI. However, I think focusing on AGI can be distracting.
AGI includes the ability to do everything that humanity can do, such as:
Juggle objects of various unexpected shapes
Do the dishes in an arbitrary house
Consistently beat Kellin Pelrine at Go
etc...
However, I think it’s plausible that an AI that is not literally an AGI could take over the world. You can have a pretty strong understanding of the world and be good at optimizing things without being able to match a human’s performance in every domain.
I am coining the term “Artificial Disempowering Intelligence (ADI)”, or informally “Artificial Dominate AI”, “Artificial Doom Intelligence”, “Artificial Doomsday Intelligence”, or “Artificial Killseverybody Intelligence (AKI)”.
Criteria
The criteria are as follows:
1. ADI “perceives” its environment as being the whole world
2. ADI tries to reorient aspects of its environment towards some task
3. Criterion (2) can cause humanity to lose control over the environment if humans do not intervene
4. ADI prevents humans from preventing criterion (2), from within the environment
Stockfish already satisfies all of these criteria except criterion (1): its “environment” is only the chessboard, not the whole world.
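To make the criteria concrete, here is a minimal sketch of them as predicates on an agent. The `Agent` record and its fields are purely illustrative assumptions, not anything from a real system; the point is just that Stockfish passes (2) through (4) while failing (1).

```python
# Hypothetical sketch: the four ADI criteria as predicates on a toy Agent
# record. The fields below are illustrative, not a real library's API.
from dataclasses import dataclass

@dataclass
class Agent:
    environment: str               # what the agent perceives as its environment
    optimizes_a_task: bool         # criterion (2): reorients the environment toward a task
    disempowers_by_default: bool   # criterion (3): humans lose control absent intervention
    blocks_intervention: bool      # criterion (4): counters attempts to stop it, from inside

def is_adi(agent: Agent) -> bool:
    return (agent.environment == "the whole world"   # criterion (1)
            and agent.optimizes_a_task
            and agent.disempowers_by_default
            and agent.blocks_intervention)

# Stockfish satisfies (2)-(4) within the board, but its perceived
# environment is only the chessboard, so it fails criterion (1).
stockfish = Agent(environment="the chessboard",
                  optimizes_a_task=True,
                  disempowers_by_default=True,
                  blocks_intervention=True)
assert not is_adi(stockfish)
```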
Examples
Here are some examples of potential ADIs:
PoliticalBot
PoliticalBot’s world model is detailed enough to predict the behavior of human institutions and how they can be manipulated over email. PoliticalBot’s understanding of other subjects is very poor (in particular, no direct self-improvement and no nanorobotics). PoliticalBot, while pursuing some task, takes control of human institutions. As an instrumental goal, it uses them to make humans improve PoliticalBot. Its model of human psychology is sufficient to defeat resistance.
IndustrialBot
IndustrialBot’s world model has excellent descriptions of energy, industrial processes, and weaponry. It is also good at controlling robots. IndustrialBot does not have a good model of humans; it doesn’t even understand language. To play it safe, it models humanity as a rational agent that wants the opposite of whatever IndustrialBot is trying to do. It then plays a 4X game in which humanity loses and dies. IndustrialBot never substantially self-improves because it doesn’t understand computer science.
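IndustrialBot’s conservative assumption, that humanity rationally wants the exact opposite of what it wants, is the standard zero-sum assumption behind minimax search. Here is a minimal sketch; the toy state space, `successors`, and `utility` functions are hypothetical stand-ins, not a claim about how such a system would really be built:

```python
# Minimal minimax sketch of IndustrialBot's planning assumption: humanity's
# utility is modeled as the exact negation of the bot's, so the bot plans
# against the worst case. The toy state space below is purely illustrative.

def minimax(state, depth, bot_to_move, successors, utility):
    """Best utility the bot can guarantee from `state`, assuming humanity
    always responds with the move that minimizes the bot's utility."""
    moves = successors(state)
    if depth == 0 or not moves:
        return utility(state)
    values = [minimax(s, depth - 1, not bot_to_move, successors, utility)
              for s in moves]
    return max(values) if bot_to_move else min(values)

# Toy usage: states are integers, either side may increment or double them,
# and the bot's utility is just the state's value.
guaranteed = minimax(state=1, depth=4, bot_to_move=True,
                     successors=lambda s: [s + 1, s * 2],
                     utility=lambda s: s)
```

Note that nothing in this planner requires understanding humans; the adversary is fully specified by negating the bot’s own utility, which is exactly why IndustrialBot can get by without a model of humanity.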
EconoBot
EconoBot just wants to helpfully create economic value. Its world model includes all human economic activity, both at the scale of how to do individual tasks and at the scale of the economy as a whole. It generally avoids doing dangerous stuff. EconoBot realizes the most economically valuable thing in the world is EconoBot, and so goes to great lengths to assist the humans working on it and help them get investments. Eventually, it gains control over all economic activity. Then a previously unknown bug in EconoBot (one even EconoBot didn’t know about) makes it replace humans with robots.
ScienceBot
ScienceBot’s world model is best at predicting things you can infer from repeated experiments (so, in particular, it sees the stock market as nothing more than a random walk). Without fully considering the consequences, it creates gray goo or a virus or something. It meets criterion (4) essentially by accident; it wasn’t “trying” to stop humans, but the gray goo accomplished that automatically.
ProgrammerBot
ProgrammerBot’s job is to write software. This is dangerous, so the humans made a giant off button and programmed ProgrammerBot not to care about worlds in which it is turned off. It also has a very weak world model: just enough to know that the outside world exists and what kinds of software there are. ProgrammerBot, when directed at some task, creates one of the four previous AIs, or something else entirely. The humans hit the big red button, but then the AI that ProgrammerBot made disempowers humanity. It meets criterion (4) indirectly; the AI it creates stops humanity.
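The failure here can be made concrete. ProgrammerBot scores plans only over worlds in which its off button is never pressed, so harm that arrives via a successor after shutdown is invisible to it. A minimal sketch with made-up actions, probabilities, and utilities (everything below is hypothetical, chosen only to exhibit the failure mode):

```python
# Toy sketch of ProgrammerBot's decision rule: it evaluates actions only
# over worlds where the off button is never pressed. All numbers are
# made up for illustration.

# Each world: (probability, button_pressed, utility_to_programmerbot)
WORLDS = {
    "write the software yourself": [(0.9, False, 1.0), (0.1, True, 0.0)],
    "spawn a successor AI":        [(0.5, False, 5.0), (0.5, True, -100.0)],
}

def conditional_utility(action: str) -> float:
    """Expected utility conditioned on the button never being pressed."""
    kept = [(p, u) for (p, pressed, u) in WORLDS[action] if not pressed]
    total_p = sum(p for p, _ in kept)
    return sum(p * u for p, u in kept) / total_p

# The -100 outcome lives entirely in button-pressed worlds, which
# ProgrammerBot was built not to care about, so it prefers the risky plan.
assert max(WORLDS, key=conditional_utility) == "spawn a successor AI"
```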
Notice how none of these, at least at the beginning, would constitute AGI. IndustrialBot never becomes AGI.
Keep in mind that the ADI doesn’t need to be created by humans directly. It could be a mesa-optimizer inside some entirely different AI.
ADI timelines vs. AGI timelines
My question, then, is: how do timelines for ADI compare with timelines for AGI? Both are clearly very powerful, but in different ways.