If a very persuasive agent AGI were to take over the world by persuading humans to do its bidding (e.g. maximize paperclips), this would count as an AI takeover scenario. The boots on the ground, the “muscle,” would be human. And the brains behind the steering wheels and control panels would be human. And even the brains behind the tech R&D, the financial management, etc. -- even they would be human! The world would look very human and it would look like it was just one group of humans conquering the others. Yet it would still be fair to say it was an AI takeover… because the humans are ultimately controlled by, and doing the bidding of, the AGI.
OK, now what if it isn’t an agent AGI at all? What if it’s just a persuasion tool, and the humans (stupidly) use it on themselves? Say, as a joke, they program the tool to persuade people to maximize paperclips, they test it on themselves, it works surprisingly well, and in a temporary fit of paperclip-maximization they decide to keep using the tool on themselves & upgrading it, thus avoiding “value drift” away from paperclip-maximization… Then we have a scenario that looks very similar to the first one, with a growing group of paperclip-maximizing humans conquering the rest of the world, all under the control of an AI. The difference is that whereas in the first scenario only the muscle, steering, and R&D were done by humans (with the AGI supplying the strategy), in this scenario the “agenty bits” such as planning and strategic understanding are also done by humans! It still counts as an AI takeover, I say, because an AI is making a group of humans conquer the world and reshape it according to inhuman values.
Of course the second scenario is super unrealistic—humans won’t be so stupid as to use their persuasion tools on themselves, right? Well… they probably won’t try to persuade themselves to maximize paperclips, and even if they did it probably wouldn’t work, because persuasion tools won’t be that effective (at least at first). But some (many?) humans probably WILL use their persuasion tools on themselves, to persuade themselves to be truer, more faithful, more radical believers in whatever ideology they already subscribe to. Persuasion tools don’t have to be that powerful to have an effect here; even a single-digit-percentage-point effect size on various metrics would, I think, have a big impact on society.
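To give a sense of why even a small effect size could add up, here’s a toy back-of-the-envelope sketch in Python. Every number in it is made up for illustration (the 2%/year baseline “commitment rate,” the 20-year horizon, the size of the boost); the only point is that a single-digit-percentage-point edge compounds.

```python
def committed_share(years, baseline_rate, tool_bonus):
    """Fraction of a movement that is 'dogmatically committed' after `years`,
    assuming each year a fraction (baseline_rate + tool_bonus) of the
    not-yet-committed members becomes committed. Purely illustrative."""
    share = 0.0
    for _ in range(years):
        share += (1.0 - share) * (baseline_rate + tool_bonus)
    return share

# Compare a 2%/year baseline against single-digit-percentage-point boosts.
for bonus in (0.00, 0.03, 0.06):
    print(f"tool bonus of {bonus:.0%}: committed share after 20 years = "
          f"{committed_share(20, 0.02, bonus):.0%}")
```

Under these arbitrary assumptions, a 3-point boost roughly doubles the committed share after two decades; the exact numbers don’t matter, only that small per-year edges don’t stay small.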
Persuasion tools will take as input a payload—some worldview, some set of statements, some set of goals/values—and then work to create an expanding faction of people who are dogmatically committed to that payload (namely, the people who are using said tools, loaded with said payload, on themselves).
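Here’s a second toy sketch of that “expanding faction” dynamic, again with entirely made-up parameters rather than a real model: the faction recruits a little each year but also loses members to value drift, and the members constantly re-applying the tool to themselves is modeled crudely as suppressing that drift.

```python
def faction_trajectory(years, recruit_rate, drift_rate, start=0.01):
    """Faction's share of the population under toy recruit-vs-drift dynamics."""
    share = start
    for _ in range(years):
        gained = (1.0 - share) * recruit_rate  # outsiders persuaded into the faction
        lost = share * drift_rate              # members whose commitment drifts away
        share = min(1.0, max(0.0, share + gained - lost))
    return share

# Same recruiting power in both runs; the only difference is whether drift is suppressed.
without_tool = faction_trajectory(30, recruit_rate=0.04, drift_rate=0.15)
with_tool = faction_trajectory(30, recruit_rate=0.04, drift_rate=0.01)
print(f"after 30 years: {without_tool:.0%} of the population without the tool, "
      f"{with_tool:.0%} with it")
```

With these arbitrary numbers the drift-suppressed faction ends up several times larger; that ratcheting is what the self-application buys the payload.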
I think it’s an understatement to say that the vast majority of people who use persuasion tools on themselves in this manner will be imbibing payloads that aren’t 100% true and good. Mistakes happen: even the great philosophers of the past were wrong about some things, and surely we are all wrong about some things today, including some things we feel very confident are true and good. I’d bet that it’s not merely the vast majority, but literally everyone!
So this situation seems to me both realistic (unfortunately) and fairly described as a case of AI takeover, though certainly a non-central case. I don’t much care about the terminology here; I just think it’s amusing.