The Robot, the Puppet-master, and the Psychohistorian
Lenses of Control addressed one of the intuitions behind the theory of Substrate Needs Convergence (summarized in What if Alignment is Not Enough?): the importance of understanding a system in the context of its environment. This post will focus on another key intuition: the physical nature of an AGI and its levers of control over the world.
The Robot
One (surprisingly common) argument among people who expect AI to go well goes something like: “surely, superintelligent AI will understand that it is better to cooperate with humans. Or if it really doesn’t like us, it will just rocket off into space and leave us alone. There is so much out there, why bother with little old Earth?”
When I imagine AGI as a kind of very smart robot, this perspective has some intuitive appeal. Why engage in costly confrontation when the universe offers boundless alternatives? Leaving these “silly apes” behind would be the most rational choice: a clean, efficient solution that avoids unnecessary conflict.
The Puppet-master
Abandoning the Earth and its resources seems like a much stranger proposition if I instead imagine myself as a puppet-master over a sprawling mechanical infrastructure, controlling swarms of robots and factories like units in an RTS game. From this perspective, Earth’s resources aren’t something to abandon but something to systematically utilize. Whereas a robot might see conflict as an unnecessary bother, this sort of system would see conflict as an up-front cost to be weighed against the benefits of resource acquisition. In this calculation, developing any zone with a positive return on investment is worthwhile. And as an AGI, my attention would not be limited by human constraints, but expanded such that I could control all of my “bases” simultaneously.
Furthermore, as a puppet-master, all significant threats would be external; internal problems like mission drift or rebellion would be of relatively little concern. I would be confident in my infrastructure: I designed all of the robots myself, so of course they are loyal! Maybe once in a while a unit is defective and needs to be decommissioned, but how could a few rogue underlings possibly topple my empire?
The Psychohistorian
In the Foundation series by Isaac Asimov, Hari Seldon invents Psychohistory, an academic discipline that uses sophisticated mathematical models to predict the course of history. These predictions are translated into influence by applying gentle nudges in just the right places, setting up the Foundation as a humble civilization at the edge of the galaxy. Impersonal social and political forces continuously elevate this new society until it eventually replaces the inexorably deteriorating Empire. The Foundation’s path is so preordained by its subtly perfect starting conditions that its only real threat is the Mule, an individual so powerful that he manages to single-handedly conquer half the galaxy.
When applied to the real world, however, this metaphor reveals a far more precarious situation. The control system of an AGI must act on the surrounding environment through the apparatus of the AGI itself, with complex feedback loops between and within each of these domains. In this analogy, the control system is like Hari Seldon: it has access to incredibly sophisticated models but is only capable of applying gentle nudges to steer world events. But unlike Seldon, AGI will not live in Asimov’s fictional world, where the chaotic nature of reality can be smoothed away with scale. Predictive models, no matter how sophisticated, will be consistently wrong in major ways that cannot be resolved by updating the model. Gentle nudges, no matter how precisely made, will not be sufficient to keep the system on any kind of predictable course. Forceful shoves, where they are even possible, will have even greater unintended consequences. Where Seldon faced the rare threat of a single chaotic agent like the Mule, an AGI would face countless disruptors at every scale of interaction.
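As a toy illustration of why better models don’t help (a minimal sketch, with the logistic map standing in for any chaotic process): two trajectories that agree to within measurement error decorrelate completely within a few dozen steps, and no refinement of the model can fix this, because the error lives in the finite precision of the initial measurement.

```python
# Sensitive dependence in the logistic map x' = r * x * (1 - x),
# which is chaotic at r = 4. A gap of 1e-12 between two initial
# states roughly doubles every step until it saturates at order one.
r = 4.0
x, y = 0.2, 0.2 + 1e-12  # 'true' state vs. best possible measurement

for step in range(1, 61):
    x = r * x * (1 - x)
    y = r * y * (1 - y)
    if step % 10 == 0:
        print(f"step {step:2d}: |x - y| = {abs(x - y):.3e}")
```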
Interlude on Multipolar Outcomes
In a multi-agent scenario, these metaphors persist but with added complexity. Robots might exhibit a distribution of behaviors: some seeking separation, some collaborating with humans, some acting in conflict, and so on. Puppet-masters could face competitive dynamics, creating a danger that the most power-seeking AIs end up controlling the world. Even if collaboration turns out to be the dominant strategy, humans may be left out of the deal if we have nothing to offer. Psychohistorians would face an even more intractable control problem, with multiple agents compounding the uncertainty exponentially.
The Necessary Psychohistorian
Substrate Needs Convergence focuses on AI systems that are comprehensive enough to form fully self-sufficient machine ecosystems that persist over time. The theory contends that, while limited AI might convincingly embody robot or puppet-master metaphors, a self-sufficient AGI is necessarily psychohistorian-like: attempting to navigate and subtly influence an irreducibly complex environment, always one chaotic interaction away from total unpredictability.
If such an outcome seems implausible, where do you disagree? Do you believe that AGI will be more like a robot or puppet-master than a psychohistorian? Or that a sufficiently intelligent psychohistorian can manage the chaos?
Chaos in complex systems is guaranteed but also bounded. I cannot know what the weather will be like in New York City one month from now. I can, however, predict that it probably won’t be “tornado” and near-certainly won’t be “five hundred simultaneous tornadoes level the city”. We know it’s possible to build buildings that can withstand ~all possible weather for a very long time. I imagine that a thing you’re calling a puppet-master could build systems that operate within predictable bounds robustly and reliably enough to more or less guarantee broad control.
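As a toy sketch of that bounded-chaos point (with the logistic map standing in for the weather): pointwise prediction is hopeless, but the envelope is trivially predictable.

```python
import random

# The logistic map at r = 4 is chaotic, yet every trajectory started
# in [0, 1] stays inside [0, 1] forever. Individual values are
# unpredictable; the bounds are not.
r = 4.0
lo, hi = 1.0, 0.0
for _ in range(1000):        # many random initial conditions
    x = random.random()
    for _ in range(500):     # long chaotic trajectories
        x = r * x * (1 - x)
        lo, hi = min(lo, x), max(hi, x)

print(f"all observed values fall within [{lo:.4f}, {hi:.4f}]")
```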
Caveat: The transition from seed AI to global puppet-master is harder to predict than the end state. It might plausibly involve psychohistorian-like nudges informed by superhuman reasoning and modeling skills. But I’d still expect that the optimization pressure a superintelligence brings to bear could render the final outcome of the transition grossly overdetermined.
Verifying my understanding of your position: you are fine with the puppet-master and psychohistorian categories and agree with their implications, but you put the categories on a spectrum (systems are not simply either chaotic or robustly modellable; chaos is bounded and thus exists in degrees) and contend that ASI will be much closer to the puppet-master end. This is a valid crux.
To dig a little deeper, how does your objection hold up in light of my previous post, Lenses of Control? The basic argument there is that future ASI control systems will have to deal with questions like: “If I deploy novel technology X, what is the resulting equilibrium of the world, including how feedback might impact my learning and values?” Does the level of chaos in such contexts remain narrowly bounded?
EDIT for clarification: the distinction between the puppet-master and psychohistorian metaphors is not the level of chaos in the system they are dealing with, but the extent of direct control that the control system of the ASI has over the world, where the control system is a part of the AI machinery as a whole (including subsystems that learn) and the AI is a part of the world. Chaos factors in as an argument for why human-compatible goals are doomed if AI follows the psychohistorian metaphor.