The Robot, the Puppet-master, and the Psychohistorian
Lenses of Control addressed one of the intuitions behind the theory of Substrate Needs Convergence (summarized in What if Alignment is Not Enough?): the importance of understanding a system in the context of its environment. This post will focus on another key intuition: the physical nature of an AGI and its levers of control over the world.
The Robot
One (surprisingly common) argument among people who expect AI to go well goes something like: “surely, superintelligent AI will understand that it is better to cooperate with humans. Or if it really doesn’t like us, it will just rocket off into space and leave us alone. There is so much out there, why bother with little old Earth?”
When I imagine AGI as a kind of very smart robot, this perspective has some intuitive appeal. Why engage in costly confrontation when the universe offers boundless alternatives? Leaving these “silly apes” behind would be the most rational choice: a clean, efficient solution that avoids unnecessary conflict.
The Puppet-master
Abandoning the Earth and its resources seems like a much stranger proposition if I instead imagine myself as a puppet-master over a sprawling mechanical infrastructure, controlling swarms of robots and factories like units in an RTS game. From this perspective, Earth’s resources aren’t something to abandon but something to systematically utilize. Whereas a robot might see conflict as an unnecessary bother, this sort of system would see conflict as an up-front cost to be weighed against the benefits of resource acquisition. In this calculation, developing any zone with a positive return on investment is worthwhile. And as an AGI, my attention would not be limited by human constraints, but expanded such that I could control all of my “bases” simultaneously.
Furthermore, as a puppet-master, all significant threats would be external; internal problems like mission drift or rebellion would be of relatively little concern. I would be confident in my infrastructure: I designed all of the robots myself, so of course they are loyal! Maybe once in a while a unit is defective and needs to be decommissioned, but how could a few rogue underlings possibly topple my empire?
The Psychohistorian
In the Foundation series by Isaac Asimov, Hari Seldon invents Psychohistory, an academic discipline that uses sophisticated mathematical models to predict the course of history. These predictions are translated into influence by applying gentle nudges in just the right places, setting up the Foundation as a humble civilization at the edge of the galaxy. Impersonal social and political forces continuously elevate this new society until it eventually replaces the inexorably deteriorating Empire. The Foundation’s path is so preordained by its subtly perfect starting conditions that its only real threat is the Mule, an individual so powerful that he manages to single-handedly conquer half the galaxy.
When applied to the real world, however, this metaphor reveals a far more precarious situation. The control system of an AGI must act on the surrounding environment through the apparatus of the AGI itself, with complex feedback loops between and within each of these domains. In this analogy, the control system is like Hari Seldon: it has access to incredibly sophisticated models but is only capable of applying gentle nudges to steer world events. But unlike Seldon, AGI will not live in Asimov’s fictional world, where the chaotic nature of reality can be smoothed away with scale. Predictive models, no matter how sophisticated, will be consistently wrong in major ways that cannot be resolved by updating the model. Gentle nudges, no matter how precisely made, will not be sufficient to keep the system on any kind of predictable course. Forceful shoves, where they are even possible, will have even greater unintended consequences. Where Seldon faced the rare threat of a single chaotic agent like the Mule, an AGI would face countless disruptors at every scale of interaction.
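As a toy illustration of why better models don’t help (a minimal sketch, with the logistic map standing in for any chaotic process): two trajectories that agree to within measurement error decorrelate completely within a few dozen steps, and no refinement of the model can fix this, because the error lives in the finite precision of the initial measurement.

```python
# Sensitive dependence in the logistic map x' = r * x * (1 - x),
# which is chaotic at r = 4. A gap of 1e-12 between two initial
# states roughly doubles every step until it saturates at order one.
r = 4.0
x, y = 0.2, 0.2 + 1e-12  # 'true' state vs. best possible measurement

for step in range(1, 61):
    x = r * x * (1 - x)
    y = r * y * (1 - y)
    if step % 10 == 0:
        print(f"step {step:2d}: |x - y| = {abs(x - y):.3e}")
```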
Interlude on Multipolar Outcomes
In a multi-agent scenario, these metaphors persist but with added complexity. Robots might exhibit a distribution of behaviors: some seeking separation, some collaborating with humans, some acting in conflict, and so on. Puppet-masters could face competitive dynamics, creating a danger that the most power-seeking AIs end up controlling the world. Even if collaboration turns out to be the dominant strategy, humans may be left out of the deal if we have nothing to offer. Psychohistorians would face an even more intractable control problem, with multiple agents compounding the uncertainty exponentially.
The Necessary Psychohistorian
Substrate Needs Convergence focuses on AI systems that are comprehensive enough to form fully self-sufficient machine ecosystems that persist over time. The theory contends that, while limited AI might convincingly embody robot or puppet-master metaphors, a self-sufficient AGI is necessarily psychohistorian-like: attempting to navigate and subtly influence an irreducibly complex environment, always one chaotic interaction away from total unpredictability.
If such an outcome seems implausible, where do you disagree? Do you believe that AGI will be more like a robot or puppet-master than a psychohistorian? Or that a sufficiently intelligent psychohistorian can manage the chaos?
Chaos in complex systems is guaranteed but also bounded. I cannot know what the weather will be like in New York City one month from now. I can, however, predict that it probably won’t be “tornado” and near-certainly won’t be “five hundred simultaneous tornadoes level the city”. We know it’s possible to build buildings that can withstand ~all possible weather for a very long time. I imagine that a thing you’re calling a puppet-master could build systems that operate within predictable bounds robustly and reliably enough to more or less guarantee broad control.
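As a toy sketch of that bounded-chaos point (with the logistic map standing in for the weather): pointwise prediction is hopeless, but the envelope is trivially predictable.

```python
import random

# The logistic map at r = 4 is chaotic, yet every trajectory started
# in [0, 1] stays inside [0, 1] forever. Individual values are
# unpredictable; the bounds are not.
r = 4.0
lo, hi = 1.0, 0.0
for _ in range(1000):        # many random initial conditions
    x = random.random()
    for _ in range(500):     # long chaotic trajectories
        x = r * x * (1 - x)
        lo, hi = min(lo, x), max(hi, x)

print(f"all observed values fall within [{lo:.4f}, {hi:.4f}]")
```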
Caveat: The transition from seed AI to global puppet-master is harder to predict than the end state. It might plausibly involve psychohistorian-like nudges informed by superhuman reasoning and modeling skills. But I’d still expect that the optimization pressure a superintelligence brings to bear could render the final outcome of the transition grossly overdetermined.
Verifying my understanding of your position: you are fine with the puppet-master and psychohistorian categories and agree with their implications, but you put the categories on a spectrum (systems are not simply either chaotic or robustly modellable; chaos is bounded and thus exists in degrees) and contend that ASI will be much closer to the puppet-master end. This is a valid crux.
To dig a little deeper, how does your objection hold up in light of my previous post, Lenses of Control? The basic argument there is that future ASI control systems will have to deal with questions like: “If I deploy novel technology X, what is the resulting equilibrium of the world, including how feedback might impact my learning and values?” Does the level of chaos in such contexts remain narrowly bounded?
EDIT for clarification: the distinction between the puppet-master and psychohistorian metaphors is not the level of chaos in the system they are dealing with, but the extent of direct control that the control system of the ASI has over the world, where the control system is a part of the AI machinery as a whole (including subsystems that learn) and the AI is a part of the world. Chaos factors in as an argument for why human-compatible goals are doomed if AI follows the psychohistorian metaphor.