I’d like to register skepticism of the idea of a “long reflection”. I’d guess any intelligence that knew how to stabilize the world with respect to processes that affect humanity’s reflection about its values in undesirable ways (e.g. existential disasters), without also stabilizing it with respect to processes that affect it in desirable ways, would already understand the value extrapolation problem well enough to take a lot of shortcuts in calculating the final answer compared to doing the experiment in real life. (You might call such a calculation a “Hard Reflection”.)
Suppose you have an AI-powered world-stabilization regime. Suppose somebody makes a reasonable moral argument about how humanity’s reflection should proceed, like “it’s unfair for me to have less influence just because I hate posting on Facebook”. Does the world-stabilization regime now add a Facebook compensation factor to the set of restrictions it enforces? If it does things like this all the time, doesn’t the long reflection just amount to a stage performance of CEV with human actors? If it doesn’t do things like this all the time, doesn’t that create a serious risk of the long-term future being stolen by some undesirable dynamic?
The inexorability of AI-enacted events doesn’t intrude on the decisions and discoveries of the people written into those events. Those decisions, made in the distant future, may determine how the world preparing to reflect on them runs from the start.
Sorry, I don’t think I understand what you mean. There can still be a process that gets the same answer as the long reflection, but with e.g. less suffering or waste of resources, right?
I’m steelmanning the long reflection, as both the source of goals for an AGI and something that happens to our actual civilization, while resolving the issues that jumped out at you. Sorry if that wasn’t clear from the cryptic summary.
If it’s possible to make an AGI that coexists with our civilization (probably something that’s not fully agentic), it should also be possible to make one that runs our civilization in a simulation while affecting what’s going on in the simulation to a similar extent. If the simulation is more like a story (or essay), written without a plan in mind but by following where the people written into it lead, it can be dramatically more computationally efficient both to run and to make preliminary predictions about.
Just as determinism enables free will, so can sufficiently lawful storytelling, provided it’s potentially detailed enough to generate the thoughts of the people in the simulation. So the decisions of the civilization simulated in a story are determined by the thoughts and actions of the people living there, yet it’s easy to make reasonable predictions about them in advance, and running the whole thing (probably an ensemble of stories, not a single story) is not that expensive, even if it takes a relatively long time, far longer than it takes to get excellent predictions of where it leads.
As a result, we quickly get a good approximation of what people will eventually decide, and that can be used to influence the story for the better from the start without intruding on continuity, or to decide which parts to keep summarized, not letting them become real. So this version of the long reflection is basically CEV, but with the people inside being real (my guess is that having influence over the outer AGI is a significant component of being real) and continuing the course of our own civilization. The outer AGI acts on the eventual decisions of the people within the story, made during the long reflection, assisted within the story according to their own decisions from the future.
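A purely illustrative toy sketch of the “run an ensemble of cheap stories, extrapolate their eventual outcome early” idea, in Python. Every function and dynamic here is invented for illustration (a random walk stands in for a civilization’s deliberation); nothing in the thread specifies how such an ensemble would actually be constructed or run.

```python
import random

def run_story(steps, seed):
    """One cheap 'story': a drifting random walk standing in for a long deliberation."""
    rng = random.Random(seed)
    value = 0.0
    trajectory = []
    for _ in range(steps):
        value += rng.gauss(0.1, 1.0)  # slight drift toward an eventual consensus
        trajectory.append(value)
    return trajectory

def extrapolate_early(trajectory, horizon):
    """Guess the eventual outcome of one story from an early prefix of it."""
    prefix = trajectory[:horizon]
    drift_per_step = (prefix[-1] - prefix[0]) / (len(prefix) - 1)
    return prefix[-1] + drift_per_step * (len(trajectory) - horizon)

if __name__ == "__main__":
    stories = [run_story(steps=10_000, seed=s) for s in range(100)]
    # Early, cheap ensemble prediction vs. the "long reflection" actually run to the end.
    early = sum(extrapolate_early(t, horizon=500) for t in stories) / len(stories)
    final = sum(t[-1] for t in stories) / len(stories)
    print(f"early ensemble estimate: {early:.1f}, eventual outcome: {final:.1f}")
```

In this toy setup the ensemble average of the early extrapolations already lands near where the full-length runs end up, which is the sense in which “excellent predictions of where it leads” can be much cheaper than running the whole thing; whether anything like this transfers to an actual simulated civilization is, of course, exactly what the comment is speculating about.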