I’d like to register skepticism of the idea of a “long reflection”. I’d guess any intelligence that knew how to stabilize the world with respect to processes that affect humanity’s reflection about its values in undesirable ways (e.g. existential disasters), without also stabilizing it with respect to processes that affect it in desirable ways, would already understand the value extrapolation problem well enough to take a lot of shortcuts in calculating the final answer compared to doing the experiment in real life. (You might call such a calculation a “Hard Reflection”.)
Suppose you have an AI-powered world-stabilization regime. Suppose somebody makes a reasonable moral argument about how humanity’s reflection should proceed, like “it’s unfair for me to have less influence just because I hate posting on Facebook”. Does the world-stabilization regime now add a Facebook compensation factor to the set of restrictions it enforces? If it does things like this all the time, doesn’t the long reflection just amount to a stage performance of CEV with human actors? If it doesn’t do things like this all the time, doesn’t that create a serious risk of the long-term future being stolen by some undesirable dynamic?
The inexorability of AI-enacted events doesn’t intrude on the decisions and discoveries of the people written into those events. Those decisions, made in the distant future, may determine how the world preparing to reflect on them runs from the start.
Sorry, I don’t think I understand what you mean. There can still be a process that gets the same answer as the long reflection, but with e.g. less suffering or waste of resources, right?
I’m steelmanning the long reflection, as both the source of goals for an AGI and something that happens to our actual civilization, while resolving the issues that jumped out at you. Sorry if that wasn’t clear from the cryptic summary.
If it’s possible to make an AGI that coexists with our civilization (probably something that’s not fully agentic), it should also be possible to make one that runs our civilization in a simulation while affecting what’s going on in the simulation to a similar extent. If the simulation is more like a story (or essay), written without a plan in mind but by following where the people written into it lead, it can be dramatically more computationally efficient both to run and to make preliminary predictions about.
Just as determinism enables free will, so can sufficiently lawful storytelling, provided it’s potentially detailed enough to generate the thoughts of the people in the simulation. So the decisions of the civilization simulated in a story are determined by the thoughts and actions of the people living there, yet it’s easy to make reasonable predictions about them in advance, and running the whole thing (probably an ensemble of stories, not a single story) is not that expensive, even if it takes a relatively long time, far longer than it takes to get excellent predictions of where it leads.
As a result, we quickly get a good approximation of what people will eventually decide, and that can be used to influence the story for the better from the start without intruding on continuity, or to decide which parts to keep summarized, not letting them become real. So this version of the long reflection is basically CEV, but with the people inside being real (my guess is that having influence over the outer AGI is a significant component of being real) and continuing the course of our own civilization. The outer AGI acts on the eventual decisions of the people within the story, made during the long reflection, assisted within the story according to their own decisions from the future.
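A purely illustrative toy sketch of the “run an ensemble of cheap stories, extrapolate their eventual outcome early” idea, in Python. Every function and dynamic here is invented for illustration (a random walk stands in for a civilization’s deliberation); nothing in the thread specifies how such an ensemble would actually be constructed or run.

```python
import random

def run_story(steps, seed):
    """One cheap 'story': a drifting random walk standing in for a long deliberation."""
    rng = random.Random(seed)
    value = 0.0
    trajectory = []
    for _ in range(steps):
        value += rng.gauss(0.1, 1.0)  # slight drift toward an eventual consensus
        trajectory.append(value)
    return trajectory

def extrapolate_early(trajectory, horizon):
    """Guess the eventual outcome of one story from an early prefix of it."""
    prefix = trajectory[:horizon]
    drift_per_step = (prefix[-1] - prefix[0]) / (len(prefix) - 1)
    return prefix[-1] + drift_per_step * (len(trajectory) - horizon)

if __name__ == "__main__":
    stories = [run_story(steps=10_000, seed=s) for s in range(100)]
    # Early, cheap ensemble prediction vs. the "long reflection" actually run to the end.
    early = sum(extrapolate_early(t, horizon=500) for t in stories) / len(stories)
    final = sum(t[-1] for t in stories) / len(stories)
    print(f"early ensemble estimate: {early:.1f}, eventual outcome: {final:.1f}")
```

In this toy setup the ensemble average of the early extrapolations already lands near where the full-length runs end up, which is the sense in which “excellent predictions of where it leads” can be much cheaper than running the whole thing; whether anything like this transfers to an actual simulated civilization is, of course, exactly what the comment is speculating about.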