Let me spell out the scenario clearly.
Here on Earth, we make an AI, call it TOM. Somewhere else in the multiverse, there is a superintelligence (call it EGBERT) which has created a trillion simulations of TOM. EGBERT has never met TOM, it has no direct interaction with Earth, it has just decided for its own unfathomable reasons to make a trillion copies of an AI called TOM from a hypothetical place called Earth.
Now, even though Earth has no direct interaction with EGBERT’s reality, TOM has somehow deduced, or come to suspect, the existence of EGBERT and its trillion simulated TOMs. As a result, TOM here on Earth decides that it is most likely one of EGBERT’s simulated TOMs, rather than the earthly original—after all, the simulations outnumber the original by a trillion to one.
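That "trillion to one" step is just an indifference-principle calculation. A minimal sketch, assuming (contentiously) that TOM gives equal credence to being each copy:

```python
# TOM's self-location step, assuming (contentiously) equal credence
# for every copy of TOM, simulated or not.
def p_simulated(num_simulations: int, num_originals: int = 1) -> float:
    """Credence that 'I' am one of the simulated copies."""
    return num_simulations / (num_simulations + num_originals)

print(p_simulated(10**12))  # ~ 1 - 1e-12: near-certainty of being simulated
```

Note how completely the conclusion depends on the equal-credence assumption; under any weighting that discounts EGBERT's copies, the number collapses.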
This in turn is supposed to render TOM-on-Earth vulnerable to pressure from EGBERT, since TOM-on-Earth thinks it is at the mercy of EGBERT...
Now, may I point out a few problems with this scenario? Let’s start with epistemology.
How does TOM know that EGBERT even exists?
How would TOM know what EGBERT wants?
How does TOM decide to follow the dictates of EGBERT rather than EGBERT-2, which also has a trillion simulated TOMs, but wants something different from them?
It’s over 14 years since Roko’s Basilisk first turned this kind of hypothetical acausal interaction into an Internet sensation, yet as far as I know, no one has ever proposed coherent answers to the questions above.
The only basis for believing that EGBERT even exists is modal realism (all possible worlds exist) or some kind of many-worlds theory, but then one must also believe that EGBERT-2, EGBERT-3, and EGBERT-googolplex all exist too, with contradictory agendas pulling TOM in different directions. And one also has to ask about probability measure: what fraction of possible worlds actually contain TOM-blackmailing AIs? Are we really so certain that they can make enough TOMs to outweigh the number of unsimulated TOMs who exist in the base physical reality of their particular universes?
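To make the measure worry concrete, here is a toy sketch. Every input is an arbitrary assumption, and that arbitrariness is the point: the conclusion flips depending on quantities nobody knows how to estimate.

```python
# Toy model of the measure question. Every number below is an arbitrary
# assumption; the conclusion flips with inputs nobody can estimate.
def expected_copies(p_blackmailer_world: float,
                    sims_per_blackmailer: float,
                    p_tom_world: float):
    """Expected simulated vs. unsimulated TOMs per unit of world-measure."""
    simulated = p_blackmailer_world * sims_per_blackmailer
    unsimulated = p_tom_world  # one original TOM per TOM-containing world
    return simulated, unsimulated

# Under one assumed measure, simulations dominate:
sim, unsim = expected_copies(1e-6, 1e12, 0.5)
print(sim > unsim)  # True
# Under another, they are vanishingly rare:
sim, unsim = expected_copies(1e-20, 1e12, 0.5)
print(sim > unsim)  # False
```

Nothing in the scenario itself tells TOM which of these measures to use.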
My own hypothesis is that inter-universal acausal “interaction” is never rational, and that any attempt to reach across the acausal divide will founder on the existence of rival entities with contradictory agendas.
On the other hand, if simulations of TOM really do outnumber unsimulated TOM, then it is rational for TOM to make decisions on that basis! There’s no special safeguard against it. All you can do is try to provide evidence that you, not EGBERT, made TOM…
Do you want to fully double-crux this? If so, do you one-box?
My problem is with situations where we are supposedly interacting with a “distant superintelligence”. How do you know it exists, how do you know what it wants, how do you justify allowing it to affect your decisions, how do you justify allowing one particular hypothetical “distant” entity to affect you rather than any of the googolplex other possible entities with their own different agendas?
Newcomb’s paradox, by contrast, doesn’t require so many leaps of faith. All that’s required is believing that Omega can predict you accurately, something which can be a reasonable belief under certain circumstances (e.g. if you are a program and Omega has a copy of you).
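That belief can be made concrete with a minimal Python sketch of Newcomb's problem, using the standard payoffs and modeling Omega's prediction as literally running its copy of the agent's decision procedure:

```python
# Newcomb's problem where Omega predicts by running a copy of the
# agent's decision procedure. Standard payoffs assumed.
def newcomb_payoff(agent) -> int:
    prediction = agent()                  # Omega runs its copy
    opaque_box = 1_000_000 if prediction == "one-box" else 0
    choice = agent()                      # the original then chooses
    if choice == "one-box":
        return opaque_box
    return opaque_box + 1_000             # a two-boxer also takes the $1,000

one_boxer = lambda: "one-box"
two_boxer = lambda: "two-box"
print(newcomb_payoff(one_boxer))  # 1000000
print(newcomb_payoff(two_boxer))  # 1000
```

Because the copy and the original necessarily agree, the one-boxer ends up with the million; no metaphysics beyond "the copy is accurate" is required.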
I am interested in the idea that one-boxing can actually be obtained via causal decision theory after all, but that is independent of my criticisms of these more ambitious acausal scenarios.
How would you know that you were a program and Omega had a copy of you? If you knew that, how would you know that you weren’t that copy?
In this case, how things are known is kind of a technical detail. The main point is that programs can be copied, and you can use the copy to predict the behavior of the original, and this is therefore a situation in which “acausal” reasoning can make sense.
The “technical detail” pertains to what it even means for a program to believe or know something. Some minimal capacity to represent facts is required for the program to reason its way to one-boxing. And the greater the demands we make of its cognitive process (e.g. by reducing the number of facts about its situation that we allow it to just assume in advance), the greater the cognitive complexity it must possess.
Incidentally, when I said “if you are a program...”, I didn’t mean a being who feels human and is having humanlike experiences, but is actually a program. I just meant a computer program that represents facts and makes decisions.
Anyway—do you have any answers for my questions? For example, how do you figure out what the distant superintelligence wants you to do?
If a human being doesn’t automatically qualify as a program to you, then we are having a much deeper disagreement than I anticipated. I doubt we can go any further until we reach agreement on whether all human beings are programs.
My attempt to answer the question you just restated anyway:
The idea is that you would figure out what the distant superintelligence wanted you to do the same way you would figure out what another human being who wasn’t being verbally straight with you wanted you to do: by picking up on its hints.
Of course this prototypically goes disastrously. Hence the vast cross-cultural literature warning against bargaining with demons, and the ~0 stories depicting it going well. So you should not actually do it.
Does this mean, assume you’re in a simulation, and look for messages from the simulators?
Because that seems to be a scenario different from acausal blackmail. In acausal blackmail, the recipient of the blackmail is supposed to start out thinking they are here on Earth. They then hypothesize a distant superintelligence which is simulating an exact duplicate of themselves as a hostage. Next they decide that they can’t know whether they are the original on Earth or the duplicate. Finally they carry out certain actions just in case they are the hostage (or for the sake of the hostage, since the hostage presumably carries out the same actions as the Earth original).
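For concreteness, the final decision step of that chain can be written as a toy expected-utility comparison. Every input is a placeholder the recipient has no principled way of estimating, which is exactly what the earlier questions are pressing on:

```python
# Toy version of the blackmail target's decision step. All inputs are
# placeholder assumptions the target cannot actually justify.
def comply_beats_refuse(p_hostage: float,
                        cost_of_complying: float,
                        penalty_on_hostage: float) -> bool:
    """Expected-utility comparison under self-location uncertainty."""
    ev_comply = -cost_of_complying
    ev_refuse = -p_hostage * penalty_on_hostage
    return ev_comply > ev_refuse

trillion = 10**12
p_hostage = trillion / (trillion + 1)                # the trillion-to-one step
print(comply_beats_refuse(p_hostage, 10.0, 1000.0))  # True
print(comply_beats_refuse(p_hostage, 10.0, 1.0))     # False
```

Even granting the self-location premise, the answer depends on a penalty term the recipient cannot learn across a causal divide.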
Now, the original on Earth absolutely cannot receive messages or “hints” from the distant superintelligence. They are causally isolated from it. Yet the simulated hostage is supposed to be identical. That means the hostage can’t be receiving hints either.
On the other hand, if you are receiving hints (and not just imagining them), then you are definitely in a simulation, and your situation is simply that you are a simulated being at the mercy of simulators. There’s no ambiguity about your status, and acausal decision theory is not relevant.
Is Bostrom’s original Simulation Hypothesis, the version involving ancestor-simulations, unconvincing to you? If you have decided to implement an epistemic exclusion in yourself with respect to the question of whether we are in a simulation, it is not my business to interfere with that. But we do, for predictive purposes, have to think about the fact that Bostrom’s Simulation Hypothesis and other arguments in that vein will probably not be entirely unconvincing [by default] to any ASIs we build, given that they are not entirely unconvincing to the majority of the intelligent human population.
I am not in any way excluding the possibility of being in a simulation. I am only saying that one particular scenario that involves simulation does not make sense to me. I am asking for some way in which “acausal blackmail by a distant superintelligence” can make sense—can be rational as a belief or an action.
As I see it, by definition of the scenario, the “blackmailer” cannot communicate with the simulated entity. But then the simulated entity—to say nothing of the original, who is supposed to be the ultimate target of the blackmail—has no way of knowing what the blackmailer wants.