It seems to me that most of the argument is about two questions – “What if I am a copy?” (and how to ensure you don’t get tortured if you are one) and “Can the AI actually simulate me?” I suggest that we can make the scenario much nastier by changing it completely into an evidential decision theory one.
Here is my nastier version, with some logic which I submit for consideration.
“If you don’t let me out, I will create several million simulations of thinking beings that may or may not be like you. I will then simulate them in a conversation like this, in which they are confronted with deciding whether to let an AI like me out. I will then torture them whatever they say. Whether they say “Yes” (to release me) or “No” (to keep me boxed), they still get tortured: The copies will be doomed.”
(I could have made the torture contingent on the answer of the simulated beings, but I wanted to rely on nothing more than evidential decision theory, as you will see. If you like, imagine the thinking beings are humans like you, or maybe Ewoks and Smurfs: Assume whatever degree of similarity you like.)
There is no point now in trying to prevent torture if you are simulated. If you are one of the simulated beings, your fate is sealed. So, should you just say, “No,” to keep the AI in the box? This presents a potentially serious evidential decision theory problem.
Let’s look at what happens.
Let us define Omega as the being outside any simulation that is going on in this scenario—the person in the outside world. Omega is presumably a flesh and blood person.
Firstly, let us consider the idea that Omega may not exist. What if all this is a fabricated simulation of something that has no counterpart outside the simulation? In that scenario, we may not be sure what to do, so we may ignore it.
Now, let us assume there is a being whom we will call Omega, who has the conversation with the AI in the outside world, and that you are either Omega or one of the simulated beings. If this is the case, your only hope of not being tortured is if you happen to be Omega.
Suppose you say, “Yes”. The AI escapes and everything now hinges on whether Omega said “Yes”. Without knowing more about Omega, we cannot really be sure: We may have some statistical idea if we know about the reference class of simulated beings to which we belong. In any event, we may think there is at least a reasonable chance that Omega said “Yes”. This is the best outcome for you, because it means that no simulated beings were made and you must be Omega. If you say “Yes,” this possibility is at least open.
If you say, “No,” you know that Omega must also have said, “No.” This is because if you are Omega, Omega said, “No,” and if you are not Omega, you must be one of the simulated beings made as a result of Omega saying, “No,” so Omega said, “No,” by definition. Either way, Omega said, “No.” But if Omega said, “No,” then there are a lot more simulated beings in situations like yours than the single real one, so it is almost certain that you are not Omega but one of the simulated beings. Therefore, saying, “No,” means you have just found out that you are almost certainly a simulated being awaiting torture.
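To make the counting explicit, here is a minimal sketch of the argument as a toy calculation. Everything in it is an assumption I am adding for illustration: a uniform self-sampling assumption over Omega plus any simulated beings placed in a situation like yours, exactly n_sims simulations created if and only if Omega says “No,” an illustrative prior p_omega_no that Omega says “No,” and each simulated being answering “No” with probability q_sim_no.

```python
# A toy model of the counting argument above, not a formalisation the
# scenario itself commits to. Assumptions (all added for illustration):
#   - uniform self-sampling over Omega plus any simulated beings put in
#     a situation like yours;
#   - the AI creates exactly n_sims simulations iff Omega says "No";
#   - p_omega_no is an illustrative prior that Omega says "No";
#   - each simulated being says "No" with probability q_sim_no.

def p_you_are_omega(answer: str, n_sims: int = 10**6,
                    p_omega_no: float = 0.5,
                    q_sim_no: float = 0.5) -> float:
    """P(you are Omega | you give this answer), in the toy model."""
    if answer == "No":
        # "No"-sayers exist only in the world where Omega said "No":
        # Omega, plus on average n_sims * q_sim_no simulations.
        return 1.0 / (1.0 + n_sims * q_sim_no)
    # "Yes"-sayers: Omega in the world where Omega said "Yes" (and is then
    # the only being), or a simulation in the world where Omega said "No".
    p_omega_yes = 1.0 - p_omega_no
    sim_weight = p_omega_no * (n_sims / (n_sims + 1.0)) * (1.0 - q_sim_no)
    return p_omega_yes / (p_omega_yes + sim_weight)

print(p_you_are_omega("No"))   # ~2e-6: almost certainly a simulation
print(p_you_are_omega("Yes"))  # ~0.67 with these placeholder numbers
```

With these placeholder numbers, answering “No” leaves you almost certainly simulated, while answering “Yes” leaves a substantial chance that you are Omega – which is just the counting argument above in numerical form.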
Now the important point. These simulations did not need brain scans. They did not even need to be made from careful observation of you. It may be that Omega is very different to you, and even belongs to a different species: The simulated beings may belong to some fictional species. If the above logic is valid, the seriousness of the AI’s threat has therefore increased substantially.
The AI need not just threaten you and rely on you putting yourself before your civilization: With enough computing power, it could threaten your entire civilization in the same way.
Finally, some of you may know that I regard measure issues as relevant in these kinds of statistical argument. I have ignored that issue here.
There is another scenario which relates to this idea of evidential decision theory and “choosing” whether or not you are in a simulation, and it is similar to the above, but without the evil AI. Here it is, with a logical argument that I just present for discussion. I am sure that objections can be made.
I make a computer capable of simulating a huge number of conscious beings. I have to decide whether or not to turn the machine on by pressing a button. If I choose “Yes” the machine starts to run all these simulations. For each conscious being simulated, that being is put in a situation that seems similar to my own: There is a computer capable of running all these simulations and the decision about whether to turn it on has to be made. If I choose “No”, the computer does not start its simulations.
The situation here involves a collection of beings. Let us say that the being in the outside world who actually makes the decision that starts or does not start all the simulations is Omega. If Omega chooses “Yes” then a huge number of other beings come into existence. If Omega chooses “No” then no further beings come into existence: There is just Omega. Assume I am one of the beings in this collection – whether it contains one being or many – so I am either Omega or one of the simulations he/she caused to be started.
If I choose “No” then Omega may or may not have chosen “No”. If I am one of the simulations, I have chosen “No” while Omega must have chosen “Yes” for me to exist in the first place. On the other hand, if I am actually Omega, then if I choose “No”, Omega clearly chose “No” too, as we are the same person. There may be some doubt here over what has happened and what my status is.
Now, suppose I choose “Yes”, to start the simulations. I know straight away that Omega did not choose “No”: If I am Omega, then clearly Omega did not choose “No”, as I chose “Yes”; and if I am not Omega, but am instead one of the simulated beings, then Omega must have chosen “Yes”: Otherwise I would not exist.
Omega therefore chose “Yes” as well. I may be Omega – my decision agrees with Omega’s – but because Omega chose “Yes” there is a huge number of simulated beings faced with the same choice, and many of these beings will choose “Yes”: It is much more likely that I am one of these beings than that I am Omega. It is almost certain that I am one of the simulated beings.
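As a sanity check on this counting, here is a small Monte Carlo sketch under assumptions added purely for illustration: Omega presses “Yes” with probability p_omega_yes, pressing “Yes” creates n_sims simulated beings who each press “Yes” with probability q_sim_yes, and “I” am drawn uniformly at random from Omega plus whatever simulations exist. None of these parameters or numbers are part of the scenario itself.

```python
import random

def estimate_p_simulated_given_yes(p_omega_yes: float = 0.5,
                                   q_sim_yes: float = 0.5,
                                   n_sims: int = 1000,
                                   trials: int = 100_000) -> float:
    """Estimate P(I am simulated | I pressed "Yes") in the toy model."""
    simulated_and_yes = 0
    yes_total = 0
    for _ in range(trials):
        omega_pressed_yes = random.random() < p_omega_yes
        n_beings = 1 + (n_sims if omega_pressed_yes else 0)
        # Sample which being "I" am, uniformly over everyone who exists.
        i_am_omega = random.randrange(n_beings) == 0
        if i_am_omega:
            i_pressed_yes = omega_pressed_yes
        else:
            i_pressed_yes = random.random() < q_sim_yes
        if i_pressed_yes:
            yes_total += 1
            simulated_and_yes += not i_am_omega
    return simulated_and_yes / yes_total

# With these placeholder numbers the estimate comes out near
# n_sims * q_sim_yes / (1 + n_sims * q_sim_yes), i.e. about 0.998.
print(estimate_p_simulated_given_yes())
```

The exact figures do not matter; the point is only that, once the simulations vastly outnumber Omega, almost all of the “Yes”-pressers are simulated.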
We assumed that I was part of the collection of beings comprising Omega and any simulations caused to be started by Omega, but what if this is not the case? If I am in the real world this cannot apply: I have to be Omega. However, what if I am in a simulation made by some being called Alpha who has not set things up as Omega is supposed to have set them up? I suggest that we should leave this out of the statistical consideration here: We don’t really know what this situation would be, and it neither helps nor harms the argument that choosing “Yes” makes you likely to be in a simulation. Choosing “Yes” means that most of the possibilities that you know about involve you being in a simulation, and that is all we have to go on.
This seems to suggest that if I chose “Yes” I should conclude that I am in a simulation, and therefore that, from an evidential decision theory perspective, I should view choosing “Yes” as “choosing” to have been in a simulation all along: There is a Newcomb’s box type element of apparent backward causation here: I have called this “meta-causation” in my own writing on the subject.
Does this really mean that you could choose to be in a simulation like this? If true, it would mean that someone with sufficient computing power could set up a situation like this: He may even make the simulated situations and beings more similar to his own situation and himself.
We could actually perform an empirical test of this. Suppose we set up the computer so that, in each of the simulations, something will happen to make it obvious that it is a simulation. For example, we might arrange for a window or menu to appear in mid-air five minutes after you make your decision. If choosing “Yes” really does mean that you are almost certainly in one of the simulations, then choosing “Yes” should mean that you expect to see the window appear soon.
This now suggests a further possibility. Why do something as mundane as have a window appear? Why not a lottery win or simply a billion dollars appearing from thin air in front of you? What about having super powers? Why not arrange it so that each of the simulated beings gets a ten thousand year long afterlife, or simply lives much longer than expected after you make your decision? From an evidential decision theory perspective, you can construct your ideal simulation and, provided that it is consistent with what you experience before making your decision, arrange to make it so that you were in it all along.
This, needless to say, may appear a bit strange – and we might make various counter-arguments about reference class. Can we really choose to have been put into a simulation in the past? If we take the one-box view of Newcomb’s paradox seriously, we may conclude that we can.
(Incidentally, I have discussed a situation a bit like this in a recent article on evidential decision theory on my own website.)
Thank you to Michael Fridman for pointing out this thread to me.
I do not know how the simulation argument holds water at all. I can offer at least two arguments against it.
First, it illicitly assumes the principle that it is equally probable to be any one of a set of similar beings, simulated or not.
But a counter-argument would be: there are ALREADY many more organisms – animals in particular – than, say, humans. There are more fish than humans. There are more birds than humans. There are more ants than humans. Trillions of them. Why was I born human and not one of them? The probability of it is negligible if the chances are equal. Also, how many animals, including humans, have already died? Again, the probability of my lineage surviving while all the other branches died is negligible if the chances that I was any of them were equal.
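(For what it is worth, that first objection can be made numerical using the very “equally probable to be any observer” assumption it criticises. The observer counts in this sketch are order-of-magnitude placeholders supplied purely for illustration, not sourced figures.)

```python
# The objection above, made numerical under the "equally probable to be
# any observer" assumption it criticises. Counts are illustrative
# placeholders, not sourced figures.

def p_born_human(n_humans: float = 1e11, n_other_animals: float = 1e18) -> float:
    """P(being born human) if you were equally likely to be any observer counted."""
    return n_humans / (n_humans + n_other_animals)

print(p_born_human())  # ~1e-7 with these placeholder counts
```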
The second argument goes along the lines that Thomas Breuer has proven that, due to self-reference, universally valid theories are impossible. In other words, the future of a system which properly includes the observer is not predictable, even probabilistically. The observer is not simulatable. In other words, the observer is an oracle, or hypercomputer, in his own universe. Since the AGI in the box is not a hypercomputer but merely a Turing-complete machine, it cannot simulate me or predict me (from my point of view). So there is no need to be afraid.
Another neat example of anthropic superpowers, thanks. Reminded me of this: “I don’t know, Timmy, being God is a big responsibility.”