My problem is with situations where we are supposedly interacting with a “distant superintelligence”. How do you know it exists, how do you know what it wants, how do you justify allowing it to affect your decisions, how do you justify allowing one particular hypothetical “distant” entity to affect you rather than any of the googolplex other possible entities with their own different agendas?
Newcomb’s paradox, by contrast, doesn’t require so many leaps of faith. All that’s required is believing that Omega can predict you accurately, something which can be a reasonable belief under certain circumstances (e.g. if you are a program and Omega has a copy of you).
I am interested in the idea that one-boxing can actually be obtained via causal decision theory after all, but that is independent of my criticisms of these more ambitious acausal scenarios.
How would you know that you were a program and Omega had a copy of you? If you knew that, how would you know that you weren’t that copy?
In this case, how things are known is kind of a technical detail. The main point is that programs can be copied, and you can use the copy to predict the behavior of the original, and this is therefore a situation in which “acausal” reasoning can make sense.
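The copy-based prediction setup can be made concrete with a toy sketch (Python; all names here are hypothetical, chosen for illustration). Omega's "prediction" is literally a run of a copy of the agent's decision procedure, so whatever the copy outputs, the original necessarily outputs too, and a one-boxing program ends up richer than a two-boxing one:

```python
# Toy Newcomb setup: Omega predicts the agent by running a copy
# of the agent's own decision procedure.

def one_boxer():
    return "one-box"

def two_boxer():
    return "two-box"

def play_newcomb(agent):
    # Omega's "prediction" is a run of a copy of the agent.
    prediction = agent()
    opaque_box = 1_000_000 if prediction == "one-box" else 0
    transparent_box = 1_000
    # The original now chooses; being the same program on the same
    # inputs, it necessarily makes the choice Omega predicted.
    choice = agent()
    if choice == "one-box":
        return opaque_box
    return opaque_box + transparent_box

print(play_newcomb(one_boxer))  # 1000000
print(play_newcomb(two_boxer))  # 1000
```

The sketch assumes a deterministic agent with no inputs beyond its own code, which is exactly the circumstance in which "Omega has a copy of you" makes accurate prediction trivially possible.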
The “technical detail” pertains to what it even means for a program to believe or know something. There will be some minimal capacity to represent facts that is required for the program to reason its way to one-boxing. The greater the demands we make of its cognitive process (e.g. by reducing the number of facts about its situation that we allow it to just assume in advance), the greater the cognitive complexity it must possess.
Incidentally, when I said “if you are a program...”, I didn’t mean a being who feels human and is having humanlike experiences, but is actually a program. I just meant a computer program that represents facts and makes decisions.
Anyway—do you have any answers for my questions? For example, how do you figure out what the distant superintelligence wants you to do?
If a human being doesn’t automatically qualify as a program to you, then we are having a much deeper disagreement than I anticipated. I doubt we can go any further until we reach agreement on whether all human beings are programs.
Anyway, here is my attempt to answer the question you just restated:
The idea is that you would figure out what the distant superintelligence wanted you to do the same way you would figure out what another human being who wasn’t being verbally straight with you wanted you to do: by picking up on its hints.
Of course, this prototypically goes disastrously; hence the vast cross-cultural literature warning against bargaining with demons, and the near-total absence of stories depicting it going well. So you should not actually do it.
Does this mean, assume you’re in a simulation, and look for messages from the simulators?
Because that seems to be a scenario different from acausal blackmail. In acausal blackmail, the recipient of the blackmail is supposed to start out thinking they are here on Earth. They then hypothesize a distant superintelligence which is simulating an exact duplicate of themselves as a hostage, decide that they can’t know whether they are the original on Earth or the duplicate, and carry out certain actions just in case they are the hostage (or for the sake of the hostage, since the hostage presumably carries out the same actions as the Earth original).
Now, the original on Earth absolutely cannot receive messages or “hints” from the distant superintelligence. They are causally isolated from it. Yet the simulated hostage is supposed to be identical. That means the hostage can’t be receiving hints either.
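The tension can be illustrated with a minimal sketch (Python; purely illustrative, every name hypothetical). A deterministic decision procedure given identical inputs behaves identically, so a “hint” delivered to only one copy makes the two runs diverge, contradicting the premise that the hostage is an exact duplicate:

```python
def agent(observations):
    # A deterministic decision procedure: output depends only on inputs.
    if "hint: pay up" in observations:
        return "comply"
    return "ignore"

shared_world = ["sky is blue", "grass is green"]

original = agent(shared_world)   # the Earth original
hostage = agent(shared_world)    # exact duplicate, identical inputs
assert original == hostage       # same program + same inputs -> same behavior

# If the simulator slips a hint to the hostage only, the runs diverge,
# so the hostage is no longer an exact duplicate of the original:
hinted = agent(shared_world + ["hint: pay up"])
assert hinted != original
```

Either both copies see the hint (impossible for the causally isolated original) or only one does (breaking the identity the scenario relies on).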
On the other hand, if you are receiving hints (and not just imagining them), then you are definitely in a simulation, and your situation is simply that you are a simulated being at the mercy of simulators. There’s no ambiguity about your status, and acausal decision theory is not relevant.
Is Bostrom’s original Simulation Hypothesis, the version involving ancestor-simulations, unconvincing to you? If you have decided to implement an epistemic exclusion in yourself with respect to the question of whether we are in a simulation, it is not my business to interfere with that. But we do, for predictive purposes, have to think about the fact that Bostrom’s Simulation Hypothesis and other arguments in that vein will probably not be entirely unconvincing [by default] to any ASIs we build, given that they are not entirely unconvincing to the majority of the intelligent human population.
I am not in any way excluding the possibility of being in a simulation. I am only saying that one particular scenario that involves simulation does not make sense to me. I am asking for some way in which “acausal blackmail by a distant superintelligence” can make sense—can be rational as a belief or an action.
As I see it, by definition of the scenario, the “blackmailer” cannot communicate with the simulated entity. But then the simulated entity—to say nothing of the original, who is supposed to be the ultimate target of the blackmail—has no way of knowing what the blackmailer wants.