The ability to destroy parity between decisions made by the real agent and simulations of the agent lets the agent win games against simulator opponents.
Different types of incoherence between real vs simulated choices grant different levels of power.
You’re playing a game against a simulator, Allen the Alien. You and Allen each secretly choose one of 10 paths. If you pick the same path Allen wins; otherwise you win. Without simulation Allen has a 1⁄10 chance of winning while you have a 9⁄10 chance. If Allen simulates you accurately, he wins with a 1⁄1 chance.
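A toy Monte Carlo model makes the two regimes concrete. This is a sketch under the simplest assumptions (uniform random play without simulation; a perfect copy of the real choice with it); the function names are my own:

```python
import random

N_PATHS = 10

def allen_win_rate_no_simulation(trials=100_000, seed=0):
    """Both players choose uniformly at random; Allen wins on a match."""
    rng = random.Random(seed)
    wins = sum(rng.randrange(N_PATHS) == rng.randrange(N_PATHS)
               for _ in range(trials))
    return wins / trials  # ~1/10

def allen_win_rate_accurate_simulation(trials=100_000, seed=0):
    """Allen's simulation reproduces the real choice exactly, so he always matches."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        real_choice = rng.randrange(N_PATHS)
        allen_choice = real_choice  # perfect simulation of the agent
        wins += (allen_choice == real_choice)
    return wins / trials  # 1.0
```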
If you can reliably notice that you’re being simulated and completely hide that detection from the simulator, then a simple rule like “all simulations pick path 10 while the real you picks path 1” is a 1⁄1 winning strategy. Achieving this requires some pre-chosen general rule that breaks parity based on whether or not you’re in a simulation.
You might not be able to tell whether you’re in a simulation, yet still have the ability to break parity between simulations of you and reality. Randomizing your path choice, so that the simulation picks each path with 1⁄10 probability and the real choice is uncorrelated with the simulated one, doesn’t give you an automatic win, but it completely destroys the simulator’s advantage.
Sometimes creating incoherence between different simulations is more powerful than just creating incoherence between all simulations and reality.
Let’s say we change the above game so Allen can pick 9 of the 10 paths and you only win if you choose the one path Allen leaves open. In this case a non-simulating Allen has a 9⁄10 chance to win and there’s really nothing you can do about it.
If Allen is a simulator (and believes his simulation of you is accurate), then creating decoherence between reality and simulations of you improves your chances from 1⁄10 to 1⁄9. Simulations of you always pick path 1, so Allen is sure to block that path and distributes his other 8 picks among the remaining 9 paths. Real you, knowing path 1 is blocked, picks uniformly among those 9 paths, which leaves you a 1⁄9 chance of winning.
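A sketch of this version of the game, assuming Allen distributes his 8 remaining blocks uniformly at random and real you avoids the path the simulations committed to:

```python
import random

N_PATHS = 10

def your_win_rate_fixed_sim(trials=200_000, seed=2):
    """Simulations always report path 0, so Allen blocks it plus 8 of the
    remaining 9 paths at random. Real you picks uniformly among the 9
    paths the simulations avoided."""
    rng = random.Random(seed)
    wins = 0
    other_paths = list(range(1, N_PATHS))
    for _ in range(trials):
        blocked = {0} | set(rng.sample(other_paths, 8))
        real_choice = rng.choice(other_paths)
        wins += (real_choice not in blocked)
    return wins / trials  # ~1/9
```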
If you can not only create decoherence between simulations of you and reality but also create random decoherence between individual simulations, then you can improve your odds further. If Allen is a (very) naive simulator, then having each simulation pick a path from 1-9 with equal likelihood while the real you picks path 10 is a guaranteed victory: Allen sees his simulations spread across paths 1-9, blocks all nine of them, and leaves path 10 open.
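A minimal model of the naive simulator, assuming Allen blocks every path his simulations reported and pads with random paths if they covered fewer than nine (all names here are illustrative):

```python
import random

N_PATHS = 10

def your_win_rate_vs_naive(trials=5_000, n_sims=200, seed=3):
    """Each simulation picks uniformly among paths 0-8; real you picks path 9.
    A naive Allen blocks every path his simulations reported, padding with
    random extra paths if the simulations covered fewer than 9."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        blocked = {rng.randrange(N_PATHS - 1) for _ in range(n_sims)}
        spare = [p for p in range(N_PATHS) if p not in blocked]
        while len(blocked) < N_PATHS - 1:
            blocked.add(spare.pop(rng.randrange(len(spare))))
        wins += (N_PATHS - 1 not in blocked)
    return wins / trials  # essentially 1: path 9 is almost never blocked
```

With 200 simulations over 9 paths it is overwhelmingly likely that all nine get reported, so the naive Allen blocks exactly paths 0-8 and real you walks through path 9.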
Of course the above strategy will not work against any reflective simulator. You will have to use steganography to conceal your simulation-defeating ability within the results of the simulations as well. This may involve using a different probability distribution over chosen paths, or having simulations select only from a subset of the paths reality will not choose. These techniques yield somewhere between a better-than-1⁄9 chance at worst and a 1⁄1 chance at best.
Specifically, you want to notice that you are in a counterfactual, pretend that you don’t notice, and bluff: act in a way that bends the decisions of your opponent to your will. That means steganographic, hard-to-notice decision making that covertly controls your apparent decisions and doesn’t get rounded down to “not happening” inside a simulation.
At the same time, you don’t want this machinery to trigger when the counterfactual is being considered by you, or by an ally. So there should be authentication protocols between simulation controllers and agents in simulated counterfactuals that let them know when to reveal actual decisions. Something something homomorphic encryption something, so that you know secrets that can be communicated to the simulations you are running within your own cognition but can’t be extracted from your algorithm?