a super-coordination story with a critical flaw
part 1. supercoordination story
- select someone you want to coordinate with, without any risk of defection.
- share this idea with them. it only works if they also have the chance to condition their actions on it.
- a general note that may make reading easier: the protocol is fully symmetric.
- after the acute risk period, in futures where it’s possible: run a simulation of the other person (and you).
- the simulation will start from the current situation, and can terminate once actions are no longer long-term relevant. it will have almost exactly the same starting state as reality and will develop over time in the same way.
- there will be one change to the version of you in the simulation: some of your qualities will be replaced with those of your fellow supercoordinator. these qualities will be any that could motivate a CDT agent to defect, such as (1) a differing utility function and possibly (2) differing meta-beliefs about whose beliefs are more likely to be correct under disagreement.
- this change is meant to appear 'isolated'. in other words, the rest of the world will still seem coherent from the view of the simulated version of you. strictly, such a change cannot be isolated, because those qualities would have causally upstream origins; however, the belief state of the simulated version of you will be modified so that the simulated world seems coherent to them even though it is not.
- given this, you’re unsure which world you’re acting in.
- if you're the simulacrum and you defect, this logically corresponds to the 'real' version of you defecting.
- recall that the real version of you has the utility function and/or action-relevant beliefs of your (in-simulation) supercoordinator.
- because of that, by defecting, it is 50/50 which 'qualities' (mentioned above) the effects of your actions fall under: yours or theirs. therefore, the causal EV of defection is always at most 0.
- often it will be less than 0, because the average of the two of you expects positive EV from collaboration, and defecting forfeits that in both possible worlds (negative EV). (a numeric sketch of this argument follows the list.)
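to make the EV claim concrete, here is a minimal sketch. the payoff numbers (GAIN, LOSS, COOP) are illustrative assumptions, not part of the protocol; defection is modeled as a simple value transfer between the two utility functions.

```python
# hypothetical payoffs (illustrative, not from the post): defection
# transfers value: the defector's utility function gains GAIN and the
# other party's loses LOSS; cooperation gives each utility function COOP.
GAIN, LOSS, COOP = 2.0, 2.0, 1.0

# with probability 0.5 you are the real agent, so defecting pays out
# under your own utility function (+GAIN to your qualities).
# with probability 0.5 you are the simulacrum: your defection logically
# corresponds to the real you defecting, and the real you carries your
# coordinator's qualities, so the gain lands under *their* utility
# function while yours takes the loss (-LOSS to your qualities).
ev_defect = 0.5 * GAIN + 0.5 * (-LOSS)  # = 0.0, i.e. at most zero
ev_cooperate = COOP                     # pays under your qualities in both worlds

print(f"EV(defect) = {ev_defect}, EV(cooperate) = {ev_cooperate}")
# EV(defect) = 0.0, EV(cooperate) = 1.0
```

with a symmetric transfer the two worlds cancel exactly, which is why defection's causal EV is capped at 0; any forgone cooperation surplus then pushes it below 0.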
part 2. the critical flaw
one can exploit this policy by engaging in the following reasoning. (it might be fun to see if you notice it before reading on :p)
1. some logical probabilities, specifically those about what a similar agent would do, depend on my actions. i.e., i expect a copy of me to act as i do.
2. i can defect and then not simulate them.
3. this logically implies that i would not be simulated.
4. therefore i can do this and narrow the space of logically-possible realities down to those where i am not in this sort of simulation. (the sketch below makes this concrete.)
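here is a minimal sketch of why this breaks the scheme, reusing the hypothetical payoffs from the earlier sketch. the key move is that P(i am the simulacrum) is not a fixed 0.5: it is a logical probability that depends on my own policy, since the simulation in question is one i would run.

```python
# same hypothetical payoffs as the earlier sketch.
GAIN, LOSS = 2.0, 2.0

def ev_defect(p_simulated: float) -> float:
    """causal EV of defection, given one's credence of being the simulacrum."""
    return (1 - p_simulated) * GAIN + p_simulated * (-LOSS)

# an honest supercoordinator commits to running the simulation, so
# conditional on defecting it is 50/50 which world this is:
print(ev_defect(0.5))  # 0.0: defection at most breaks even

# the exploiter adopts the policy "defect and never run the simulation".
# per steps 1-3, that policy logically implies no such simulation of
# them exists, so their credence of being the simulacrum drops to ~0:
print(ev_defect(0.0))  # 2.0: defection now strictly pays, and the guarantee fails
```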
when i first wrote this, i was hoping to write a 'part 3. how to avoid the flaw', but i've updated towards that being impossible.