Proposal 4: same as proposal 3 but each agent also obeys commitments that they would have made from behind a veil of ignorance where they didn’t yet know who they were or what their values were. From that position, they wouldn’t have wanted to do future destructive commitment races.
I don’t think this solves Commitment Races in general, because of two different considerations:
Trivially, I can say that you still have the problem when everyone needs to bootstrap a Schelling veil of ignorance.
Less trivially, even behind the most simple/Schelling veils of ignorance, I find it likely that hawkish commitments are incentivized. For example, the veil might say that you might be Powerful agent A, or Weak agent B, and if some Powerful agents have weird enough utilities (and this seems likely in a big pool of agents), hawkishly committing in case you are A will be a net-positive bet.
This might still mostly solve Commitment Races in our particular multi-verse. I have intuitions both for and against this bootstrapping being possible. I’d be interested to hear yours.
Trivially, I can say that you still have the problem when everyone needs to bootstrap a Schelling veil of ignorance.
I don’t understand your point here, explain?
even behind the most simple/Schelling veils of ignorance, I find it likely that hawkish commitments are incentivized. For example, the veil might say that you might be Powerful agent A, or Weak agent B, and if some Powerful agents have weird enough utilities (and this seems likely in a big pool of agents), hawkishly committing in case you are A will be a net-positive bet.
This seems to be claiming that in some multiverses, the gains to powerful agents from being hawkish outweigh the losses to weak agents. But then why is this a problem? It just seems like the optimal outcome.
Say there are 5 different veils of ignorance (priors) that most minds consider Schelling (you could try to argue there will be exactly one, but I don’t see why).
If everyone simply accepted exactly the same one, then yes, lots of nice things would happen and you wouldn’t get catastrophically inefficient conflict.
But every one of these 5 priors will have different outcomes when it is implemented by everyone. For example, maybe in prior 3 agent A is slightly better off and agent B is slightly worse off.
So you need to give me a reason why a commitment race doesn’t recur in the level of “choosing which of the 5 priors everyone should implement”. That is, maybe A will make a very early commitment to only every implement prior 3. As always, this is rational if A thinks the others will react a certain way (give in to the threat and implement 3). And I don’t have a reason to expect agents not to have such priors (although I agree they are slightly less likely than more common-sensical priors).
That is, as always, the commitment races problem doesn’t have a general solution on paper. You need to get into the details of our multi-verse and our agents to argue that they won’t have these crazy priors and will coordinate well.
This seems to be claiming that in some multiverses, the gains to powerful agents from being hawkish outweigh the losses to weak agents. But then why is this a problem? It just seems like the optimal outcome.
It seems likely that in our universe there are some agents with arbitrarily high gains-from-being-hawkish, that don’t have correspondingly arbitrarily low measure. (This is related to Pascalian reasoning, see Daniel’s sequence.) For example, someone whose utility is exponential on number of paperclips. I don’t agree that the optimal outcome (according to my ethics) is for me (who’s utility is at most linear on happy people) to turn all my resources into paperclips. Maybe if I was a preference utilitarian biting enough bullets, this would be the case. But I just want happy people.
Nice!
I don’t think this solves Commitment Races in general, because of two different considerations:
Trivially, I can say that you still have the problem when everyone needs to bootstrap a Schelling veil of ignorance.
Less trivially, even behind the most simple/Schelling veils of ignorance, I find it likely that hawkish commitments are incentivized. For example, the veil might say that you might be Powerful agent A, or Weak agent B, and if some Powerful agents have weird enough utilities (and this seems likely in a big pool of agents), hawkishly committing in case you are A will be a net-positive bet.
This might still mostly solve Commitment Races in our particular multi-verse. I have intuitions both for and against this bootstrapping being possible. I’d be interested to hear yours.
I don’t understand your point here, explain?
This seems to be claiming that in some multiverses, the gains to powerful agents from being hawkish outweigh the losses to weak agents. But then why is this a problem? It just seems like the optimal outcome.
Say there are 5 different veils of ignorance (priors) that most minds consider Schelling (you could try to argue there will be exactly one, but I don’t see why).
If everyone simply accepted exactly the same one, then yes, lots of nice things would happen and you wouldn’t get catastrophically inefficient conflict.
But every one of these 5 priors will have different outcomes when it is implemented by everyone. For example, maybe in prior 3 agent A is slightly better off and agent B is slightly worse off.
So you need to give me a reason why a commitment race doesn’t recur in the level of “choosing which of the 5 priors everyone should implement”. That is, maybe A will make a very early commitment to only every implement prior 3. As always, this is rational if A thinks the others will react a certain way (give in to the threat and implement 3). And I don’t have a reason to expect agents not to have such priors (although I agree they are slightly less likely than more common-sensical priors).
That is, as always, the commitment races problem doesn’t have a general solution on paper. You need to get into the details of our multi-verse and our agents to argue that they won’t have these crazy priors and will coordinate well.
It seems likely that in our universe there are some agents with arbitrarily high gains-from-being-hawkish, that don’t have correspondingly arbitrarily low measure. (This is related to Pascalian reasoning, see Daniel’s sequence.) For example, someone whose utility is exponential on number of paperclips. I don’t agree that the optimal outcome (according to my ethics) is for me (who’s utility is at most linear on happy people) to turn all my resources into paperclips.
Maybe if I was a preference utilitarian biting enough bullets, this would be the case. But I just want happy people.