Maybe the difference is between “respond” and “treat”.
It seems like you are saying that the caravan commits to “[responding to demand unarmed the same as it responds to demand armed]” but doesn’t commit to “[and I will choose my probability of resisting based on the equilibrium value in the world where you always demand armed if you demand at all]”. In contrast, I include that as part of the commitment.
When I say “[treating demand unarmed the same as it treats demand armed]” I mean “you first determine the correct policy to follow if there were only demand armed, then you follow that policy even if the bandits demand unarmed”; you cannot then later “instruct it to always resist” under this formulation (otherwise you have changed the policy and are not using SPI-as-I’m-thinking-of-it).
I think the bandits should only accept commitments of the second type as a reason to demand unarmed, and not accept commitments of the first type, precisely because with commitments of the first type the bandits will be worse off if they demand unarmed than if they had demanded armed. That’s why I’m primarily thinking of the second type.
Perfect, that is indeed the difference. I agree with all of what you write here.
In this light, the reason for my objection is that I understand how we can make a commitment of the first type, but I have no clue how to make a commitment of the second type. (In our specific example, once demand unarmed is an option—once SPI is in use—the counterfactual world where there is only demand armed just seems so different. Wouldn’t history need to go very differently? Perhaps it wouldn’t even be clear what “you” is in that world?)
But I agree that with SDA-AGIs, the second type of commitment becomes more realistic. (Although, the potential line of thinking mentioned by Caspar applies here: Perhaps those AGIs will come up with SPI-or-something on their own, so there is less value in thinking about this type of SPI now.)
Yeah, I agree with all of that.