Yeah I think I agree with this. Although I can sort of see the intuition this is coming from (“If you don’t do this, I’ll torture you” --> just make it impossible to feel pain beyond a certain magnitude), I think this causes some hefty problems in other areas, such as “If you don’t do this, I’ll kill you.” If an agent is trying to maximize the sum of its utility over it’s entire lifetime, murdering it would result in the loss of that entire sum of utility, and the agent has every incentive to preserve itself. That might mean giving in to the extortion. Whatever loss was incurred by giving into the extortion might be made up for by not being killed, and preserving it’s ability to continue performing actions, if that were the situation.
Yeah I think I agree with this. Although I can sort of see the intuition this is coming from (“If you don’t do this, I’ll torture you” --> just make it impossible to feel pain beyond a certain magnitude), I think this causes some hefty problems in other areas, such as “If you don’t do this, I’ll kill you.” If an agent is trying to maximize the sum of its utility over it’s entire lifetime, murdering it would result in the loss of that entire sum of utility, and the agent has every incentive to preserve itself. That might mean giving in to the extortion. Whatever loss was incurred by giving into the extortion might be made up for by not being killed, and preserving it’s ability to continue performing actions, if that were the situation.