It is clearly a difficult problem to establish correctly. I think, though, it could be side-stepped if you can find better-defined scenarios that dominate “AI torturing people in simulations”. The way to go depends on whether you care less about the future than the present or not. For example, in the first case, there are people being tortured right now by other real people, so you might want to concentrate on that.