I think the scenario of an AI torturing humans in the future is very, very unlikely. For most possible goals an AI could have, it will have ways to accomplish them that are more effective than torturing humans.
The chance of an AI torturing humans as a means to some other goal does seem low, but what about the AI torturing humans as an end in itself? I think CEV could result in this with non-negligible probability (>0.000001). I wouldn't be surprised if the typical LessWrong poster has a very different morality from the majority of the population, so our intuitions about the results of CEV could be very wrong.
Note that it does not suffice for us to have a different conscious morality or different verbal statements of values. That only matters if the difference remains under extrapolation, e.g., in what others would want if they knew there weren't any deities.