What keeps the AI from immediately changing itself to care only about the people’s current utility function? That’s a change with very high expected utility as measured by their current utility function, and one with little tendency to alter that function.
Will you believe that a simple hack will work with lower confidence next time?
Slightly. I was counting on this one getting bashed into shape by the comments; it wasn’t, so in future I’ll try and do more of the bashing myself.