There are many such operators, and different ones give different answers when presented with the same agent. Only a human utility function distinguishes the right way of interpreting a human mind as having a utility function from all of the wrong ways of doing so. So you need to get a bunch of Friendliness Theory right before you can bootstrap.
Why do you think there are many such operators? Do you believe the concept of “utility function of an agent” is ill-defined (assuming the “agent” is actually an intelligent agent rather than e.g. a rock)? Do you think it is possible to interpret a paperclip maximizer as having a utility function other than maximizing paperclips?
Deducing the correct utility function of a utility maximiser is one thing (a task with a low level of uncertainty, higher if the agent is hiding stuff). Assigning a utility function to an agent that doesn't have one is quite another.
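A minimal sketch of the underdetermination point, in the spirit of the blue-minimizing robot post linked below. The toy world, the observed policy, and the candidate utility functions are all hypothetical choices made for illustration, not anything proposed in the thread; the point is only that several distinct utility functions rationalize exactly the same observed behaviour, so an "interpretation operator" has to break the tie by some further criterion.

```python
# Hypothetical toy setting: a robot sees an object's colour and either zaps it
# or ignores it. Its observed behaviour: zap blue, ignore red.
states = ["blue", "red"]
actions = ["zap", "ignore"]
observed_policy = {"blue": "zap", "red": "ignore"}

# Three candidate utility functions over (state, action) pairs.
# U1: "minimize blue things"              -- zapping blue is good, all else neutral.
# U2: "enjoys firing the laser at blue"   -- rewards the act itself, more strongly.
# U3: constant utility                    -- indifferent to everything.
candidate_utilities = {
    "U1 (minimize blue)":      lambda s, a: 1.0 if (s == "blue" and a == "zap") else 0.0,
    "U2 (likes zapping blue)": lambda s, a: 5.0 if (s == "blue" and a == "zap") else 0.0,
    "U3 (constant)":           lambda s, a: 0.0,
}

def rationalizes(utility, policy):
    """True if the observed action in every state is (weakly) utility-maximizing."""
    return all(
        utility(s, policy[s]) >= max(utility(s, a) for a in actions)
        for s in states
    )

for name, u in candidate_utilities.items():
    print(name, "rationalizes the observed behaviour:", rationalizes(u, observed_policy))
# All three print True: behaviour alone does not single out "the" utility function.
```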
See http://lesswrong.com/lw/6ha/the_blueminimizing_robot/ Key quote:
Replied in the other thread.