What does it even mean to say the “universe is in a state that is worth utility uFA but really leads to a state worth utility uTA”? Utility functions, however worthless they really are, only make sense relative to some agent’s opinions.
Do you mean that the agent’s utility function doesn’t follow his programmer’s utility function; that the agent’s utility function is inconsistent; that the agent’s utility function is OK but his analysis of the world is inconsistent, so he gets confused; that we figured out the One True Utility Function but decided not to program it into the agent; or what?
The agent will falsely believe the universe is in one state, with a certain utility, but in reality the universe is in a different state, with a different utility.
I have reworded that sentence to hopefully make this clearer.
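If it helps, here is a toy sketch of that reading. It is only an illustration under my own assumptions: the state names and numbers are invented, and I am taking uFA to be the utility of the state the agent believes in and uTA the utility of the state the universe is really in, as in the quoted sentence. The point is that there is a single utility function throughout; only the agent's belief about the state is wrong.

```python
# Toy illustration: one utility function, two different states.
# The states and their utilities are made-up values for this example.

def utility(state: str) -> float:
    """The agent's (single, consistent) utility function over world states."""
    table = {"state_A": 10.0, "state_B": -5.0}
    return table[state]

believed_state = "state_A"  # what the agent falsely believes
actual_state = "state_B"    # what the universe is really in

u_FA = utility(believed_state)  # utility the agent thinks it has
u_TA = utility(actual_state)    # utility it actually has

print(f"believed utility uFA = {u_FA}, true utility uTA = {u_TA}")
```

Nothing here requires the utility function to be inconsistent or to differ from the programmer's; the gap between uFA and uTA comes entirely from the mistaken belief about the state.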