it has no incentive to change your long term utility function
By practical difference I meant that it wants to erase the impact of your goals on the universe. Whether it does that by changing your goals or not depends on implementation details.
Consider the perverse case of someone who wants to die—their utility function ranks futures of the universe lower if they’re in it, and higher if they’re not. You can’t maximize this person’s empowerment if they’re dead, so either you should convince them life is worth living, or you should just prevent them from affecting the universe.
By practical difference I meant that it wants to erase the impact of your goals on the universe.
No, it does not in general. The Franzmeyer et al. prototype does not do that, and there is no reason to suspect this becomes a universal problem as you scale these systems up.
Once again:
Optimizing for your long term empowerment is (for most agents) equivalent to optimizing for your ability to achieve your true long term goals/values, whatever they are (see the sketch after this list).
An agent truly seeking your empowerment is seeking to give you power over itself as well, which precludes any effect of “erasing the impact of your goals”.
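To make this concrete, here is a minimal sketch (Python) of the kind of choice/empowerment objective the Franzmeyer et al. line of work points at: the assistant is rewarded for keeping the human's set of reachable futures large, so helping you pursue whatever goals you actually have is what it ends up optimizing. This is not the authors' code; the environment hooks (`env_step`, `human_step`), the reachable-state count as an empowerment proxy, and all names are illustrative assumptions.

```python
# Hedged sketch, not the Franzmeyer et al. implementation: empowerment is
# approximated here by the number of distinct states the human can reach
# within a short horizon, and the assistant is rewarded for keeping that
# number large after its own action.

def reachable_states(state, human_actions, step_fn, horizon):
    """Count distinct states the human can reach within `horizon` steps.

    `step_fn(state, action) -> next_state` is an assumed deterministic
    model of the environment from the human's point of view.
    """
    frontier = {state}
    for _ in range(horizon):
        frontier = {step_fn(s, a) for s in frontier for a in human_actions}
    return len(frontier)


def assistant_reward(state, assistant_action, env_step,
                     human_actions, human_step, horizon=3):
    """Reward the assistant by the human's post-action choice/empowerment.

    The assistant acts first (`env_step`), then we measure how many
    futures remain open to the human. All hooks here are hypothetical.
    """
    next_state = env_step(state, assistant_action)
    return reachable_states(next_state, human_actions, human_step, horizon)


if __name__ == "__main__":
    # Toy 1-D world: the state is an integer position clamped to [0, 10];
    # the human can move left, stay, or move right each step.
    def step(s, a):
        return max(0, min(10, s + a))

    acts = (-1, 0, +1)
    print(reachable_states(5, acts, step, horizon=3))  # 7 (positions 2..8)
    print(reachable_states(0, acts, step, horizon=3))  # 4 (positions 0..3)
```

Note that the reward here is purely a function of how many futures remain open to the human; nothing in the objective rewards altering the human's goals. In a real system a learned estimator would presumably replace the exhaustive reachability count, but the shape of the objective is the same.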
Consider the perverse case of someone who wants to die -
Sure, and humans usually try to prevent humans from wanting to die.
A short comment on the last point: euthanasia is legal in several countries (so wanting to die is not prevented there, and is even socially accepted), and in my opinion it is the moral choice of action in certain situations.