I have used that term for this, but it’s not very precise: the Wikipedia entry has the monster absorbing positive utility rather than threatening negative utility, and there’s no mention of self-modification.
The self-modification isn’t in itself the issue though, is it? It seems to me that just about any sort of agent would be willing to self-modify into a utility monster if it expected that strategy to be more likely to achieve its goals. The pleasure/pain distinction simply adds a constant (negative) offset to all utilities, which is meaningless since utility functions are generally assumed to be invariant under positive affine transformations.
I don’t even think it’s a subset of utility monster; it’s just a straight-up “agent deciding to become a utility monster because that furthers its goals”.
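
To make the affine-invariance point concrete, here is a minimal sketch. The toy decision problem, the function names, and the numbers are all made up for illustration. It assumes a plain expected-utility maximizer, and shows that shifting every utility by a negative constant (framing everything as "pain"), or rescaling by a positive factor, leaves the chosen action unchanged.

```python
def expected_utility(lottery, utility):
    """Expected utility of a lottery given as (probability, outcome) pairs."""
    return sum(p * utility(outcome) for p, outcome in lottery)


def best_action(actions, utility):
    """Return the name of the action whose lottery maximizes expected utility."""
    return max(actions, key=lambda name: expected_utility(actions[name], utility))


# Hypothetical toy decision problem: each action is a lottery over outcomes.
# Probabilities in each lottery sum to 1, which is what makes a constant
# offset cancel out of every comparison.
actions = {
    "safe": [(1.0, 5.0)],
    "gamble": [(0.5, 0.0), (0.5, 12.0)],
}


def pleasure(x):
    # Original scale: outcomes valued as positive utility.
    return x


def pain(x):
    # Same preferences with a constant negative offset: everything is "pain".
    return x - 100.0


def rescaled(x):
    # General positive affine transformation a*u + b with a > 0.
    return 3.0 * x - 100.0


choices = [best_action(actions, u) for u in (pleasure, pain, rescaled)]
print(choices)
```

All three utility functions pick the same action, since E[a·u + b] = a·E[u] + b preserves the ordering of expected utilities whenever a > 0. A negative scale factor would reverse the ordering, which is why the invariance is usually stated for positive affine transformations only.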