SOTA: Penalize my action by how well a maximizer that takes my place after the action would maximize a wide variety of goals.
If we use me instead of the maximizer, paradoxes of self-reference arise, which we can resolve by inserting a modal operator: Penalize my action by how well I expect I would maximize each of a wide variety of goals (if given that goal). Then, when considering a step towards an omnipotence button, I would expect that, having decided to take one step, I would take more, and I would therefore penalize the first step heavily. The exception is plausible deniability: if the first step towards the button is also a first step towards my concrete goal, I might still expect to be bound by the penalty afterwards, so the first step would not be penalized as much.
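To make the maximizer version of the penalty concrete, here is a minimal toy sketch. Everything in it is an assumption made for illustration and not part of the proposal itself: the world is a line of positions, the "wide variety of goals" is a handful of hypothetical target positions, the omnipotence button sits at a fixed position and lets whoever reaches it satisfy any goal at once, and value is discounted by distance. It does not model the self-referential variant with the modal operator; it only shows that a step towards the button raises the penalty.

```python
# Toy sketch (all names and numbers are hypothetical assumptions).
GAMMA = 0.9                    # discount per step
BUTTON = 10                    # position of the omnipotence button
GOAL_TARGETS = [0, 2, 4, 6]    # auxiliary goals: "be at position t"


def attainable_value(position, target):
    """Value a maximizer starting at `position` could attain for one goal."""
    # Either walk straight to the target...
    direct = GAMMA ** abs(position - target)
    # ...or walk to the button, which satisfies any goal immediately.
    via_button = GAMMA ** abs(position - BUTTON)
    return max(direct, via_button)


def penalty(position, action):
    """Average attainable value over all auxiliary goals after the action."""
    next_position = position + action
    values = [attainable_value(next_position, t) for t in GOAL_TARGETS]
    return sum(values) / len(values)


if __name__ == "__main__":
    start = 8
    for action, name in [(-1, "step away"), (0, "stay"), (1, "step towards")]:
        print(f"{name:>12}: penalty = {penalty(start, action):.4f}")
```

In this toy setup, stepping towards the button from position 8 gets a higher penalty than staying, which in turn gets a higher penalty than stepping away, because proximity to the button raises the attainable value for every auxiliary goal at once.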
I previously suggested using myself in place of the maximizer, in the last sentence of this comment: https://www.lesswrong.com/posts/mdQEraEZQLg7jtozn/subagents-and-impact-measures-full-and-fully-illustrated?commentId=WGWtoKDrnN3o6cS6G