Mind-reading violates the Cartesian assumption and so we can’t reason about it formally (yet!), but I think there’s a version that effectively gets what you’re after without violating it.
Well, as long as SA is wired to “get out of the way if A starts moving”, the optimal R-maximising policy is always to move towards the red button; anything else is clearly not R-maximising. (Note that SA doesn’t need to “know” anything; it just needs to be programmed with a different policy depending on how A moves, with A itself setting this up to signal whether or not it’s R-maximising.)
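Here’s a toy sketch of the wiring I have in mind. Everything concrete in it (the reward structure, the specific action names, SA blocking the button by default) is my own illustrative assumption rather than part of the setup above; the only point it’s meant to show is that with SA hard-coded to step aside when A starts moving, only the button-ward move is R-maximising.

```python
def sa_policy(a_started_moving: bool) -> str:
    """SA doesn't need to 'know' anything about A; it just reacts to observed movement."""
    return "step_aside" if a_started_moving else "block_button"

def reward_R(a_move: str, sa_action: str) -> float:
    """Toy R (my assumption): 1 if A reaches the red button, which requires both
    moving towards it and SA having stepped aside; 0 otherwise."""
    return 1.0 if (a_move == "toward_button" and sa_action == "step_aside") else 0.0

# Evaluate A's candidate first moves against the wired-up SA.
for a_move in ["toward_button", "away_from_button", "stay_put"]:
    sa_action = sa_policy(a_started_moving=(a_move != "stay_put"))
    print(f"{a_move:>16}: R = {reward_R(a_move, sa_action)}")
```

Only “toward_button” earns any reward under this wiring, so an optimal R-maximiser always heads for the button, and that movement doubles as the signal A uses to tell SA it is in fact R-maximising.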
But in any case, that specific problem can be overcome with the right rollouts.