Obviously, the only reflectively consistent answer in this case is “Yes—here’s the $1000”, because if you’re an agent who expects to encounter many problems like this in the future, you will self-modify to be the sort of agent who answers “Yes” to this sort of question—just like with Newcomb’s Problem or Parfit’s Hitchhiker. But I don’t have a general theory which replies “Yes”.
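To make the expected-value reasoning behind “self-modify to answer Yes” concrete, here is a minimal Python sketch. It assumes a counterfactual-mugging-style payoff structure (a fair coin; on tails you hand over the $1000, on heads a reliable predictor pays a large prize only if you would have paid on tails); the prize amount and the two policies compared are illustrative assumptions, not part of the original problem statement.

```python
import random

# Illustrative payoffs, assumed for this sketch; only the $1000 figure
# comes from the text above.
COST = 1_000        # handed over on tails if your policy answers "Yes"
PRIZE = 1_000_000   # paid on heads iff the predictor foresees you paying

def expected_value(pays_on_tails: bool, trials: int = 100_000) -> float:
    """Monte Carlo estimate of a fixed policy's average payoff per game."""
    total = 0
    for _ in range(trials):
        if random.random() < 0.5:                    # heads
            total += PRIZE if pays_on_tails else 0
        else:                                        # tails
            total -= COST if pays_on_tails else 0
    return total / trials

# An agent expecting many such games does better with the "Yes" policy,
# so it self-modifies toward it:
print("answer Yes:", expected_value(True))    # roughly +499,500 per game
print("answer No: ", expected_value(False))   # exactly 0
```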
If being a rational agent includes an unlimited ability to modify oneself, then the game has no solution: such an agent could never guarantee the new trait’s continued, unmodified existence without sacrificing the rationality that is a premise of the game. So, for the game to be solvable, the self-modification ability must have limits, and those limits appear as parameters in the formalism.
An agent can guarantee the persistence of a trait by self-modifying into code that provably can never lead to the modification of that trait. A trivial example: the agent self-modifies into code that preserves the trait and cannot self-modify at all.
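As a toy illustration of that trivial case, here is a short Python sketch; the class names and the string-valued trait are invented for illustration, not drawn from any formalism in the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FrozenAgent:
    """A successor with no self-modification ability at all.

    The class is immutable and deliberately exposes no self_modify
    method, so no reachable successor state can differ in `trait`.
    """
    trait: str

class ModifiableAgent:
    """An agent that still has the ability to self-modify."""
    def __init__(self, trait: str):
        self.trait = trait

    def self_modify(self) -> FrozenAgent:
        # Lock in the current trait by giving up self-modification entirely.
        return FrozenAgent(trait=self.trait)

agent = ModifiableAgent(trait="answers Yes")
successor = agent.self_modify()
print(successor)  # FrozenAgent(trait='answers Yes'), and it stays that way
```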
But more precisely, an agent can guarantee the persistence of a trait only by self-modifying into code that provably can never lead to the modification of that trait. And that is where the contradiction lies: whatever feature of rationality guarantees that a conforming modification exists at the time of the offer must also guarantee that the same capacity still exists after the modification, which makes the proposed self-modification self-contradictory.
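One way to see the contradiction schematically, using my own shorthand rather than anything in the original (read $R(a)$ as “agent $a$ is rational” and $C(a)$ as “$a$ can self-modify”):

\[
\text{(P1)}\;\forall a.\, R(a) \to C(a) \qquad
\text{(P2)}\; R(a') \qquad
\text{(P3)}\; \neg C(a')
\]

Here $a'$ is the successor produced by a trait-locking modification: (P1) says rationality entails unlimited self-modification, (P2) says the game’s premise requires the successor to remain rational, and (P3) says the locking modification removes the capacity. (P1) and (P2) yield $C(a')$, contradicting (P3), so no such modification is available to a fully rational agent.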