I’ve also tried applying this theory to UDT, and have run into similar 5-and-10-ish problems (though I hadn’t considered making the reward depend on a statement like G; that’s a nice trick!). My tentative conclusion is that the reflection principle is too weak to have much bite for a version of UDT based on conditional expected utility. The reason: for every action A that the agent doesn’t take, we have P(Agent() = A) = 0; we might still have P(“Agent() = A”) > 0 (but smaller than epsilon), but the reflection axioms do not need to hold conditional on “Agent() = A”, i.e., for X a reflection axiom we can have, e.g., P(“X & Agent() = A”) / P(“Agent() = A”) < 0.9.
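To spell out the arithmetic behind that last point (my own gloss, in the notation above): even if a reflection axiom X gets unconditional probability at least 1 - delta, that puts essentially no constraint on it once we condition on a sentence whose probability is of order delta or smaller:

```latex
% If P(X) >= 1 - \delta, then P("X & Agent() = A") >= P("Agent() = A") - \delta,
% so all we can say about the conditional probability is
\[
  \frac{P(\text{``}X \;\&\; \mathrm{Agent}() = A\text{''})}{P(\text{``}\mathrm{Agent}() = A\text{''})}
  \;\ge\; 1 \;-\; \frac{\delta}{P(\text{``}\mathrm{Agent}() = A\text{''})},
\]
% which is vacuous as soon as P("Agent() = A") <= \delta: nothing forces the
% ratio to stay above, say, 0.9.
```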
But it’s difficult to ask for more. In order to evaluate the expected utility conditional on choosing A, we need to coherently imagine a world in which the agent would choose A; and if we also asked the probability distribution conditional on choosing A to satisfy the reflection axioms, then choosing A would not be optimal conditional on choosing A, which contradicts the agent choosing A. (We could have P(“Agent() = A”) = 0, but not if the agent plays chicken, i.e., plays A if P(“Agent() = A”) = 0; if we have such a chicken-playing agent, we can coherently imagine a world in which it would play A, namely a world in which P(“Agent() = A”) = 0, but this is a world that assigns probability zero to itself. To make this formal, replace “world” by “complete theory”.)
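For concreteness, here is a rough sketch of the kind of conditional-expected-utility agent with the chicken rule I have in mind; P is a hypothetical oracle mapping sentences to probabilities (in the real setting it is of course not computable), and the names agent and utility_sentence, as well as the assumption that utilities range over 0..10, are mine for illustration only:

```python
# Toy sketch, not the actual construction: a conditional-expected-utility agent
# that "plays chicken". `P` is a hypothetical oracle from sentences (strings) to
# probabilities; `utility_sentence(a, u)` is assumed to produce the sentence
# "Agent() = a and the utility equals u".

def agent(actions, utility_sentence, P):
    # Chicken rule: if P assigns probability zero to taking some action, take it.
    # This ensures P("Agent() = a") > 0 for every action evaluated below.
    for a in actions:
        if P(f"Agent() = {a}") == 0:
            return a

    # Conditional expected utility of action a, assuming utilities take the
    # finitely many values 0..10 (an illustrative assumption).
    def conditional_eu(a):
        p_a = P(f"Agent() = {a}")
        return sum(u * P(utility_sentence(a, u)) for u in range(11)) / p_a

    # Choose the action with the highest conditional expected utility.
    return max(actions, key=conditional_eu)
```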
I think applying this theory to UDT will need more insights. One thing to play with is a formalization of classical game theory:
Specify a decision problem by a function from (a finite set of) possible actions to utilities. This function is allowed to be written in the full formal language containing P(“.”).
Specify a universal agent which takes a decision problem D(.), evaluates the expected utility of every action (not in the UDT way of conditioning on Agent(D) = A, but by simply taking the expectation of D(A) under P(“.”)), and returns the action with the highest expected utility.
Specify a game by a payoff function, i.e., a function from pure strategy profiles (assignments of a pure strategy to every player) to a utility for every player.
Given a game G(.), for every player, recursively define actions A_i := Agent(D_i) and decision problems D_i(a) := G_i(A_1, …, A_(i-1), a, A_(i+1), …, A_n), where G_i is the i’th component of G (i.e., the utility of player i).
Then, (A_1, …, A_n) will be a Nash equilibrium of the game G. I believe it’s also possible to show that for every Nash equilibrium, there is a P(.) satisfying reflection which makes the players play this NE, but I have yet to work carefully through the proof. (Of course we don’t want to become classical economists who believe in defection on the one-shot prisoner’s dilemma, but perhaps thinking about this a bit might help with finding an insight for making an interesting version of UDT work. It seems worth spending at least a bit of time on.)
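To make this a bit more concrete, here is a minimal toy sketch in Python. It is not the formal construction: there, P is a distribution over sentences and the D_i and A_i refer to one another through P(“.”), whereas below I stand in for P with a concrete predicted strategy profile and simply check which predictions are self-consistent. All the names (agent, play, reflective_profiles, pd) are mine, for illustration only:

```python
# Toy sketch of the construction above: a "reflective" prediction of the
# players' actions stands in for P, and a prediction that reproduces itself
# corresponds (up to tie-breaking) to a pure-strategy Nash equilibrium.

from itertools import product

def agent(decision_problem, actions):
    """Universal agent: return the action maximizing the (here already
    numeric) expected utility of the decision problem."""
    return max(actions, key=decision_problem)

def play(game, action_sets, predicted_profile):
    """Each player i faces D_i(a) := G_i(A_1, ..., A_{i-1}, a, A_{i+1}, ..., A_n),
    with the other players' actions filled in from the prediction."""
    profile = []
    for i, actions in enumerate(action_sets):
        def D_i(a, i=i):
            full = list(predicted_profile)
            full[i] = a
            return game(tuple(full))[i]  # G_i: player i's payoff
        profile.append(agent(D_i, actions))
    return tuple(profile)

def reflective_profiles(game, action_sets):
    """Predictions that reproduce themselves when each player best-responds to
    them; up to tie-breaking, these are the pure-strategy Nash equilibria."""
    return [p for p in product(*action_sets) if play(game, action_sets, p) == p]

# Example: one-shot prisoner's dilemma. "D" strictly dominates, so the only
# reflective profile is (D, D), the classical-economist answer mentioned above.
def pd(profile):
    payoffs = {("C", "C"): (2, 2), ("C", "D"): (0, 3),
               ("D", "C"): (3, 0), ("D", "D"): (1, 1)}
    return payoffs[profile]

print(reflective_profiles(pd, [["C", "D"], ["C", "D"]]))  # [('D', 'D')]
```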