Let’s consider the agent described in “A model of UDT with a halting oracle”. That agent is not quite well defined, because its definition doesn’t say in what order the actions should be considered in step 1. But surely that doesn’t matter, right? Wrong.
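Let’s consider the prisoner’s dilemma with the payoff matrix below, where each entry is (payoff to player 1, payoff to player 2):

|      | 1: C   | 1: D   |
|------|--------|--------|
| 2: C | (3, 3) | (5, 0) |
| 2: D | (0, 5) | (2, 2) |
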
Now consider agent A, which checks whether there is a proof that A()≠D before checking whether there is a proof that A()≠C, and agent A’, which does things in the opposite order. If A or A’ is pitted against itself, all is well and the game ends in mutual cooperation. But what if A is pitted against A’? Then A breaks down and cries.
Let’s call the utility function of A U and the utility function of A’ U’, and consider a model of PA in which PA is inconsistent (such a model must exist if PA is consistent). Inside such a model every sentence has a proof, so each agent returns the first action it considers in step 1: we have A()=D and A’()=C, and so U()=5 and U’()=0. Since anything PA proves holds in every model of PA, A will not be able to prove that A()=D ⇒ U()=u for any u different from 5. So either A will defect and A’ will cooperate, or A will break down and cry. But A’ will not cooperate, because it cannot prove A’()=C ⇒ U’()=u’ for any u’ except possibly 0. So A will break down and cry. QED
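To make the first step of this argument concrete, here is a minimal Python sketch of what happens inside such a model. It is a toy under stated assumptions, not the agent from the original post: the function names `provable_in_M` and `chicken_step` are hypothetical, the provability oracle is replaced by a stub that answers “yes” to everything (which is how provability looks from inside a model where PA is inconsistent), only step 1 is modelled, and A is arbitrarily taken to be player 1.

```python
from typing import List, Optional

# Payoffs as (player 1, player 2), indexed by (player 1's move, player 2's move).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("D", "C"): (5, 0),
    ("C", "D"): (0, 5),
    ("D", "D"): (2, 2),
}


def provable_in_M(sentence: str) -> bool:
    """Hypothetical stand-in for the oracle as seen from inside M.

    In a model where PA is inconsistent, every sentence "has a proof",
    so this stub simply answers True for any input.
    """
    return True


def chicken_step(name: str, order: List[str]) -> Optional[str]:
    """Step 1 only: return the first action a with a 'proof' that name() != a."""
    for action in order:
        if provable_in_M(f"{name}() != {action}"):
            return action
    return None  # the real agent would go on to step 2; not modelled here


# A considers D first; A' considers C first.
a_move = chicken_step("A", ["D", "C"])
a_prime_move = chicken_step("A'", ["C", "D"])

# Assumption: A plays as player 1 and A' as player 2.
u, u_prime = PAYOFF[(a_move, a_prime_move)]
print(a_move, a_prime_move, u, u_prime)
```

In this toy the order in which actions are considered is the only thing that distinguishes the two agents, and it alone fixes the outcome inside the model: A defects, A’ cooperates, and the utilities are 5 and 0.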
More generally, if M is a model of PA in which PA is inconsistent, an agent defined in this way will never be able to prove that A()=a ⇒ U()=u (where a is the first action considered in step 1), except possibly for u=u0, where u0 is the value of U() in M. That seems to create a huge problem for this approach to UDT.
A problem with “playing chicken with the universe” as an approach to UDT