Wei Dai comments on A problem with “playing chicken with the universe” as an approach to UDT

Wei Dai 8 Mar 2013 21:44 UTC
0 points
Interesting. I think the other approach that cousin_it and Karl wrote about is more elegant (*) anyway, but cousin_it was dissatisfied with it because it involved an arbitrary time limit. Here’s an idea for making it less arbitrary. Suppose instead of a halting oracle, we had an oracle for General Turing Machines that (in one cycle) outputs the final output of a given GTM if it converges. Then we can use it to look through all theorems of the form A()=a ⇒ U()>=u, and return the action with the highest u, or an error (“break down and cry”) if no such action exists.

(*) More elegant because 1) it doesn’t require the “play chicken” step and 2) it doesn’t require that the set of possible actions be finite.
- Karl 8 Mar 2013 23:31 UTC
  1 point
  Parent
  If I understand what you’re proposing correctly then that doesn’t work either.
  
  Suppose it is a theorem of PA that this algorithm return an error, then all theorems of the form A()=a ⇒ U()>=u are demonstrable in PA and so there is no action with the highest u and the algorithm return an error, because that is demonstrable in PA the algorithm must return an error by Löb’s theorem.
  - Wei Dai 9 Mar 2013 5:36 UTC
    0 points
    Parent
    You’re right, and I don’t know how to fix the problem without adding the equivalent of “playing chicken”. But now I wonder why the “playing chicken” step needs to consider the possible actions one at a time. Instead suppose that using an appropriate oracle, the agent looks through all theorems of PA, and returns action a as soon as it sees a theorem of the form A()≠a, for any a. This was my original interpretation of cousin_it’s post. Does this have the same problem, if two otherwise identical agents look through the theorems of PA in different orders?
    - Karl 9 Mar 2013 16:11 UTC
      0 points
      Parent
      Well, depend how different the order is...
      
      If there are theorem in PA of the form “If there are theorems of the form A()≠a and of the form A’()≠a’ then the a and the a’ such that the corresponding theorem come first in the appropriate ordering must be identical.” then you should be okay in the prisoner dilemma setting but otherwise there will be a model of PA in which the two players end up playing different actions and we end up in the same situation as in the post.
      
      More generally, no matter how you try to cut it, there will always be a model of PA in which all theorems are provable and your agent behavior will always depend on what happen in that model because PA will never be able to prove anything which is false in that model.