Vladimir_Nesov comments on A model of UDT with a halting oracle

Vladimir_Nesov 18 Dec 2011 15:55 UTC
3 points

In contrast, the new model with oracles has a nice notion of optimality, relative to the agent’s formal system.

Specifically, given any formal system S for reasoning about the world and the agent’s place in it, the chicken rule (step 1) forces S to generate consistent theories of consequences for all possible actions. This seems to crack a long-standing problem in counterfactual reasoning, giving a construction for counterfactual worlds (in the form of consistent formal theories) from any formal theory that has the actual world as a model.
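
To make the role of the chicken rule concrete, here is a minimal toy sketch in Python. It is an assumption-laden illustration, not the post's construction: the real agent queries a halting oracle about provability in S, while `toy_oracle` below is a hypothetical stand-in that just looks statements up in a fixed set. The tuple encoding of sentences, the names `udt_agent`, `ACTIONS`, `UTILITIES` and `THEOREMS`, the fallback default, and the shape of step 2 (search for a provable "A()=a implies U()=u" with u in descending order) are likewise assumed for illustration. The point of step 1 is that if S proved "A() != a" the agent would do a anyway, so a sound S cannot prove it, and S together with "A()=a" stays consistent for every action a.

```python
from typing import Callable, Iterable, Sequence, Tuple

# Hypothetical encoding of sentences of S as tuples:
#   ("A!=", a)      stands for "A() != a"
#   ("A=>U", a, u)  stands for "A() = a  implies  U() = u"
Statement = Tuple[object, ...]


def udt_agent(
    actions: Sequence[str],
    utilities: Sequence[int],
    proves: Callable[[Statement], bool],
) -> str:
    # Step 1 (chicken rule): if S proves the agent does NOT take a, take a.
    for a in actions:
        if proves(("A!=", a)):
            return a
    # Step 2 (assumed shape): try utilities from best to worst and return
    # the first action that provably yields that utility.
    for u in sorted(utilities, reverse=True):
        for a in actions:
            if proves(("A=>U", a, u)):
                return a
    # Arbitrary default if nothing is provable (an assumption of this sketch).
    return actions[0]


def toy_oracle(theorems: Iterable[Statement]) -> Callable[[Statement], bool]:
    # Stand-in for the provability oracle: a plain set-membership test.
    known = frozenset(theorems)
    return lambda statement: statement in known


ACTIONS = ["one_box", "two_box"]
UTILITIES = [1_000, 1_000_000]
THEOREMS = {
    ("A=>U", "one_box", 1_000_000),
    ("A=>U", "two_box", 1_000),
}

# Ordinary case: no action is provably excluded, so step 2 decides.
choice = udt_agent(ACTIONS, UTILITIES, toy_oracle(THEOREMS))

# Chicken case: a "proof" that the agent never two-boxes triggers step 1.
chicken = udt_agent(
    ACTIONS, UTILITIES, toy_oracle(THEOREMS | {("A!=", "two_box")})
)
```

In this toy setup `choice` comes out as `"one_box"` and `chicken` as `"two_box"`: once the stand-in oracle claims the agent never two-boxes, the agent two-boxes, which is exactly why a sound S can never prove such a claim.
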
What links here?
- Vladimir_Nesov's comment on Some thoughts on AI, Philosophy, and Safety by paulfchristiano (26 Dec 2011 13:40 UTC; 0 points)
- Vladimir_Nesov 21 Dec 2011 15:51 UTC
  5 points
  ...and the construction turns out to be not as interesting as I suspected. Something like this is very easy to carry out by replacing the agent A with another agent that can’t be understood in S but is equivalent to A (according to a system stronger than S). As a tool for understanding decision problems, this is intended to solve the problem of parsing the world in terms of A: finding how the world depends on A and finding where A is located in it. But if we can find all instances of A in the world in order to perform such surgery on them, we’ve already solved the problem!
  
  Perhaps A can decide to make itself incomprehensible to itself (or rather, to any given S), thus performing the transformation without surgery: a formalization of free will by an act of then-mysterious free will? This could still be done. But it’s not clear whether this can be done “from the outside”, where we don’t have the power to make A transform itself so that the dependence of the world on its actions becomes clear.