Scott Garrabrant comments on What Would I Do? Self-prediction in Simple Algorithms

Scott Garrabrant 21 Jul 2020 18:24 UTC
5 points
Having $ε$ be $1 / 2^{n}$ won’t work.
Surprisingly, having $ε$ go to 0 at any quickly computably rate won’t work. For example, if $ε = 1 / n$ you could imagine having a logical induction built out of a collection of traders where one trader has almost all the money and says that on days of the form $A (m)$ , utility conditioned on going left is 0 (where $A$ is a fast growing function). Then, you have a second trader that forces the probability on day $A (m)$ of the statement that the agent goes left to be slightly higher that $1 / A (m)$ . Finally, you have a third trader that forces the expected utility conditioned on right to be very slightly above 0 on days of the form $A (m)$ .
The first trader never loses money, since the condition is never met. The second trader only loses a bounded amount of money, since it is forcing the probability of a sentence that will be false to be very small. The third trader similarly only loses a bounded amount of money. The exploration clause will never trigger, and the agent will never go left on any day of the form $1 / A (m)$ .
The issue here is that we need to not only explore infinitely often, we need to explore infinitely often on all simple subsets of days, if the probability goes to 0 slowly, you can just look at a subset of days that is sufficiently sparse.
There are ways around this that allow us to make a logical induction agent that explores with destiny 0 (meaning that the limit as $N$ goes to infinity of the proportion of days $n < N$ that the agent explores is 0). This is done by explicitly exploring infinitely often on every quickly computable subset of days, while still having the probability of exploring go to 0.