Do we really need the constraint that M[O’] halts with probability 1 for each O’? I remember the results going through anyway if we just allow O(M) to be arbitrary with whatever probability M[O] fails to halt. This version doesn’t even require Kakutani; Brouwer’s fixed point theorem suffices.
The idea is that P(STATE[O] outputs 1) is a continuous function of P(O[X] outputs 1), P(STATE0[O] outputs 1), and P(STATE1[O] outputs 1), where X is the oracle query that STATE makes in the next time step (if any), STATE0 is the resulting state if the oracle returns 0, and STATE1 is the resulting state if the oracle returns 1. So, given one oracle O, we can define a new oracle f(O) by this update step. Because f(O)(STATE) depends only on O(X), O(STATE0), and O(STATE1), f is a continuous function, and so it has a fixed point.
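The update step above can be sketched numerically on a toy machine. This is my own illustration, not code from the thread: the state encoding, the damped iteration, and the "contrarian" example machine are all assumptions made for the sketch. Each state either halts with a fixed output bit or queries the oracle about some machine and branches on the answer; an oracle assigns each state a probability of "outputs 1," and one application of f is exactly the update described above.

```python
# Toy sketch of the Brouwer-style argument (my own illustration).
# A state is either ("halt", bit) or ("query", X, state0, state1):
# it asks the oracle about machine X, then moves to state0 or state1
# depending on whether the oracle answers 0 or 1. An oracle O maps
# each state name to the probability that it "outputs 1".

def step(states, O):
    """One application of the update map f; continuous in O."""
    new = {}
    for name, spec in states.items():
        if spec[0] == "halt":
            new[name] = float(spec[1])
        else:
            _, X, s0, s1 = spec
            # P(outputs 1) = P(oracle says 1 on X) * P(STATE1 outputs 1)
            #              + P(oracle says 0 on X) * P(STATE0 outputs 1)
            new[name] = O[X] * O[s1] + (1 - O[X]) * O[s0]
    return new

def fixed_point(states, iters=1000):
    O = {name: 0.5 for name in states}
    for _ in range(iters):
        f_O = step(states, O)
        # Damped iteration: plain iteration can oscillate on
        # self-referential machines, damping settles on the fixed point.
        O = {n: (O[n] + f_O[n]) / 2 for n in O}
    return O

# A self-referential "contrarian": queries the oracle about itself and
# outputs the opposite bit. The fixed point assigns it probability 1/2.
states = {
    "one": ("halt", 1),
    "zero": ("halt", 0),
    "C": ("query", "C", "one", "zero"),  # answer 0 -> output 1, answer 1 -> output 0
}
O = fixed_point(states)
```

On the contrarian machine the exact map f(O)(C) = 1 − O(C) has the unique fixed point 1/2, which the damped iteration finds immediately.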
I guess one reason to dislike this version is that it can’t be used in such a straightforward way as part of AIXI without some further restriction, such as an upper bound on the computational complexity of the world (perhaps exponentially larger than the computational complexity of the agent).
If this application is the point of the restriction that M[O’] halts with probability 1 for all O’, then it might be reasonable to call that out more explicitly. It seems like a simpler notion of reflective oracles might find use elsewhere. For example, it seems like most practicing computer scientists (at least in the US) don’t much care about the difference between “infinite” and “doubly exponential.” In some contexts (especially when you have access to a halting oracle), the infinite is important because it serves as an analogy for the computationally intractable. But I don’t think that’s the case here, because we are considering “doubly exponential relative to O,” which is already intractable for a polynomial-time agent relative to O.
Thanks for expanding on your construction! I hadn’t thought of the recursive construction, that’s really neat.
I’m not that worried about the application to AIXI: unless I’m missing something, we can just additionally give our machines access to a double halting oracle (for ordinary Turing machines), and recover the same power. I considered doing that and didn’t go with it because the stuff in my post seemed slightly more elegant, but if there’s a reason to prefer the other version, it seems fine to use it.
I’m not clear on what you mean by “P(O[X] outputs 1)”; I’m guessing you were being loose in writing O(┌X┐) rather than my O(┌X┐,p) or some other variant, but I may be missing something. The O(┌X┐,p) version seems to need Kakutani unless we want to allow some ε-tolerance: consider f(O)(┌O⟨┌X┐,p⟩┐,q) for arbitrary (┌X┐,p) and for 0<q<1; we want this to equal 1 for all O such that O(┌X┐,p)>q and to equal 0 for all O such that O(┌X┐,p)<q, which makes f discontinuous if it’s single-valued. My intuition is that we generally need Kakutani to get exact optimality rather than ε-optimality (as a rule of thumb, not a hard-and-fast rule), because which choice is optimal jumps discontinuously.
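The discontinuity can be made concrete with a tiny numeric toy. This is my own sketch, not notation from the thread: writing v for the value O(┌X┐,p), a single-valued update must jump at v = q, whereas an ε-tolerant variant (here, an assumed linear ramp of width 2ε) is continuous and so amenable to Brouwer.

```python
# My own toy illustration of the discontinuity at the threshold v = q.

def exact_update(v, q):
    """Single-valued f: the required answer jumps as v crosses q
    (the value exactly at v = q is an arbitrary choice here)."""
    return 1.0 if v >= q else 0.0

def tolerant_update(v, q, eps=0.01):
    """An eps-tolerant variant: a linear ramp on [q - eps, q + eps],
    which is continuous in v."""
    return min(1.0, max(0.0, (v - (q - eps)) / (2 * eps)))

q = 0.5
# Approaching the threshold from both sides:
jump = exact_update(q + 1e-9, q) - exact_update(q - 1e-9, q)    # full jump of 1
ramp = tolerant_update(q + 1e-9, q) - tolerant_update(q - 1e-9, q)  # negligible
```

The set-valued fix (Kakutani) instead allows any answer in [0,1] exactly at v = q, which keeps the graph closed without needing the ε-tolerance.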
In the O(┌X┐,p) version, continuity is not as immediate as it would be if we could determine f(O)(┌X┐,p) by looking at only three values of O(⋅,⋅). However, we can use the same trick as in my post on simplicity priors: abbreviate the probabilities that the oracle says “greater” or “less” when called on (┌STATE1┐,p) as g_p := O(┌STATE1┐,p) and l_p := 1 − g_p, respectively; then the probability that a reflective O assigns to STATE1[O] halting equals
$$\frac{1}{2}\,g_{.5} \;+\; \frac{1}{4}\bigl(l_{.5}\,g_{.25} + g_{.5}\,g_{.75}\bigr) \;+\; \frac{1}{8}\Bigl(l_{.5}\bigl(l_{.25}\,g_{.125} + g_{.25}\,g_{.375}\bigr) + g_{.5}\bigl(l_{.75}\,g_{.625} + g_{.75}\,g_{.875}\bigr)\Bigr) \;+\; \cdots$$
This is clearly a continuous function of O, since later calls to O are multiplied by smaller and smaller weights. Thus, we can use this expression, and the analogous one for STATE0, in the definition of f(O).
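The series can be checked numerically. The sketch below is my own (the function name and the two example oracles are assumptions): it expands the oracle's answers along a binary search, weighting the depth-d answer by 2^−d exactly as in the sum above, and truncates at a fixed depth.

```python
# My own toy numeric check of the series above.

def expected_value(oracle, lo=0.0, hi=1.0, depth=15):
    """Sum over d of 2^-d * P(d-th binary-search bit is 1), truncated.

    `oracle(p)` is the probability g_p of answering "greater" at
    threshold p; answers at different thresholds are treated as
    independent coin flips, as in the series.
    """
    if depth == 0:
        return (lo + hi) / 2  # truncation: midpoint of the remaining interval
    mid = (lo + hi) / 2
    g = oracle(mid)  # probability that the next bit is 1
    return (g * expected_value(oracle, mid, hi, depth - 1)
            + (1 - g) * expected_value(oracle, lo, mid, depth - 1))

# Deterministic oracle for a machine that halts with probability 0.7:
# the series recovers 0.7 (its binary expansion 0.10110011..._2).
det = expected_value(lambda p: 1.0 if 0.7 > p else 0.0)

# Oracle answering g_p = 1 - p at every threshold: the series gives 1/2.
uni = expected_value(lambda p: 1.0 - p)
```

The truncation error is at most 2^−(depth+1), which also makes the continuity claim visible: perturbing the oracle's answers at depth d moves the sum by at most 2^−d.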
We can then check that a fixed point O∈f(O) of this correspondence is a reflective oracle by showing inductively that for every T: if the probability that X[O] halts within T steps and outputs 1 is greater than p, then O(┌X┐,p)=1; and if the probability that it halts within T steps and outputs 0 is greater than 1−p, then O(┌X┐,p)=0.
Tangential: I’m now thinking that it would be more elegant not to speak about halting at all; rather, use write-and-advance-only output tapes, and talk about the probability that the first bit X outputs is a 1 or a 0, with “never outputs anything” as the third possibility (whether due to looping or to halting without output). I’d be interested in whether this presentation would seem natural to theoretical computer scientists. (I’m also still interested in comments on the notation O(⋅,⋅) for the probabilities of the oracle’s outputs.)