Umm… what about my argument that a human can represent their predictions symbolically like “P(next bit is 1)=i-th bit of BB(100)” instead of using numerals, and thereby do better than a Solomonoff predictor because the Solomonoff predictor can’t incorporate this?
A trivial but noteworthy fact is that every finite sequence of Σ values, such as Σ(0), Σ(1), Σ(2), …, Σ(n) for any given n, is computable, even though the infinite sequence Σ is not computable (see computable function examples).
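(A minimal illustration of this, in Python: the known values of Σ for one through four states are 1, 4, 6 and 13, and any such finite prefix can be produced by a trivial lookup-table program, which is why it is computable even though Σ as a whole is not.)

```python
# Sketch: any finite prefix of an uncomputable sequence is itself computable,
# because a finite lookup table is a (trivial) program.
# The values below are the known Sigma values for 1-4 state, 2-symbol machines.
SIGMA_PREFIX = {1: 1, 2: 4, 3: 6, 4: 13}

def sigma_up_to_4(n: int) -> int:
    """Computable function agreeing with the Busy Beaver Sigma on n = 1..4."""
    return SIGMA_PREFIX[n]

print([sigma_up_to_4(n) for n in range(1, 5)])  # [1, 4, 6, 13]
```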
Sorry, I should have used BB(2^100) as the example. The universal prior assigns the number BB(2^100) a very small weight, because the only way to represent it computably is by giving a 2^100-state Turing machine. A human would assign it a much larger weight, referencing it by its short symbolic representation.
Until I write up a better argument, you might want to (assuming you haven’t already) read this post where I gave a decision problem that a human does (infinitely) better than AIXI.
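(To make the size gap being claimed in the comment above concrete: a rough back-of-the-envelope comparison, assuming, as claimed there, that the shortest computable description of BB(2^100) is essentially an explicit 2^100-state machine. The encoding formula and constants below are rough placeholders; only the scale matters.)

```python
from math import log2

# Rough, illustrative bit counts only (the exact encoding scheme doesn't matter).
n_states = 2 ** 100

# Writing down an n-state, 2-symbol Turing machine table: each state has 2 entries,
# each entry names a next state (log2(n) bits), a symbol (1 bit) and a direction (1 bit).
bits_for_machine = 2 * n_states * (log2(n_states) + 2)

# Writing down the symbolic expression "BB(2^100)": a few dozen bits.
bits_for_symbolic_reference = 8 * len("BB(2^100)")

print(f"~2^{log2(bits_for_machine):.0f} bits vs ~{bits_for_symbolic_reference} bits")
```

Under the universal prior a description of length L contributes weight on the order of 2^-L, so the machine-table representation gets negligible weight, while the symbolic reference a human can reason about stays a few dozen bits long.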
I don’t think I understood that fully, but there seems to be a problem with your theory. The human gets to start in the epistemically advantaged position of knowing that the game is based on a sequence of busy beavers, and knowing that they are a very fast-growing function. AIXI is prevented from knowing this information and has to start as if from a blank canvas. The reason we use an Occamian prior for AIXI is that we refuse to tailor it to a specific environment. If your logic is sound, then yes, it does do worse when it is dropped into an environment where it is paired with a human with an epistemic advantage, but it would beat the human across the space of possible worlds.
Another problem you seem to have is that you assume the only hypothesis in the entire set that gives useful predictions is the hypothesis which is, in fact, correct. There are plenty of other functions which correctly predict arbitrarily large numbers of 1s, with much less complexity, so the overall probability weighting that AIXI is using can be a usefully correct model of its universe, if not a fully correct one.
How a human might come to believe, without being epistemically privileged, that a sequence is probably a sequence of busy beavers, is a deep problem, similar to the problem of distinguishing halting oracles from impostors. (At least one mathematical logician who has thought deeply about the latter problem thinks that it’s doable.)
But in any case, the usual justification for AIXI (or adopting the universal prior) is that (asymptotically) it does as well as or better than any computable agent, even one that is epistemically privileged, as long as the environment is computable. Eliezer and others were claiming that it does as well as or better than any computable agent, even if the environment is not computable, and this is what my counter-example disproves.
So you think that we need to rethink our theory of what perfect optimization is, in order to take into account the possibility we live in an uncomputable universe? Even if you are correct in your example, there is no reason to suppose that your human does better in the space of possible uncomputable universes than AIXI, as opposed to better in that one possible (impossible) universe.
So you think that we need to rethink our theory of what perfect optimization is, in order to take into account the possibility we live in an uncomputable universe?
Yes.
Even if you are correct in your example, there is no reason to suppose that your human does better in the space of possible uncomputable universes than AIXI, as opposed to better in that one possible (impossible) universe.
This seems pretty easy, given the same level of raw computing power available to AIXI (otherwise the human gets screwed in the majority of cases simply because he doesn’t have enough computing power).
For example, I can simply modify AIXI with a rule that says “if you’ve seen a sequence of increasingly large numbers that can’t be explained by any short computable rule, put some weight on it being BB(1)...BB(2^n)…” (and also modify it to reason symbolically about expected utilities instead of comparing numbers), and that will surely be an improvement over all possible uncomputable universes. (ETA: Strike that “surely”. I have to think this over more carefully.)
How to make an optimal decision algorithm (as opposed to just improving upon AIXI) is still an open problem.
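(A minimal sketch of the kind of modification described in the “For example, I can simply modify AIXI…” paragraph above, under strong simplifying assumptions: the hypotheses are a handful of deterministic bit sequences with hand-picked weights rather than a genuine universal mixture, and since BB itself is uncomputable, an extremely fast-growing computable stand-in is used purely for illustration. All names and numbers are hypothetical.)

```python
# Toy Bayesian mixture over deterministic bit-sequence hypotheses.
# Each hypothesis maps a round index to the bit it predicts; the mixture keeps
# only hypotheses consistent with everything seen so far and votes by weight.

def all_ones(i):
    return 1

def zeros_on_powers_of_two(i):
    return 0 if i > 1 and (i & (i - 1)) == 0 else 1   # 0 at rounds 2, 4, 8, ...

def fast_growing(limit):
    # Stand-in for BB(1), BB(2), ...: an extremely fast-growing computable
    # sequence, used ONLY because BB itself cannot be computed here.
    vals, x = set(), 2
    while x <= limit:
        vals.add(x)
        x = x ** x
    return vals

def zeros_on_fast_growing(i, _cache=fast_growing(10 ** 9)):
    return 0 if i in _cache else 1

# The "extra rule": give some prior weight to the fast-growing-zeros hypothesis.
hypotheses = {all_ones: 0.5, zeros_on_powers_of_two: 0.3, zeros_on_fast_growing: 0.2}

def predict(history):
    """Weighted vote of the hypotheses that are consistent with history."""
    live = {h: w for h, w in hypotheses.items()
            if all(h(i) == b for i, b in enumerate(history, start=1))}
    total = sum(live.values())
    p_one = sum(w for h, w in live.items() if h(len(history) + 1) == 1) / total
    return p_one

history = [1, 0, 1, 0, 1, 1, 1]   # zeros at rounds 2 and 4, as both surviving hypotheses predict
print(predict(history))           # ~0.4: the two surviving hypotheses disagree about round 8
```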
For example, I can simply modify AIXI with a rule that says “if you’ve seen a sequence of increasingly large numbers that can’t be explained by any short computable rule, put some weight on it being BB(1)...BB(2^n)…” (and also modify it to reason symbolically about expected utilities instead of comparing numbers), and that will surely be an improvement over all possible uncomputable universes. (ETA: Strike that “surely”. I have to think this over more carefully.)
This is what I dislike about your logic. You create a situation where (you think) AIXI fails, but you fail to take into account the likelihood of being in that situation versus being in a similar situation. I can easily see a human seeing a long series of ones, with some zeros at the beginning, saying “aha, this must be the result of a sequence of busy beavers”, when all he’s actually seeing is 3^^^3 minus his telephone number or something. AIXI can lose in really improbable universes, because it’s designed to work in the space of universes, not some particular one. By modifying the rules, you can make it better in specific universes, but only by reducing its performance in similar-seeming universes.
What about the agent using Solomonoff’s distribution? After seeing BB(1),...,BB(2^n), the algorithmic complexity of BB(1),...,BB(2^n) is sunk, so to speak. It will predict a higher expected payoff for playing 0 in any round i where the conditional complexity K(i | BB(1),...,BB(2^n)) < 100. This includes for example 2BB(2^n), 2BB(2^n)+1, BB(2^n)^2 * 3 + 4, BB(2^n)^^^3, etc. **It will bet on 0 in these rounds (erroneously, since K(BB(2^(n+1)) | BB(2^n)) > 100 for large n), and therefore lose relative to a human.**
I don’t understand how the bolded part follows. The best explanation by round BB(2^n) would be "All 1s except for the Busy Beaver numbers up to 2^n", right?
Yes, that’s the most probable explanation according to the Solomonoff prior, but AIXI doesn’t just use the most probable explanation to make decisions, it uses all computable explanations that haven’t been contradicted by its input yet. For example, “All 1s except for the Busy Beaver numbers up to 2^n and 2BB(2^n)” is only slightly less likely than “All 1s except for the Busy Beaver numbers up to 2^n” and is compatible with its input so far. The conditional probability of that explanation, given what it has seen, is high enough that it would bet on 0 at round 2BB(2^n), whereas the human wouldn’t.
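(A rough worked version of that weight comparison, with placeholder numbers. The payoff assumption mirrors the quoted setup, where the ~100-bit threshold suggests a correct 0 pays on the order of 2^100 times more than an incorrect 0 loses.)

```python
# Rough arithmetic behind "it would bet on 0 at round 2*BB(2^n)".
# H1: zeros exactly at BB(1),...,BB(2^n).  H2: the same, plus a zero at 2*BB(2^n).
# If H2 costs only c extra bits to describe, its prior (and hence posterior, since both
# fit the data so far) is only about 2^-c times smaller, which is small but not negligible.
c = 20                                   # placeholder: extra bits for "...and 2*BB(2^n)"
posterior_ratio = 2.0 ** -c              # weight(H2) / weight(H1)
p_zero_this_round = posterior_ratio / (1.0 + posterior_ratio)

# Placeholder payoffs in the spirit of the quoted setup, where a correct 0 is worth
# far more than 2^100 times what an incorrect 0 costs.
expected_gain_from_betting_zero = p_zero_this_round * 2.0 ** 100 - (1 - p_zero_this_round)

print(p_zero_this_round)                    # ~1e-6: small, but not small enough
print(expected_gain_from_betting_zero > 0)  # True, so the agent bets 0 here, and the human doesn't
```

The same arithmetic applies at every round whose index has a short description given BB(1),...,BB(2^n), which is why those erroneous 0 bets add up, while the genuinely next value BB(2^(n+1)) gets no such boost.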
Oh. I feel stupid now. EDIT: Wouldn’t it also break even by predicting the next Busy Beaver number? “All 1s except for BB(1...2^n+1)” is also only slightly less likely. EDIT: I feel more stupid.
The next number in the sequence is BB(2^(n+1)), not BB(2^n+1).
ETA: In case more explanation is needed, it takes O(2^n) more bits to computably describe BB(2^(n+1)), even if you already have BB(2^n). (It might take O(2^n) more bits to describe BB(2^n+1) as well, but I wasn’t sure so I used BB(2^(n+1)) in my example instead.)
Since K(BB(2^(n+1)) | BB(2^n)) > 100 for large n, AIXI actually will not bet on 0 when BB(2^(n+1)) comes around, and all those 0s that it does bet on are simply “wasted”.
BB(100) is computable. Am I missing something?
Maybe… by BB I mean the Busy Beaver function Σ as defined in this Wikipedia entry.
Right, and...
So why can’t the universal prior use it?
You can find it by emulating the Busy Beaver.
BB(100) is computable—and BB(2^100) is computable too :-(