You get completely different results by doing that. The universal prior gives higher probability to simpler hypotheses -- 0000000000000000 is more likely than 0010111001001101, or at least analogous things are true once the strings get long enough. The idea is that if there’s any computable regularity in what you’re trying to model, as you get more data you’ll recognize that regularity (i.e., your posterior will give much higher probability to futures in which the regularity continues than to ones in which it doesn’t).
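(To make the simplicity-weighting concrete, here is a minimal toy sketch in Python. It is emphatically not the real universal prior: instead of enumerating programs for a universal Turing machine, it enumerates repeating bit patterns and weights each by 2^-length. The function toy_prior_mass and the cutoff max_pattern_len are inventions for illustration only.)

```python
# Toy stand-in for simplicity weighting (NOT the actual universal prior):
# a "hypothesis" is a bit pattern repeated to fill the string, and each
# pattern gets prior weight 2^-(pattern length). Strings with a short
# description collect far more prior mass than incompressible-looking ones.
from itertools import product

def toy_prior_mass(target, max_pattern_len=16):
    """Sum 2^-k over all patterns of length <= max_pattern_len whose
    repetition reproduces `target`."""
    mass = 0.0
    for k in range(1, max_pattern_len + 1):
        for bits in product("01", repeat=k):
            pattern = "".join(bits)
            candidate = (pattern * (len(target) // k + 1))[:len(target)]
            if candidate == target:
                mass += 2.0 ** (-k)   # shorter description => more weight
    return mass

print(toy_prior_mass("0000000000000000"))  # ~1.0: many short descriptions
print(toy_prior_mass("0010111001001101"))  # ~0.000015: only its own length-16 description
```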
Ah so, I understand. But in that case, doesn’t evaluating your prior require you to integrate over the space of all possible Turing machines? There is no conceptual difficulty with this, but I must say that it strikes me as computationally rather formidable for a poxy little throw-four-possibly-biased-coins problem. :)
It’s not just computationally formidable, it’s computationally impossible. :-)
My understanding was that it’s perfectly possible to solve the Halting Problem for particular TMs.
For a TM with n states, you begin running it: either it halts at some point, or it runs for BusyBeaver(n)+1 steps, in which case you know it never halts.
So as long as you have a fixed size n that all the TMs you care about fit into, and you know the Busy Beaver number for n, you can in fact use the universal prior. (Busy Beaver numbers are uncomputable in general, though we know a fair number of small ones; and the requirement of knowing the exact value can be weakened to something practical, like having a value that is the Busy Beaver number with high probability.)
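(For concreteness, a small Python sketch of that procedure, assuming someone hands us the value BB(n) in the maximum-steps sense, which is of course exactly the part that is uncomputable in general. The dict encoding and the halting state "H" are just conventions picked for this example.)

```python
# Decide halting for an n-state TM, given the value bb_n = BB(n) from outside.
# A machine is a dict: (state, symbol) -> (write, move, next_state), and it
# halts by entering state "H". All the hard work is hidden inside bb_n.

def halts(tm, bb_n, start_state="A"):
    """Run the machine for up to BB(n)+1 steps; if it has not halted
    within BB(n) steps, it never will."""
    tape = {}                       # sparse tape, default symbol 0
    head, state = 0, start_state
    for _ in range(bb_n + 1):
        if state == "H":
            return True             # halted within the Busy Beaver bound
        write, move, state = tm[(state, tape.get(head, 0))]
        tape[head] = write
        head += 1 if move == "R" else -1
    return state == "H"             # never reached "H" in time => never halts

# Example: the 2-state Busy Beaver champion, which halts after 6 steps (BB(2) = 6).
bb2_champion = {
    ("A", 0): (1, "R", "B"), ("A", 1): (1, "L", "B"),
    ("B", 0): (1, "L", "A"), ("B", 1): (1, "R", "H"),
}
print(halts(bb2_champion, bb_n=6))  # True
```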
The fact that BB(n) is uncomputable means that your procedure doesn’t, in fact, work. No?
You don’t, in any case, have a fixed n. (Unless perhaps you are only interested in modelling Turing machines that fit in the universe—but in that case, wouldn’t you also want your own computations to fit in the universe?)
Finding BB(n) for all n is uncomputable. Finding BB(n) where n=10, say, is not uncomputable.
Yeah, this is an interesting point. My thinking is that yes, we do only care about modeling Turing machines that fit in the universe, since by definition those are the only Turing machines whose output we will ever observe. We might also have to bite the bullet and say we only care about predicting Turing machines which are small enough that there is room left in the universe for us; we don’t want to perfectly model the entire universe, just small parts.
If it’s uncomputable in general then it’s uncomputable for some particular n. I expect that in fact it’s uncomputable for any n above some rather small value. (In particular, once n is large enough for there to be n-state Turing machines that unprovably don’t always halt.)
Well, if you only care about predicting smallish Turing machines, then by the same reasoning prediction is only useful if it can be done by smallish machines, in reasonable time. And even for very small n this isn’t true for calculating BB(n).
Just out of my own curiosity, I checked Wikipedia on Busy Beaver numbers: BB(10) > 3^^^3.
While the article notes that any finite set of values (such as BB(1) through BB(10)) is trivially computable, I think it is safe to call this intractable in practice.
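(For a sense of scale, the ^s here are Knuth's up-arrows: ^^ is iterated exponentiation, ^^^ is iterated ^^, and so on. A tiny Python sketch, with the caveat that only the small cases are actually computable:)

```python
# Knuth up-arrow notation. 3^^^3 is a power tower of 3s more than
# seven trillion levels high, so only the small cases below terminate.
def up(a, b, arrows):
    """Compute a ^(arrows) b in Knuth's up-arrow notation."""
    if arrows == 1:
        return a ** b
    result = a
    for _ in range(b - 1):
        result = up(a, result, arrows - 1)
    return result

print(up(3, 3, 1))  # 3^3  = 27
print(up(3, 3, 2))  # 3^^3 = 3^(3^3) = 7625597484987
# up(3, 3, 3) would be 3^^7625597484987 -- do not try it.
```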
Hey, don’t laugh. That’s useful to know: now you know any Turing machine with fewer than 10 states is bluffing when it threatens 3^^^^3 specks of sand in your eyes.
Off the cuff, couldn’t it actually accomplish that if it just doesn’t terminate?
If it doesn’t terminate, then I think its threat would be an infinite number of sand specks, wouldn’t it?
How did you conclude that from a lower bound?
I don’t think the lower bound will turn out to be all that far off, and I tossed in another ^ as insurance. Jests don’t have to be air-tight.
I don’t think so, provided you limit yourself to Turing machines constructed of at most a universe’s worth of energy and have unlimited time for the computation. However, that said, invoking such a heavy-duty prior seems to me a bit of a cop-out; you might almost as well invoke “My prior is what Omega told me”. For choosing between Bayesian and frequentist approaches regarding actual problems in the real brain space available to humans, I prefer priors that can be computed in something like a human lifetime. To show that a Bayesian with a prior that would take multiple billion years to compute can do better than a frequentist is not really a very strong result.