Blurgh. That’s one of the most symbol-dense papers I’ve ever seen.
I know. It seems that this is Hutter’s usual style.
But from what I can tell, it specifies that AIXI will eventually reason its way out of any finite Pascal’s Mugging. The higher the hypothesized reward in the Mugging, the longer it will take to converge away from the Mugging, but each failure of reality to conform to the Mugging will push down the probability of that environment being true and thus reduce the expected value of acting according to it. Asymptotic convergence is proven.
Well, the Pascal’s Mugging issue essentially boils down to whether the agent’s decision-making is dominated by the bias in its prior. Clearly, an agent that has seen little or no sensory input can’t possibly have learned anything, and is therefore dominated by its bias. What Hutter proved is that, for reasonable classes of environments, the agent eventually overcomes its bias. There is of course the interesting question of convergence speed, which is not addressed in that paper.
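The wash-out of the prior can be sketched in a few lines. This is a toy model, not Hutter’s formalism: two hypothetical environments, one “mugging” that promises a huge reward but predicts each observation badly, one ordinary environment that predicts well. The likelihoods (0.1 vs. 0.9 per observation) are assumptions chosen for illustration; the point is just that each failed prediction multiplies the posterior odds against the mugging, so its contribution to expected reward decays geometrically.

```python
# Toy sketch (not Hutter's formalism): a "mugging" hypothesis promises a huge
# reward but predicts observations badly; an ordinary hypothesis predicts well.
mugging_prior, normal_prior = 1e-9, 1.0 - 1e-9
mugging_reward, normal_reward = 1e30, 1.0

# Assumed per-observation likelihoods: the mugging environment assigns each
# observed outcome probability 0.1, the normal environment 0.9.
p_obs_mugging, p_obs_normal = 0.1, 0.9

post_m, post_n = mugging_prior, normal_prior
for step in range(50):
    post_m *= p_obs_mugging   # mugging keeps mispredicting
    post_n *= p_obs_normal
    z = post_m + post_n       # renormalize the posterior
    post_m, post_n = post_m / z, post_n / z

# The mugging's contribution to the expected reward of "paying up":
ev_mugging = post_m * mugging_reward
print(post_m, ev_mugging)
```

After 50 observations the posterior odds have been multiplied by (1/9)^50, so even a 10^30 promised reward contributes a negligible amount to the expectation.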
I’d also bet that Hutter’s formalism might consider large rewards generated for little reason to be not merely complex because of the “little reason”, but actually to have greater Kolmogorov complexity simply because large rewards are more complex than small ones. Possibly. So there would be a question of whether the reward of a Mugging grows faster than its probability shrinks in the limit. Eliezer claims it does, but the question is whether our convenient notations for very, very, very large numbers actually imply some kind of simplicity for those numbers, or whether we’re hiding complexity in our brains at that point.
Note that in Hutter’s formalism rewards are bounded between 0 and some r_max. That’s no accident, since if you allow unbounded rewards, the expectation can diverge. Yudkowsky seems to assume unbounded rewards. I think that if you tried to formalize his argument, you would end up attempting to compare infinities. If rewards are bounded, the bias introduced by the fact that the contributions to the expectations from the tail of the distributions don’t exactly cancel out over different actions is eventually washed away as more evidence accumulates.
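The role of the bound can be made concrete with a toy prior of my own devising (the geometric probabilities and 4^k rewards are assumptions, not anything from Hutter): if promised rewards grow faster than their probabilities shrink, the raw expectation blows up, but clipping rewards to [0, r_max] caps the expectation at r_max no matter how heavy the tail is.

```python
# Sketch: rewards bounded to [0, r_max] keep the expectation finite even
# under a prior whose promised rewards outgrow their probabilities.
r_max = 1.0

# Assumed toy prior: environment k has probability 2**-(k+1) and promises
# reward 4**k, so probability shrinks by 2x while reward grows by 4x.
probs = [2.0 ** -(k + 1) for k in range(60)]
raw_rewards = [4.0 ** k for k in range(60)]

# Unclipped expectation: each term is 2**(k-1), so partial sums explode.
unbounded_ev = sum(p * r for p, r in zip(probs, raw_rewards))

# Clipped expectation: every term is at most p * r_max, so the sum is <= r_max.
bounded_ev = sum(p * min(r, r_max) for p, r in zip(probs, raw_rewards))
print(unbounded_ev, bounded_ev)
```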
It seems downright obvious that “3” ought to be considered vastly simpler than “3^^^3”. How large a Turing machine does it take to write down the full expansion of the recursive function for that super-exponentiation?
It’s not really very large.
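As a sketch of why: the recursion defining Knuth’s up-arrow notation fits in a few lines (the helper name `up` is mine), so the description of 3^^^3 has tiny Kolmogorov complexity even though the number itself is astronomical.

```python
# Knuth's up-arrow recursion: n=1 is exponentiation, and each additional
# arrow iterates the previous level. The whole definition is a few lines,
# so the program naming 3^^^3 is short even though its value is enormous.
def up(a, n, b):
    """Compute a (up-arrow^n) b."""
    if n == 1:
        return a ** b
    result = a
    for _ in range(b - 1):
        result = up(a, n - 1, result)
    return result

print(up(3, 1, 3))  # 3^3 = 27
print(up(3, 2, 3))  # 3^^3 = 3^(3^27's base case) = 3^27 = 7625597484987
# up(3, 3, 3), i.e. 3^^^3, is a tower of 7625597484987 threes -- far too
# large to actually evaluate, but its *program* is just the lines above.
```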
The point is that there are computable functions that grow faster than exponential. The Solomonoff prior over the natural numbers (or any set of computable numbers with an infimum but not a supremum) has infinite expectation because of the contribution of these functions. (If the set has neither an infimum nor a supremum, I think the expectation may be finite, positively infinite, or negatively infinite depending on the choice of the universal Turing machine and the number encoding.)
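A crude illustration of the divergence, with the caveat that this is a stand-in for the real Solomonoff prior: weight the output of “description” n by 2^-n, the way a universal prior weights a length-n program. Since a description of length about n can name 2^^n (a tower of n twos), the terms 2^-n · 2^^n themselves grow, so the partial expectations increase without bound.

```python
# Stand-in for the Solomonoff prior: weight the number named by a length-n
# description with 2**-n. Because short descriptions can name tower-sized
# numbers, the weighted terms grow and the expectation diverges.
def tower(n):
    """2^^n: a tower of n twos (tower(0) = 1, tower(3) = 16, ...)."""
    return 1 if n == 0 else 2 ** tower(n - 1)

partial = 0.0
for n in range(1, 5):  # tower(5) = 2**65536 already overflows a float
    term = 2.0 ** -n * tower(n)
    partial += term
    print(n, term, partial)
```

Already at n = 4 the term is 2^-4 · 65536 = 4096; the series has no hope of converging.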
I’ve been discussing this whole thing on Reddit, in parallel, and I think this is the point where I would just give up and say: revert to evidentialism when discussing unbounded potential rewards. Any hypothesis with a plausibility (i.e. my quantity of belief equals its prior, with no evidence accumulated) rather than a probability (i.e. prior plus evidence) nulls out to zero and is not allowed to contribute to expected-utility calculations.
(Actually, what does Bayesian reasoning look like if you separate priors from evidence and consider an empty set of evidence to contribute a multiplier of 0.0, thus exactly nulling out all theories that consist of no evidence but their priors?)