IIUC, AIXI assumes the environment is deterministic.
In other words, it only has epistemic uncertainty.
What if it didn’t make that assumption and/or the assumption was violated?
Has anyone explored this question?
Marcus Hutter’s “Universal Algorithmic Intelligence: A mathematical top→down approach” has this in section 2.4:
If I’m skimming the document correctly (I haven’t read it in any detail), building up the AIµ model is part of later turning it into the AIξ model, which is AIXI. From the end of the section:
And section 4:
Yes. If you generate a bit sequence by flipping a coin, then with high probability AIXI will throw up its hands and say “you can’t model this any better than just recording the sequence, therefore the next bit is 50/50.”
With slight complications, similar arguments apply no matter what distribution you draw the environment from, so the random part correctly gets modeled as a random variable drawn from the right distribution.
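Here’s a toy illustration of that point (not real Solomonoff induction, which is incomputable, and not anything from the AIXI papers; just a small Bayesian mixture over a few hand-picked hypotheses with made-up prior weights). On bits generated by a fair coin, the deterministic hypotheses get falsified and the mixture’s next-bit prediction settles at 0.5:

```python
import random

# Toy stand-in for a universal mixture. Real Solomonoff induction mixes over
# all programs (weighted by 2^-length) and is incomputable; this just mixes a
# few hand-picked hypotheses. Each hypothesis maps a bit history to P(next=1).
def all_zeros(hist):   return 0.0
def all_ones(hist):    return 1.0
def alternating(hist): return 1.0 if len(hist) % 2 == 0 else 0.0
def pure_noise(hist):  return 0.5   # "just record the sequence": next bit 50/50

# Prior weights standing in for 2^(-program length); exact values are arbitrary.
mixture = {all_zeros: 0.3, all_ones: 0.3, alternating: 0.3, pure_noise: 0.1}

def predict(mix, hist):
    """Mixture probability that the next bit is 1."""
    return sum(w * h(hist) for h, w in mix.items()) / sum(mix.values())

def update(mix, hist, bit):
    """Bayes update: reweight each hypothesis by how well it predicted `bit`."""
    return {h: w * (h(hist) if bit == 1 else 1.0 - h(hist)) for h, w in mix.items()}

random.seed(0)
hist = []
for _ in range(50):
    bit = random.getrandbits(1)      # the environment: a fair coin
    mixture = update(mixture, hist, bit)
    hist.append(bit)

print(predict(mixture, hist))        # 0.5: only the noise hypothesis survives
```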
Couldn’t you just treat any ‘stochastic’ environment as hidden-variable theories—actually being a deterministic program with a PRNG appended whose seed you don’t know?
Yes, this is basically what I’m saying: treating the future as random and treating the future as encrypted by a one-time pad you don’t know lead to the same distributions and the same behavior.
This means that if you want to think of Solomonoff induction in terms of random variables, you can, but it turns out that you get back something that’s still equivalent to Solomonoff induction.
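A tiny numerical sketch of that equivalence (my own toy example with an arbitrary base sequence and pad length, not anything from the literature): a fixed deterministic bit string XORed with a uniformly distributed unknown one-time pad, with the pad marginalized out, assigns exactly the same probability to every observation prefix as an i.i.d. fair coin does.

```python
from itertools import product

N = 4                    # pad length for the toy example
base = [1, 1, 0, 1]      # output of some fixed deterministic program

# Model A: "genuinely random" environment, i.i.d. fair-coin bits.
def p_random(prefix):
    return 0.5 ** len(prefix)

# Model B: deterministic environment (base XOR pad) with an unknown one-time
# pad, uniform prior over all 2^N pads, pad marginalized out.
def p_hidden_pad(prefix):
    total = 0.0
    for pad in product([0, 1], repeat=N):
        out = [b ^ p for b, p in zip(base, pad)]
        if out[:len(prefix)] == list(prefix):
            total += 1.0 / 2 ** N
    return total

# Both models assign identical probabilities to every observable prefix.
for prefix in product([0, 1], repeat=3):
    assert abs(p_random(prefix) - p_hidden_pad(prefix)) < 1e-12
print("same predictive distribution")
```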
Yeah that seems right. But I’m not aware of any such work OTTMH.
Is there a reference for this?
I was inspired to think about this by the following puzzle (which I interpret as being about the distinction between epistemic and aleatoric uncertainty):
"""
”To present another example, suppose that five tosses of a given coin are planned and that the agent has equal strength of belief for two outcomes, both beginning with H, say the outcomes HTTHT and HHTTH. Suppose the first toss is made, and results in a head. If all that the agent learns is that a head occurred on the first toss it seems unreasonable for him to move to a greater confidence in the occurrence of one sequence rather than another. The only thing he has found out is something which is logically implied by both propositions, and hence, it seems plausible to say, fails to differentiate between them.
This second example might be challenged along the following lines: The case might be one in which initially the agent is moderately confident that the coin is either biased toward heads or toward tails. But he has as strong a belief that the bias is the one way as the other. So initially he has the same degree of confidence that H will occur as that T will occur on any given toss, and so, by symmetry considerations, an equal degree of confidence in HTTHT and HHTTH. Now if H is observed on the first toss it is reasonable for the agent to have slightly more confidence that the coin is biased toward heads than toward tails. And if so it might seem he now should have more confidence that the sequence should conclude with the results HTTH than TTHT because the first of these sequence has more heads in it than tails.”
Which is right?
"""
What’s striking to me is that the 2nd argument seems clearly correct, but it only works if you make a distinction between epistemic and aleatoric uncertainty, which I don’t think AIXI does. So that makes me wonder if it’s doing something wrong (or if people who use Beta distributions to model coin flips are (!!)).
I don’t have a copy of Li and Vitanyi on hand, so I can’t give you a specific section, but it’s in there somewhere (probably Ch. 3). By “it” here I mean discussion of what happens to Solomonoff induction if we treat the environment as being drawn from a distribution (i.e. having “inherent” randomness).
Neat puzzle! Let’s do the math real quick:
Suppose you have one coin with bias 0.1, and another with bias 0.9. You choose one coin at random and flip it a few times.
Before the first flip, getting 3 H and 2 T seems just as likely as getting 2 H and 3 T, no matter the order: P(HHHTT) = P(HHTTT) = (0.5×0.9³×0.1²)+(0.5×0.9²×0.1³) = 0.00405.
After your first flip, you notice that it’s an H. You now update your probability that you grabbed the heads-biased coin: P(heads bias|H) = (0.5×0.9)/0.5 = 0.9 (the denominator is P(H) = 0.5×0.9 + 0.5×0.1 = 0.5).
Now P(HHTT|H) = (0.9×0.9²×0.1²)+(0.1×0.9²×0.1²) = 0.0081.
And P(HTTT|H) = (0.1×0.9³×0.1)+(0.9×0.9×0.1³) = 0.0081.
Huh, that’s weird.
That’s, like, super unintuitive.
But if you look at the terms for P(HHTT|H) and P(HTTT|H), notice that they both simplify to (0.9³×0.1²)+(0.9²×0.1³). You think it’s more likely that you have the heads-biased coin, but because you know the coin must be biased, the further sequence “HHTT” isn’t as likely as the sequence “HTTT”, and both this difference in likelihood and your probability of which coin you have come down to the same number, the bias of the coin. The two effects cancel exactly, which is why the conditional probabilities come out equal.
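For what it’s worth, the arithmetic checks out; here’s the same Bayes computation redone in Python as a sanity check:

```python
import math

prior = {0.9: 0.5, 0.1: 0.5}   # P(heads) of each coin -> prior weight

def p_seq(seq, weights):
    """P(seq) under the coin mixture; seq is a string like 'HHHTT'."""
    return sum(w * math.prod(b if c == 'H' else 1 - b for c in seq)
               for b, w in weights.items())

print(p_seq('HHHTT', prior), p_seq('HHTTT', prior))        # both ~0.00405

# Condition on the first flip coming up H.
p_H = p_seq('H', prior)                                    # 0.5
posterior = {b: w * b / p_H for b, w in prior.items()}     # {0.9: 0.9, 0.1: 0.1}

print(p_seq('HHTT', posterior), p_seq('HTTT', posterior))  # both ~0.0081
```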