johnswentworth comments on The Presumptuous Philosopher, self-locating information, and Solomonoff induction

johnswentworth 1 Jun 2020 2:36 UTC
6 points
“Asserting that there are n people takes at least K(n) bits, so large universe sizes have to get less likely at some point.”
The problem setup doesn’t necessarily require asserting the existence of n people. It just requires setting up a universe in which n people happen to exist. That could take considerably less than K(n) bits, if person-detection is itself fairly expensive. We could even index directly to the Solomonoff inductor’s input data without attempting to recognize any agents; that would circumvent the K(number of people) issue.
- jessicata 1 Jun 2020 3:39 UTC
  7 points
  If there’s a constant-length function mapping the universe description to the number of agents in that universe, doesn’t that mean K(n) can exceed the Kolmogorov complexity of the universe by at most that constant length?
  
  If it isn’t constant-length, then it seems strange to assume Solomonoff induction would posit a large objective universe, given that such positing wouldn’t help it predict its inputs efficiently (since such prediction requires locating agents).
  
  This still leads to the behavior I’m talking about in the limit: the sum of 1/2^K(n) over all n can be at most 1, so the probability on any particular n has to become arbitrarily small as n grows.
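  A sketch of the two bounds this comment relies on, written out under stated assumptions: K is taken to be prefix-free Kolmogorov complexity, and U (the universe description), f (the agent-counting function) and c_f (the length of a program for f) are symbols introduced here for illustration only. The O(1) term is an assumed cost for composing the two programs.

  ```latex
  % Assumption: n = f(U), where f is computed by a program of length c_f.
  K(n) \;\le\; K(U) + c_f + O(1)

  % Kraft inequality for prefix-free K, and what it implies for large n:
  \sum_{n} 2^{-K(n)} \;\le\; 1
  \quad\Longrightarrow\quad
  \bigl|\{\, n : 2^{-K(n)} \ge \varepsilon \,\}\bigr| \;\le\; \frac{1}{\varepsilon}
  \quad \text{for every } \varepsilon > 0 .
  ```

  So for any threshold ε, all but finitely many n receive weight below ε, even though the weights need not decrease monotonically.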
  - TurnTrout 1 Jun 2020 15:42 UTC
    2 points
    “If it isn’t constant-length, then it seems strange to assume Solomonoff induction would posit a large objective universe, given that such positing wouldn’t help it predict its inputs efficiently (since such prediction requires locating agents).”
    But a Solomonoff inductor doesn’t rank hypotheses on whether they allow efficient predictions of some feature of interest; it ranks them by posterior probability (prior probability combined with how accurately the hypothesis has predicted observations so far).
    - jessicata 1 Jun 2020 15:45 UTC
      2 points
      I mean efficient in terms of number of bits, not computation time. The number of bits contributes to the posterior probability.
  - johnswentworth 1 Jun 2020 13:59 UTC
    2 points
    I’m about 80% on board with that argument.
    The main loophole I see is that number-of-embedded-agents may not be decidable. That would make a lot of sense, since embedded-agent-detectors are exactly the sort of thing which would help circumvent diagonalization barriers. That does run into the second part of your argument, but notice that there’s no reason we need to detect all the agents using a single program in order for the main problem setup to work. They can be addressed one-by-one, by ad-hoc programs, each encoding one of the hypotheses (world model, agent location).
    (Personally, though, I don’t expect number-of-embedded-agents to be undecidable, at least for environments with some kind of private random bit sources.)
    - jessicata 1 Jun 2020 14:48 UTC
      4 points
      At this point it seems simplest to construct your reference class so as to only contain agents that can be found using the same procedure as yourself. Since you have to be decidable for the hypothesis to predict your observations, all others in your reference class are also decidable.
      - johnswentworth 1 Jun 2020 15:00 UTC
        4 points
        Problem is, there isn’t necessarily a modular procedure used to identify yourself. It may just be some sort of hard-coded index. A Solomonoff inductor will reason over all possible such indices by reasoning over all programs, and throw out any which turn out to not be consistent with the data. But that behavior is packaged with the inductor, which is not itself a program.
        - jessicata 1 Jun 2020 15:10 UTC
          4 points
          Yes, I agree. “Reference class” is a property of some models, not all models.
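
The filtering behavior described in the thread (weighting each hypothesis by its length in bits, then throwing out any that are inconsistent with the data) can be illustrated with a toy sketch. This is not a real Solomonoff inductor, which ranges over all programs and is uncomputable. Here each hypothesis is a hand-written (world model, hard-coded index) pair, and every name, world, and bit count below is an assumption made up for illustration.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Hypothesis:
    name: str
    bits: int                      # assumed description length: world model + index
    world: Callable[[int], str]    # returns the first n bits of the toy world
    offset: int                    # hard-coded index where the observer's input starts

    def predict(self, k: int) -> str:
        """First k observation bits this hypothesis predicts."""
        return self.world(self.offset + k)[self.offset:self.offset + k]


def alternating(n: int) -> str:
    return ("01" * n)[:n]


def zeros(n: int) -> str:
    return "0" * n


# Hypothetical hypothesis set; the bit counts are invented, not computed.
HYPOTHESES: List[Hypothesis] = [
    Hypothesis("alt@0", bits=5, world=alternating, offset=0),
    Hypothesis("alt@1", bits=6, world=alternating, offset=1),
    Hypothesis("zeros@0", bits=3, world=zeros, offset=0),
]


def posterior(hypotheses: List[Hypothesis], observed: str) -> Dict[str, Fraction]:
    """Keep hypotheses consistent with the data, weight by 2^-bits, renormalize."""
    weights = {
        h.name: Fraction(1, 2 ** h.bits)
        for h in hypotheses
        if h.predict(len(observed)) == observed
    }
    total = sum(weights.values())
    if total == 0:
        return {}  # nothing in this finite toy set fits the data
    return {name: w / total for name, w in weights.items()}


if __name__ == "__main__":
    print(posterior(HYPOTHESES, "0"))
    print(posterior(HYPOTHESES, "1010"))
```

Note that the index is just part of each hypothesis, with no modular "find the agent" step, and the discarding of inconsistent hypotheses lives in `posterior`, outside any individual hypothesis.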