“Normal” priors are about the comparative value of worlds, with observations only resolving indexical uncertainty about your location among these worlds. In UDT, there is typically an assumption that the agent has effectively unlimited computational resources, so the only purpose of observations is to resolve this indexical uncertainty. A UDT agent works with a fixed collection of possible worlds, and it doesn’t learn anything about these worlds from observation. It devises a general strategy that is evaluated by looking at how it fares at all the locations that use it, across the fixed collection of possible worlds.
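To make that concrete, here is a minimal toy sketch (in Python, with made-up names like `evaluate_policy` and `choose_policy`; none of this is from the text above) of what “evaluate a strategy by how it fares across a fixed collection of worlds” could look like. Each world is just a prior weight plus a function from a complete policy to a utility:

```python
# A toy sketch, not the UDT algorithm itself: each "world" is a prior weight
# plus a function that, given a complete policy, returns the utility achieved
# in that world (it may consult the policy at several locations inside it).
# The names (Policy, World, evaluate_policy, choose_policy) are illustrative.

from typing import Callable, Dict, Hashable, List, Tuple

Observation = Hashable
Action = Hashable
Policy = Dict[Observation, Action]               # complete observation -> action map
World = Tuple[float, Callable[[Policy], float]]  # (prior weight, utility given policy)

def evaluate_policy(policy: Policy, worlds: List[World]) -> float:
    """Prior-weighted utility of running this policy everywhere it appears."""
    return sum(weight * utility(policy) for weight, utility in worlds)

def choose_policy(candidates: List[Policy], worlds: List[World]) -> Policy:
    """Pick the policy that does best across the fixed collection of worlds.
    Nothing here ever updates `worlds`; observations enter only through the
    policy's observation -> action mapping."""
    return max(candidates, key=lambda p: evaluate_policy(p, worlds))
```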
In contrast, logical uncertainty is not about location within the collection of possible worlds; it’s about the state of those worlds, or even about the presence of specific worlds in the collection. The value of any given strategy that responds to observations then depends on the state of logical uncertainty, so evaluating a strategy is not as simple as taking the current epistemic state’s point of view.
A new possibility opens: some observations can communicate not just indexical information, but also logical information (alternatively, information about the state of the collection of possible worlds, not just about location within the worlds of the collection). This possibility calls for something analogous to anthropic reasoning: the fact that an agent observes something tells it something about the big world, not just about which small world it’s located in. Another analogy is value uncertainty: resolving logical uncertainty essentially resolves uncertainty about the agent’s utility definition (and this is another way of generating thought experiments about this issue).
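One way to picture an observation carrying logical information, as a rough sketch with hypothetical names (nothing here is prescribed by the argument above): treat each logical hypothesis as fixing its own collection of worlds, and let the mere fact of making the observation shift weight between hypotheses, in the anthropic-style way just described:

```python
# A rough sketch with hypothetical names: each logical hypothesis fixes its
# own collection of worlds, and merely making an observation shifts weight
# between hypotheses, before any indexical bookkeeping about location
# within a given collection.

def update_logical_weights(hypotheses, observation):
    """hypotheses: list of (prior, worlds, likelihood) triples, where
    likelihood(observation) is the chance that this observation occurs
    anywhere in that hypothesis's collection of worlds.
    Returns renormalized (posterior, worlds) pairs."""
    weighted = [(prior * likelihood(observation), worlds)
                for prior, worlds, likelihood in hypotheses]
    total = sum(w for w, _ in weighted)
    return [(w / total, worlds) for w, worlds in weighted]
```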
So when an agent is on a branch of a strategy that indicates something new about the collection of possible worlds, the agent would evaluate the whole strategy differently than it did when it started out. But when it started out, it could also predict how the expected value of the strategy would look given that hypothetical observation, and given the alternative hypothetical observations. How does it balance these possible points of view? I don’t know, but this is a new problem that breaks UDT’s assumptions, and at least for this puzzle the answer seems to be “don’t pay up”.
Our set of possible worlds comes from somewhere, from some set of criteria. Whatever generates that list passes it to our choice algorithm, which begins branching. Let’s say we receive an observation that contains both a logical and an indexical update: could we not just take our current set of possible worlds, with our current data on them, update the list against the logical update, and pass that list on to a new copy of the function? The collection remains fixed as far as each copy of the function is concerned, but retains the ability to update on new information. When finished, the path returned will be the most likely one given all new observations.
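Here is a rough sketch of that proposal, with entirely hypothetical structure (each world carries its own consistency test for logical facts, and each observation splits into a logical and an indexical part, which the comment above doesn’t specify):

```python
# A rough sketch of the proposal above, with hypothetical structure.
# A world is (weight, utility_fn, consistent_fn); an observation is a pair
# (logical_fact, indexical_info). The logical part prunes the collection,
# and a fresh copy of the chooser treats the pruned list as its fixed prior;
# the indexical part is left to the policy's observation -> action map,
# as in plain UDT. Weights are not renormalized, since that doesn't change
# which candidate wins.

def choose(worlds, candidates, observations):
    if observations:
        (logical_fact, _indexical_info), rest = observations[0], observations[1:]
        pruned = [w for w in worlds if w[2](logical_fact)]  # drop inconsistent worlds
        return choose(pruned, candidates, rest)             # new copy, "fixed" collection
    # base case: evaluate candidates against the current (fixed) collection
    return max(candidates,
               key=lambda p: sum(weight * utility(p) for weight, utility, _ in worlds))
```

This captures the “fixed as far as each copy of the function is concerned” idea, but it doesn’t by itself answer the earlier puzzle of how to balance the pre-update and post-update points of view on the whole strategy.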