Note that Bayesian updating is not done explicitly in this decision theory. When the decision algorithm receives input X, it may determine that some of the programs it has preferences about never call it with X and are also logically independent of its output, and that it can therefore safely ignore them when computing the consequences of a choice. There is no need to set the probabilities of those programs to 0 and renormalize.
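As a rough illustration of that point (the world programs, weights, and utilities below are made-up stand-ins, not part of the theory): a program that is logically independent of the agent's output contributes the same term to the expected utility of every candidate output, so it drops out of the comparison and nothing needs to be renormalized.

```python
# Sketch only: toy choice over a fixed set of world programs.
# Each program maps the agent's chosen output Y to an outcome.

def outcome_dependent(Y):
    # A program whose execution depends on the agent's output on input X.
    return "reward" if Y == "cooperate" else "punishment"

def outcome_independent(Y):
    # A program that never calls the agent on X and is logically
    # independent of its output: same outcome for every Y.
    return "background"

world_programs = {"P1": outcome_dependent, "P2": outcome_independent}
prior = {"P1": 0.3, "P2": 0.7}            # weights over world programs
utility = {"reward": 10, "punishment": 0, "background": 5}

def expected_utility(Y):
    # P2 adds the same constant (0.7 * 5) for every Y, so it cannot
    # change which Y wins; no need to drop it and renormalize.
    return sum(prior[p] * utility[f(Y)] for p, f in world_programs.items())

best_output = max(["cooperate", "defect"], key=expected_utility)
print(best_output)  # -> "cooperate"
```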
But does the Bayesian update occur if the input X affects the relative probabilities of the programs without setting any of these probabilities to 0? If it doesn’t, why not, and how is this change in the distribution over P_i’s taken into account?
ETA: Is the following correct?
If there is only one possible program (P), then there is no need for anything like Bayesian updating: you can just look directly into the program and find the output Y that maximizes utility. When there are multiple possible programs, something like Bayesian updating needs to occur to take into account the effect of outputting Y1 rather than Y2. This is done implicitly when maximizing Sum over i of P_Y1(E_i) U(E_i), since the probability distribution over the E_i's depends on Y.
If all that’s correct, how do you get this distribution?
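(To make the question concrete, here is a toy sketch of the two cases as I understand them; all the programs, distributions, and utilities below are made up.)

```python
# Case 1: a single known program P. Just run it on each candidate
# output and pick the Y whose resulting execution maximizes utility.
def P(Y):
    return "E_good" if Y == "Y1" else "E_bad"

utility = {"E_good": 1.0, "E_bad": 0.0}
best_single = max(["Y1", "Y2"], key=lambda Y: utility[P(Y)])

# Case 2: several possible programs. The distribution over execution
# histories E_i now depends on the chosen output Y, so the "update"
# happens implicitly inside Sum over i of P_Y(E_i) * U(E_i).
def dist_over_histories(Y):
    # P_Y(E_i): how likely each execution history is, given output Y.
    if Y == "Y1":
        return {"E_good": 0.8, "E_bad": 0.2}
    return {"E_good": 0.3, "E_bad": 0.7}

def expected_utility(Y):
    return sum(p * utility[E] for E, p in dist_over_histories(Y).items())

best_multi = max(["Y1", "Y2"], key=expected_utility)
print(best_single, best_multi)  # -> Y1 Y1
```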
Sorry, following the SIAI decision theory workshop, I’ve been working with some of the participants to write a better formulation of UDT to avoid some of the problems that were pointed out during the workshop. It’s a bit hard for me to switch back to thinking about the old formulation and try to explain that, so you might want to wait a bit for the “new version” to come out.
Be sure to consider the possibility of the worlds spontaneously constructing the agent in some epistemic state, or dissolving it. Also, when a (different) agent thinks about our agent, it might access a statement about the agent’s strategy that involves many different epistemic states. For this reason, the agent’s strategy controls many more worlds than those where the agent is instantiated “normally”. This makes the problem of figuring out which of the world programs contain the agent very non-trivial: it depends on what state of the agent we are talking about and what kind of worlds we are considering, and not just on the order in which the agent program expects observations.
These considerations made me write off Bayesian updating as a non-fundamental technique that shouldn’t be shoehorned into a more general decision theory for working with arbitrary preferences. I currently suspect that there is no generally applicable simple trick, and FAI decision theory should instead seek to clarify the conceptual issues, and then work on optimizing brute-force algorithms that follow from that picture. Think abstract interpretation, not variational mean field.
I look forward to it.
I should probably be studying for a linear models exam tomorrow anyway...