I reason as follows:
Omega inspires belief only after the agent encounters Omega.
According to UDT, the agent should not update its policy based on this encounter; it should simply keep following the policy it already has.
Thus the agent should act according to whatever the best policy is, according to its original (e.g. universal) prior from before it encountered Omega (or indeed learned anything about the world).
I think either:
the agent does update, in which case, why not update on the result of the coin-flip? or
the agent doesn’t update, in which case, what matters is simply the optimal policy given the original prior.
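To make the two horns concrete, here is a toy calculation. It is only a sketch under assumptions I am adding for illustration: a fair coin, a reward of 10,000 on heads if the agent's policy is to pay, and a cost of 100 on tails. None of these numbers appear above, and the function names are hypothetical. It also ignores how much weight the original prior gives to Omega existing at all, which is exactly the part that matters on the second horn.

```python
def prior_value(pays, p_heads=0.5, reward=10_000, cost=100):
    """Expected utility of a policy, judged before the coin is flipped.

    All parameters are illustrative assumptions, not taken from the text.
    """
    if not pays:
        return 0.0
    # Heads: Omega rewards a paying policy. Tails: the agent hands over `cost`.
    return p_heads * reward - (1 - p_heads) * cost


def value_after_tails(pays, cost=100):
    """Utility of a policy for an agent that has updated on seeing tails."""
    # Once heads is ruled out, paying is a pure loss.
    return -float(cost) if pays else 0.0


def best_policy(value_fn):
    """Return True (pay) or False (refuse), whichever scores higher."""
    return max([True, False], key=value_fn)


if __name__ == "__main__":
    print("no update:      ", "pay" if best_policy(prior_value) else "refuse")
    print("update on tails:", "pay" if best_policy(value_after_tails) else "refuse")
```

With these assumed numbers the non-updating evaluation favours paying and the updated one favours refusing. The recommendation to pay comes entirely from evaluating the policy under the prior, so it is only as strong as the prior's support for the scenario.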