Interesting! I’m not seeing why this phenomenon would remain when switching to LI, though. Can’t you guess, based on cryptographic assumptions or simple experience, that a particular logical coin is hard? And if the situation is set up so that you can’t guess this, aren’t you right to think it might be easy and to reason accordingly (giving some probability to the predictor figuring out the coin)?
Maybe you’re right and logical induction can quickly become confident that the coin is hard. In that case, encoding “updatelessness” seems easy. The agent should spend a small fixed amount of time choosing a successor program with high expected utility according to LI, and then run that program. That seems to solve both betting on a logical coin (where it’s best for the successor to compute the coin) and counterfactual mugging (where it’s best for the successor to pay up). Though we don’t know what shape the successor will take in general, and you could say figuring that out is the task of decision theory...
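To make the proposal concrete, here is a toy sketch of "pick a successor by expected utility, then run it". It is not a logical inductor: the single number `p_heads` is a hypothetical stand-in for LI's early credence about the coin, and the payoffs (10000 reward, 100 payment, a bet worth 1) are illustrative assumptions rather than anything fixed by the discussion. Each candidate successor is represented only by the utility it would get in each world.

```python
from typing import Callable, Dict

# A successor is modeled only by its payoff in each world:
# payoff(True) if the logical coin is heads, payoff(False) if tails.
Payoff = Callable[[bool], float]


def choose_successor(candidates: Dict[str, Payoff], p_heads: float) -> str:
    """Return the name of the candidate with the highest expected utility.

    p_heads is a hypothetical stand-in for the inductor's credence that
    the coin is heads at the moment the successor is chosen.
    """
    def expected_utility(payoff: Payoff) -> float:
        return p_heads * payoff(True) + (1.0 - p_heads) * payoff(False)

    return max(candidates, key=lambda name: expected_utility(candidates[name]))


# Counterfactual mugging (assumed payoffs): on heads the predictor rewards
# a successor that would have paid on tails; on tails it asks for payment.
mugging: Dict[str, Payoff] = {
    "pay": lambda heads: 10000.0 if heads else -100.0,
    "refuse": lambda heads: 0.0,
}

# Betting on the logical coin (assumed payoff 1 for a correct bet):
# a successor that computes the coin always wins, a fixed guess wins
# only in the world where the guess happens to be right.
coin_bet: Dict[str, Payoff] = {
    "compute_coin": lambda heads: 1.0,
    "always_guess_heads": lambda heads: 1.0 if heads else 0.0,
}

# The choice is made before the coin is known, so the credence is 0.5.
mugging_choice = choose_successor(mugging, p_heads=0.5)
coin_bet_choice = choose_successor(coin_bet, p_heads=0.5)
```

Under these assumed numbers the chooser picks the paying successor in the mugging and the coin-computing successor in the bet. The sketch dodges the hard part, which is that the candidate set and their payoffs are handed to it rather than discovered.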
Another problem with this version of “updatelessness” is that if you run the logical inductor for only a short time before selecting the policy, the chosen policy could be terrible, since the inductor’s early belief state could be quite bad. There seems to be at least something to say about what makes a “good” trade-off between running too long before choosing the policy (so not being updateless enough) and running too briefly (so not knowing enough to choose policies wisely).