It seems like trapped priors and commitment races are exactly the sort of cognitive dysfunction that updatelessness would solve in generality.
My understanding is that trapped priors are a symptom of a dysfunctional epistemology, which over-weights prior beliefs when updating on new observations. This results in an agent getting stuck, or even getting more and more confident in their initial position, regardless of what observations they actually make.
Similarly, commitment races are the result of dysfunctional reasoning that regards accurate information about other agents as hazardous. It seems like the consensus is that updatelessness is the general solution to infohazards.
My current model of an “updateless decision procedure”, approximated on a real computer, is something like “a policy which is continuously optimized as an agent has more time to think, where the agent always acts according to the best policy it’s found so far.” And I like the model you use in your report, where an ecosystem of participants collectively optimizes a data structure used to make decisions.
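To make that model concrete, here is a minimal sketch of the "best policy found so far" loop. Everything in it is my own illustration rather than anything from the report: `AnytimeAgent`, `think`, `act`, and `toy_score` are hypothetical names, and the scoring function is a made-up stand-in for whatever fixed criterion the agent actually uses.

```python
from typing import Callable, Dict

# A policy maps each possible observation to an action.
Policy = Dict[str, str]


class AnytimeAgent:
    """Keeps the best-scoring policy seen so far (hypothetical sketch)."""

    def __init__(self, score: Callable[[Policy], float], default: Policy):
        self.score = score            # fixed criterion; never changes
        self.best = default           # policy the agent currently acts on
        self.best_score = score(default)

    def think(self, candidate: Policy) -> bool:
        """Evaluate one more candidate; adopt it only if it scores higher."""
        candidate_score = self.score(candidate)
        if candidate_score > self.best_score:
            self.best, self.best_score = candidate, candidate_score
            return True
        return False

    def act(self, observation: str) -> str:
        """Act according to the best policy found so far."""
        return self.best[observation]


def toy_score(policy: Policy) -> float:
    # Assumed toy criterion: one point per observation mapped to "respond-<obs>".
    return sum(1.0 for obs, action in policy.items() if action == "respond-" + obs)


agent = AnytimeAgent(toy_score, {"a": "idle", "b": "idle"})
for candidate in [
    {"a": "respond-a", "b": "idle"},
    {"a": "idle", "b": "idle"},
    {"a": "respond-a", "b": "respond-b"},
]:
    agent.think(candidate)
```

Because the criterion is fixed and a candidate is adopted only when it scores strictly higher, more thinking time can never make the adopted policy worse by that criterion.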
Since updateless agents use a fixed optimization criterion for evaluating policies, we can use something like an optimization market to optimize an agent’s policy. It seems easy to code up traders that identify “policies produced by (approximations of) Bayesian reasoning”, which I suspect won’t be subject to trapped priors.
So updateless agents seem like they should be able to do at least as well as updateful agents, because they can identify updateful policies and use those if they seem optimal. But they can also use different reasoning to identify policies like “pay Paul Ekman to drive you out of the desert”, and automatically adopt those when they lead to higher EV than updateful policies.
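Here is a rough sketch of that argument on the desert example. The numbers and the model are assumptions of mine: the driver predicts your policy with some `accuracy`, rescues you only if he predicts you will pay, being left in the desert is worth 0, and `life` and `fare` are arbitrary values. The function names are hypothetical.

```python
from typing import Iterable


def expected_value(policy: str, accuracy: float, life: float, fare: float) -> float:
    """Ex ante EV of a policy ("pay" or "refuse") against an imperfect predictor."""
    if policy == "pay":
        # Rescued when the driver correctly predicts payment; then the fare is paid.
        return accuracy * (life - fare)
    # Rescued only when the driver wrongly predicts payment; no fare is paid.
    return (1.0 - accuracy) * life


def updateful_policy() -> str:
    # Once already in town, paying is a pure cost, so an updateful agent refuses.
    return "refuse"


def updateless_choice(candidates: Iterable[str], accuracy: float,
                      life: float, fare: float) -> str:
    """Pick the candidate with the highest ex ante EV."""
    return max(candidates, key=lambda p: expected_value(p, accuracy, life, fare))


candidates = ["pay", updateful_policy()]
good_predictor = updateless_choice(candidates, accuracy=0.99, life=1e6, fare=1000.0)
coin_flip = updateless_choice(candidates, accuracy=0.5, life=1e6, fare=1000.0)
```

With a good predictor the search selects "pay"; with a coin-flip predictor it falls back to the updateful "refuse". Since the updateful policy is always among the candidates, the maximum can never score below it under this criterion.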
I suspect that the generalization of updatelessness to multi-agent scenarios will involve optimizing over the joint policy space, using a social choice theory to score joint policies. If agents agree at the meta level about “how conflicts of interest should be resolved”, then that seems like a plausible route for them to coordinate on socially optimal joint policies.
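As a toy illustration of what I mean by scoring joint policies, here is a sketch on a two-player game of chicken. The payoff numbers are made up, and the two welfare functions (utilitarian sum and Nash product) are just stand-ins I picked for "a social choice theory"; I am not claiming either is the right one.

```python
from itertools import product
from typing import Callable, Dict, Sequence, Tuple

JointPolicy = Tuple[str, str]
Utilities = Tuple[float, float]

# Assumed payoffs for a game of chicken: (row player's utility, column player's utility).
PAYOFFS: Dict[JointPolicy, Utilities] = {
    ("swerve", "swerve"): (3, 3),
    ("swerve", "dare"): (1, 4),
    ("dare", "swerve"): (4, 1),
    ("dare", "dare"): (0, 0),
}


def utilitarian(utilities: Utilities) -> float:
    """Score a joint outcome by the sum of utilities."""
    return sum(utilities)


def nash_product(utilities: Utilities) -> float:
    """Score a joint outcome by the product of utilities (disagreement point 0)."""
    result = 1.0
    for u in utilities:
        result *= u
    return result


def best_joint_policy(actions: Sequence[str],
                      payoffs: Dict[JointPolicy, Utilities],
                      welfare: Callable[[Utilities], float]) -> JointPolicy:
    """Search the whole joint policy space and return the highest-welfare entry."""
    return max(product(actions, repeat=2), key=lambda joint: welfare(payoffs[joint]))


actions = ("swerve", "dare")
by_sum = best_joint_policy(actions, PAYOFFS, utilitarian)
by_nash = best_joint_policy(actions, PAYOFFS, nash_product)
```

Both welfare functions select mutual swerving here. The point is that each agent evaluates the same joint policies with the same agreed-upon scoring rule, rather than best-responding to a model of the other agent.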
I think this approach also avoids the skyrocketing complexity problem, if I understand the problem you’re pointing to. (I think it involves trying to best-respond to another agent’s cognition, which gets more difficult as that agent becomes more complicated.)
Got it, thank you!