Desideratum 1: There should be a sensible notion of what it means to update a set of environments or a set of distributions, which should also give us dynamic consistency.
I’m not sure how important dynamic consistency should be. When I talk about model splintering, I’m thinking of a bounded agent making fundamental changes to their model (though possibly gradually), a process that is essentially irreversible and contingent on the circumstances of discovering new scenarios. The strongest arguments for dynamic consistency are the Dutch-book-type arguments, which depend on returning to a scenario very similar to the starting one, and those seem absent from model splintering as I’m imagining it.
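To illustrate the kind of argument I mean, here is a toy money-pump sketch (made-up numbers, not anything from the formalism): the exploitation only works because the agent keeps returning to essentially the same decision point.

```python
# Toy money pump against a dynamically inconsistent agent (illustrative only).
# The agent's preference between tickets A and B flips each time it revisits
# the "same" decision point; a bookie charges a small fee per swap and so
# extracts money on every revisit. If the agent never returns to a similar
# scenario, the pump never gets going.

def preferred_ticket(visit: int) -> str:
    # Stand-in for an inconsistent agent: its ranking alternates between visits.
    return "B" if visit % 2 == 0 else "A"

FEE = 0.01          # fee the bookie charges per swap
wealth = 0.0
held = "A"

for visit in range(10):             # ten returns to a near-identical scenario
    want = preferred_ticket(visit)
    if want != held:                # each swap looks like an improvement at the time...
        wealth -= FEE               # ...but costs a fee
        held = want

print(f"Extracted by the bookie over 10 revisits: {-wealth:.2f}")  # 0.10
```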
Now, adding dynamic inconsistency isn’t useful in itself; it just seems that removing all of it (especially for a bounded agent) isn’t worth the effort.
Is there some form of “don’t lose too much utility to dynamic inconsistency” requirement that could be formalised?
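One candidate phrasing, in my own notation (a sketch, not something taken from the infra-Bayesian formalism): bound the utility lost, judged from the initial viewpoint, by letting the agent re-plan after every update,

$$\mathbb{E}_{\xi}\big[U(\pi_{\mathrm{actual}})\big] \;\ge\; \sup_{\pi}\,\mathbb{E}_{\xi}\big[U(\pi)\big] \;-\; \epsilon,$$

where $\xi$ is the agent’s initial belief (or credal set, reading $\mathbb{E}$ as the corresponding infra-expectation), $\pi_{\mathrm{actual}}$ is the policy the agent actually ends up executing when it re-optimises after each update, and $\epsilon = 0$ recovers full dynamic consistency.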
I’m not sure why we would need a weaker requirement if the formalism already satisfies a stronger one. Certainly, when designing concrete learning algorithms we might want to use some kind of simplified update rule, but I expect that to be contingent on the type of algorithm and the design constraints. We do have some speculations in that vein; for example, I suspect that, for communicating infra-MDPs, an update rule that forgets everything except the current state would only lose something like O(1−γ) expected utility.
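A rough back-of-the-envelope for where a bound of that shape could come from (my reconstruction of the intuition, not the actual infra-MDP argument): in a communicating MDP with diameter $D$ and rewards in $[0,1]$, writing $\hat V^{\pi}_{\gamma}(s) = (1-\gamma)\,\mathbb{E}\big[\sum_{t \ge 0} \gamma^{t} r_{t}\big]$ for the normalised value, the optimal values of any two states satisfy

$$\big|\hat V^{*}_{\gamma}(s) - \hat V^{*}_{\gamma}(s')\big| \;\le\; 1-\gamma^{D} \;\le\; D\,(1-\gamma),$$

since from $s$ one can travel to $s'$ in at most $D$ steps (forgoing at most that much reward) and then play optimally from $s'$. Whatever the forgotten information would have bought can likewise be re-earned within roughly the diameter, which is $O(1-\gamma)$ in normalised utility for fixed $D$.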
I want a formalism capable of modelling and imitating how humans handle these situations, and we don’t usually have dynamic consistency (nor do boundedly rational agents).
Now, I don’t want to weaken requirements “just because”, but it may be that dynamic consistency is too strong a requirement to properly model what’s going on. It’s also useful to have AIs model human changes of morality, to figure out what humans count as values, so getting closer to human reasoning seems necessary.
Boundedly rational agents definitely can have dynamic consistency; I guess it depends on just how bounded you want them to be. IIUC, what you’re looking for is a model that can formalize “approximately rational but doesn’t necessarily satisfy any crisp desideratum”. In that case, I would use something like my quantitative AIT definition of intelligence.
I don’t know; we’re hunting for it. Relaxations of dynamic consistency would be extremely interesting if found, and I’ll let you know if we turn up anything nifty.
Hum… how about seeing enforcement of dynamic consistency as having a complexity/computation cost, and Dutch books (by other agents or by the environment) providing incentives to pay the cost? And hence the absence of these Dutch books meaning there is little incentive to pay that cost?
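As a toy version of that trade-off (made-up numbers, purely illustrative): enforce consistency only when the expected Dutch-book exploitation exceeds the enforcement cost.

```python
# Toy cost-benefit rule for the suggestion above (illustrative numbers only):
# pay the complexity/computation cost of enforcing dynamic consistency exactly
# when the expected loss to Dutch books would otherwise exceed that cost.

def worth_enforcing(p_revisit: float, loss_per_book: float,
                    n_opportunities: int, enforcement_cost: float) -> bool:
    """True iff expected exploitation by Dutch books exceeds the cost of consistency."""
    expected_exploitation = p_revisit * loss_per_book * n_opportunities
    return expected_exploitation > enforcement_cost

# Under model splintering as described above, near-identical scenarios rarely
# recur, so the incentive to pay the cost is weak:
print(worth_enforcing(p_revisit=0.01, loss_per_book=0.05,
                      n_opportunities=10, enforcement_cost=0.5))   # -> False
```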