Have you heard about Infra-Bayesianism?
If I understand it correctly, the core idea is to consider every possible scenario and use a maximin policy while still caring about counterfactual branches, which is very similar to the idea presented in the linked post. The “Nirvana trick” in the other post amounts to eliminating the branches/cells in which the agent would take a different action from the predicted policy.
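To make that concrete, here is a rough toy sketch of how I picture it, not the actual Infra-Bayesian formalism. It assumes a Newcomb-style payoff table (the numbers, the names `UTILITY`, `worst_case`, and `maximin` are all made up for illustration). Plain maximin takes the worst case over all predictions; the Nirvana-style variant gives infinite utility to any cell where the action differs from the prediction, so those cells can never be the worst case and are effectively eliminated.

```python
import math

# Hypothetical Newcomb-style payoffs, keyed by (action, prediction).
# These numbers are illustrative assumptions, not taken from either post.
UTILITY = {
    ("one_box", "one_box"): 1_000_000,
    ("one_box", "two_box"): 0,
    ("two_box", "one_box"): 1_001_000,
    ("two_box", "two_box"): 1_000,
}
ACTIONS = ["one_box", "two_box"]
PREDICTIONS = ["one_box", "two_box"]


def worst_case(action, nirvana):
    """Worst-case utility of `action` over all predictions.

    With nirvana=True, cells where the action deviates from the
    prediction get infinite utility, so the minimum never selects them.
    """
    values = []
    for prediction in PREDICTIONS:
        if nirvana and action != prediction:
            values.append(math.inf)  # mispredicted branch: "Nirvana"
        else:
            values.append(UTILITY[(action, prediction)])
    return min(values)


def maximin(nirvana):
    """Pick the action whose worst case is best."""
    return max(ACTIONS, key=lambda a: worst_case(a, nirvana))


print(maximin(nirvana=False))  # two_box
print(maximin(nirvana=True))   # one_box
```

Without the trick, the worst case for one-boxing is the mispredicted cell (0), so maximin two-boxes. With it, only the correctly predicted cells remain, and one-boxing wins.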
Non-Nashian Game Theory yields Pareto-optimal outcomes, and Infra-Bayesianism implements Updateless Decision Theory. If the two are connected, that could mean that UDT and Pareto optimality are connected too.