Maybe related: a paper likely to get an oral at ICLR 2024. I haven’t read it, but I think it substantially improves on the Good Regulator Theorem. I think their Theorem 1 shows that from an optimal policy you can identify (deduce) the exact causal model of the data-generating process, and Theorem 2 shows that from a policy satisfying regret bounds you can identify an approximate causal model. The assumptions are far weaker and more realistic than the Good Regulator Theorem's requirement that the policy be the simplest one that can perfectly regulate some variable.
Robust agents learn causal world models
[...]
We prove that agents that are capable of adapting to distributional shifts must have learned a causal model of their environment, establishing a formal equivalence between causality and transfer learning.
Here’s the arXiv version of the paper, which has a lot more content in the appendices.