The way I see it, the primary value of this work (as well as other CID work) is conceptual clarification. Causality is a really fundamental concept, which many other AI-safety relevant concepts build on (influence, response, incentives, agency, …). The primary aim is to clarify the relationships between concepts and to derive relevant implications. Whether there are practical causal inference algorithms or not is almost irrelevant.
This note seems to run counter to the fact that, in the paper, you repeatedly use language like “causal experiments”, “empirical data”, “real systems”, etc.
[Paper’s contributions] ground game graph representations of agents in causal experiments. These experiments can be applied to real systems, or used in thought-experiments to determine the correct game graph and resolve confusions (see Section 4).
Sorry, I worded that slightly too strongly. It is important that causal experiments can in principle be used to detect agents. But to me, the primary value of this isn’t that you can run a magical algorithm that lists all the agents in your environment. That’s not possible, at least not yet. Instead, the primary value (as I see it) is that the experiment could be run in principle, thereby grounding our thinking. This often helps, even if we’re not actually able to run the experiment in practice.
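To make the in-principle experiment concrete, here is a toy sketch (my own illustrative setup, not the paper’s algorithm): intervene on the mechanism that maps actions to utility, and check whether the system’s behaviour adapts. A system that adapts is, on this criterion, agent-like; a fixed mechanism is not.

```python
# Hypothetical toy version of the "detect agents via intervention" experiment.
# An "agent" here is a system that would change its policy if we intervened
# on how its actions lead to utility.

def agent_policy(utility):
    # Picks whichever of two actions maximizes the (possibly intervened-on) utility.
    return max([0, 1], key=utility)

def thermostat_policy(utility):
    # A fixed mechanism: always outputs action 0, regardless of utility.
    return 0

def adapts_to_intervention(policy):
    """Intervene on the action -> utility mechanism and check whether
    the system's chosen action changes in response."""
    original = lambda a: 1 if a == 1 else 0  # action 1 is rewarded
    flipped = lambda a: 1 if a == 0 else 0   # intervention: reward flipped
    return policy(original) != policy(flipped)

print(adapts_to_intervention(agent_policy))       # True: the optimizer adapts
print(adapts_to_intervention(thermostat_policy))  # False: the fixed rule does not
```

Even if we could never run this intervention on a real system, having it written down pins down what “being an agent” is supposed to mean.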
I interpreted your comment as “CIDs are not useful, because causal inference is hard”. I agree that causal inference is hard, and unlikely to be automated anytime soon. But to me, automatic inference of graphs was never the intended purpose of CIDs.
Instead, the main value of CIDs is that they help make informal, philosophical arguments crisp, by making assumptions and inferences explicit in a simple-to-understand formal language.
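The “assumptions made explicit” point can be illustrated with a minimal data-structure sketch of a CID (node names are made up for illustration): once the diagram is written down, a modelling assumption like “the decision observes X” has to appear as a concrete edge, where it can be inspected and challenged.

```python
# Minimal illustrative encoding of a CID with one chance node X,
# one decision node D, and one utility node U.
cid = {
    "nodes": {"X": "chance", "D": "decision", "U": "utility"},
    "edges": [
        ("X", "D"),  # information link: D observes X -- an explicit assumption
        ("X", "U"),
        ("D", "U"),
    ],
}

def observes(cid, decision, variable):
    """A decision can only respond to what it observes,
    and what it observes is explicit in the graph."""
    return (variable, decision) in cid["edges"]

print(observes(cid, "D", "X"))  # True: the assumption is on the page
print(observes(cid, "D", "U"))  # False: D does not observe its own utility
```

The formal object itself is simple; the value is that disagreements about an informal argument can be relocated to disagreements about specific nodes and edges.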
So it’s from this perspective that I’m not overly worried about the practicality of the experiments.
TLDR: Causality > Causal inference :)