Suggestion for a different summary of my post:
Finite Factored Sets are a re-framing of causality: they take us away from causal graphs and use a structure based on set partitions instead. “Finite Factored Sets in Pictures” summarizes and explains how that works. The language of finite factored sets seems useful for talking about and re-framing fundamental alignment concepts like embedded agents and decision theory.
I’m not completely happy with
Finite factored sets are a new way of representing causality that seems to be more capable than Pearlian causality, the state-of-the-art in causality analysis. This might be useful to create future AI systems where the causal dynamics within the model are more interpretable.
because
I wouldn’t say finite factored sets are about interpretability. I think the primary reason they are cool is that they give us a different language to talk about causality, and thereby also about fundamental alignment concepts like embedded agents and decision theory.
Also, it sounds a bit like my post introduces finite factored sets (even though you don’t say that explicitly), but it’s just a distillation of existing work.
The summary has been updated to yours for both the public newsletter and this LW linkpost. And yes, they seem exciting. Connecting FFS to interpretability was a way to contextualize it in this case, until you provide more thoughts on the use case (given the last paragraph of your post). Thank you for writing, I always appreciate the feedback!