One of the primary motivations of Finite Factored Sets (shorthand: FFS) that initially caught my eye was, to quote Scott:
“Given a collection of variables and a joint probability distribution over those variables, Pearl can infer causal/temporal relationships between the variables.” The words “Given a collection of variables” are actually hiding a lot of the work.
And this becomes apparent in the toy model Magdalena analyzes at the end of her distillation post: taking variables as primitives (as in the Pearl framework) means you have to make arguments about ‘whether deterministic collapse occurs’, rather than having the variables arise naturally, as they do in finite factored sets.
So:
Is anyone looking into finding an efficient algorithm for scaling finite factored sets up to larger regimes, where you could discover a dozen (or even hundreds of) causal variables from data?
I think the FFS way to express this would be a set with large cardinality and variety?
Or, more intuitively speaking: are there “takeaways” from the FFS perspective that could be tacked on to the traditional line of thinking about causal representation?
EDIT: It may help to know that my motivation is “Can we apply an FFS algorithm for causal representation learning to learn objects (and physics) from a video? Or (more directly for alignment), to identify latent concepts embedded in an LLM?”
I’m working on the FFS framework in general. I’m currently writing up the decidability of finite temporal inference. After this I will probably start working on efficient finite temporal inference, which is what you’re referencing, if I understood correctly.
I’m also working on extending the framework to the infinite setting and am almost finished except for conditional orthogonality for uncountable sets.
I quite like the name Logical Time Theory, under which I will probably publish those results in a month or so.
Hmm, what would be the intuition/application behind the uncountable setting? Like, when would one want that (I don’t mind if it’s niche, I’m just struggling to come up with anything)?
A direct application would require having an uncountable variable. You might want to do this if you have enough evidence to say so confidently. As a simple example, imagine a plot of real-valued data where all your points lie almost exactly on the identity diagonal y = x; you might then want to infer a variable which is the identity.
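Here is a minimal sketch of the kind of data I mean (in Python, with an arbitrary illustrative noise level; nothing here is FFS machinery, just the setup):

```python
import numpy as np

# Toy data: n real-valued points lying almost on the identity diagonal y = x.
rng = np.random.default_rng(0)
n = 1_000
x = rng.uniform(-1.0, 1.0, size=n)
y = x + rng.normal(scale=0.01, size=n)  # small noise off the diagonal

# The near-perfect empirical correlation is the kind of evidence that might
# justify inferring an exact, uncountable (real-valued) identity variable z
# with x = z and y = z, rather than two merely-correlated finite variables.
print(f"corr(x, y) = {np.corrcoef(x, y)[0, 1]:.4f}")  # ~0.999...
```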
As a more general application, we want to model infinities because the world is probably infinite in some aspects. We then want a theorem telling us that even if the underlying model is infinite, having enough data points gets you close enough, as with the Strong Law of Large Numbers, for example.
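For the convergence intuition (again just a Python sketch, not anything specific to the infinite FFS setting): sample means of an infinite-support variable approach the true mean as the number of data points grows.

```python
import numpy as np

# Strong Law of Large Numbers, numerically: sample means of a standard
# exponential distribution (true mean 1.0) converge to the true mean as n grows.
rng = np.random.default_rng(1)
for n in (10, 100, 10_000, 1_000_000):
    mean = rng.exponential(scale=1.0, size=n).mean()
    print(f"n = {n:>9,}: sample mean = {mean:.4f}  (true mean = 1.0)")
```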