Two questions:
What exactly is the #P-complete problem you ran into?
What is the precise mathematical statement of the “Telephone Theorem”? I couldn’t find it in the linked post.
The #P-complete problem is to compute the conditional distribution of some variables in a Bayes net given some other variables in the net, with no restrictions on the net's structure or on which variables are chosen.
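To make the problem concrete, here is a brute-force version of that query on a made-up toy net (the chain structure and CPT numbers are illustrative, not from the thread); the point is that naive enumeration is exponential in the number of hidden variables, which is why the unrestricted problem is hard:

```python
from itertools import product

# Toy 4-node chain Bayes net A -> B -> C -> D, all binary (made-up CPTs).
P_A1 = 0.3                  # P(A=1)
P_B1 = {0: 0.2, 1: 0.9}     # P(B=1 | A=a)
P_C1 = {0: 0.1, 1: 0.8}     # P(C=1 | B=b)
P_D1 = {0: 0.4, 1: 0.7}     # P(D=1 | C=c)

def joint(a, b, c, d):
    """Probability of one full assignment: product of CPT entries."""
    def bern(p, x):
        return p if x else 1 - p
    return (bern(P_A1, a) * bern(P_B1[a], b) *
            bern(P_C1[b], c) * bern(P_D1[c], d))

def query(d, a):
    """P(D=d | A=a) by summing the joint over the hidden variables B, C.
    This enumeration is exponential in the number of hidden variables,
    which is what makes the unrestricted problem #P-hard."""
    num = sum(joint(a, b, c, d) for b, c in product((0, 1), repeat=2))
    den = sum(joint(a, b, c, d2) for b, c, d2 in product((0, 1), repeat=3))
    return num / den
```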
Formal statement of the Telephone Theorem: We have a sequence of Markov blankets forming a Markov chain M_1 → M_2 → …. Then in the limit n → ∞, f_n(M_n) mediates the interaction between M_1 and M_n (i.e. the distribution factors according to M_1 → f_n(M_n) → M_n), for some f_n satisfying

f_n(M_n) = f_{n+1}(M_{n+1})

with probability 1 in the limit.
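The information decay behind the theorem can be seen in a toy chain of binary symmetric channels (the channel and numbers here are illustrative, not from the post): the mutual information between M_1 and M_n falls monotonically toward whatever is perfectly conserved along the chain, which for a noisy channel is nothing at all.

```python
import math

def binary_mi(p_joint):
    """Mutual information (bits) of a joint distribution over two binary vars,
    given as a 2x2 nested list p_joint[a][b]."""
    pa = [sum(p_joint[a]) for a in (0, 1)]
    pb = [p_joint[0][b] + p_joint[1][b] for b in (0, 1)]
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            p = p_joint[a][b]
            if p > 0:
                mi += p * math.log2(p / (pa[a] * pb[b]))
    return mi

def chain_mi(n, flip=0.1, p1=0.5):
    """I(M_1; M_n) when each step M_i -> M_{i+1} is a binary symmetric
    channel with the given flip probability. Composing n-1 BSC steps
    gives a BSC whose flip probability q we track exactly."""
    q = 0.0
    for _ in range(n - 1):
        q = q * (1 - flip) + (1 - q) * flip
    joint = [[(p1 if a else 1 - p1) * ((1 - q) if a == b else q)
              for b in (0, 1)] for a in (0, 1)]
    return binary_mi(joint)
```

Consistent with the data processing inequality, chain_mi(n) is nonincreasing in n and tends to zero, since nothing is conserved with probability 1 through a noisy channel.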
Given a network to analyze, have you considered training a GAN to generate some requested activations given some other activations? That should give straightforward estimates for the likes of mutual information between modules, and would be useful anyway to illustrate the function of a given module by generating variants of an input with the same activations in that module.
I'm not sure what to do about that mutual information often being infinite, though. Counting the dimension of the “mutual space” seems too discrete...
In case you’re fine with an approximation, you could try modelling the #P problem as a CNF (check this paper for more info) and using an approximate model counter such as ApproxMC (https://github.com/meelgroup/approxmc).
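For concreteness, here is the model-counting problem that ApproxMC approximates, on a tiny made-up CNF, counted exactly by enumeration (the actual weighted encoding of a Bayes net into CNF is what the linked paper covers and is omitted here):

```python
from itertools import product

# A tiny CNF over 3 variables in DIMACS-style clause lists (positive int
# = variable, negative = its negation):
#   (x1 or x2) and (not x1 or x3) and (x2 or not x3)
clauses = [[1, 2], [-1, 3], [2, -3]]

def count_models(clauses, n_vars):
    """Exact #SAT by enumeration -- the quantity ApproxMC approximates.
    Feasible only for tiny formulas; realistic encodings need the
    approximate counter."""
    count = 0
    for assignment in product((False, True), repeat=n_vars):
        if all(any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            count += 1
    return count
```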