I found this clear and useful, thanks, particularly the notes about compositional structure. For what it’s worth, I’ll repeat here a comment from ILIAD: there seems to be something in the direction of SAEs, approximate sufficient statistics/information bottleneck, the work of Achille-Soatto, and SLT (Section 5, iirc). I had looked into this after talking with Olah and Wattenberg about feature geometry, but it isn’t currently a high priority for us. Somebody might want to pick that up.
Are you suggesting that there should be a formula similar to the one in Proposition 5.1 (or 5.2) that links information about the activations, I(z;x)+TC(z), with the LC as a measure of basin flatness?
Something in that direction, yeah