Yes, I plan to write a sequence about it some time in the future, but here are some rough high-level sketches:
Basic assumptions: Modularity implies that the program can be broken down into loosely coupled components. For now I’ll just assume that each component has some “class definition” which specifies how it interacts with other components; “class definitions” can be reused (aka we can instantiate multiple components of the same class); and each component can aggregate info from other components, while the info it stores can be used by other components
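To make the setup concrete, here is a very rough toy sketch of those assumptions (the names and structure are purely illustrative, not a claim about the actual implementation):

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List

@dataclass
class ClassDefinition:
    """Reusable spec: how instances of this class interact with other components."""
    name: str
    aggregate: Callable[[List[Any]], Any]  # how an instance combines incoming info

@dataclass
class Component:
    definition: ClassDefinition
    inputs: List["Component"] = field(default_factory=list)
    state: Any = None

    def step(self) -> Any:
        # Aggregate info stored by upstream components; the result is in turn
        # readable by any component that lists this one as an input.
        self.state = self.definition.aggregate([c.state for c in self.inputs])
        return self.state

# Reusing a class definition: two components instantiated from the same spec.
summing = ClassDefinition("sum", aggregate=lambda xs: sum(x or 0 for x in xs))
a, b = Component(summing), Component(summing)
b.inputs.append(a)
```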
Expressive modularity: A problem with modularity is that it cuts off information flow between certain components, but before we learn about the world we don’t know which components are actually independent, & the modularity structure of the environment might change over time, so we need to account for that.
As a basic framework, we can think of each component as having transformer-style attention values over other components; modularity means that we want the “attention values” (mutual info) to be as sparse as possible
Expressivity means that those “attention values” should be context dependent (they are functions of aggregate information from other components)
A consequence of this is that we can have variables that encode the modularity structure of the environment, which in turn influence the attention values (mutual info) of other variables
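As a toy illustration of what I mean by context-dependent sparse attention (just a sketch; the specific gating and thresholding scheme is illustrative, not a proposal for the real mechanism):

```python
import numpy as np

def sparse_attention(query, keys, structure_gate, temperature=1.0, threshold=0.05):
    """Toy context-dependent attention of one component over others.

    `structure_gate` plays the role of a variable that encodes the modularity
    structure: it masks which components are even allowed to influence this one,
    and it can itself be computed from other components' states. Small weights
    are thresholded to zero so the realized dependency pattern stays sparse.
    """
    scores = keys @ query / temperature
    weights = np.exp(scores - scores.max()) * structure_gate  # context-dependent gating
    weights = weights / (weights.sum() + 1e-9)
    weights[weights < threshold] = 0.0                         # enforce sparsity
    return weights / (weights.sum() + 1e-9)

# A "structure" variable that says only components 0 and 2 may influence us:
gate = np.array([1.0, 0.0, 1.0, 0.0])
w = sparse_attention(query=np.ones(4), keys=np.random.randn(4, 4), structure_gate=gate)
```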
One example is the Eulerian vs Lagrangian description of fluid flow: the Eulerian description has a fixed modularity structure because each region of space has a fixed Markov blanket, but the Lagrangian description has a dynamic modularity structure because “what particles are directly influenced by what other particles” depends on the positions of the particles, which change over time. We want our program to be able to accommodate both types of descriptions
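Here is roughly how I picture that difference in code (a toy sketch, nothing fluid-dynamics-specific; the two functions just return “who can directly influence whom”):

```python
import numpy as np

def eulerian_neighbours(nx, ny):
    """Fixed modularity: each grid cell's Markov blanket is a fixed set of
    adjacent cells, independent of the fluid's state."""
    return {(i, j): [(i + di, j + dj)
                     for di, dj in [(-1, 0), (1, 0), (0, -1), (0, 1)]
                     if 0 <= i + di < nx and 0 <= j + dj < ny]
            for i in range(nx) for j in range(ny)}

def lagrangian_neighbours(positions, radius):
    """Dynamic modularity: which particles directly influence which depends on
    the current positions, which change every timestep."""
    dist = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    close = (dist < radius) & (dist > 0)
    return {i: list(np.flatnonzero(close[i])) for i in range(len(positions))}
```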
We can get the equivalent of “function calls” by having attention values over “class definitions”, so that components can instantiate computations of other components if they need to. This is somewhat similar to the idea of lazy world-modelling
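A toy version of the “function call” idea (illustrative names; the point is just that instantiation is deferred until a component actually needs the computation):

```python
class LazyCaller:
    """Holds "attention" over class definitions rather than over existing
    components, and instantiates from the selected definition only on demand."""

    def __init__(self, library):
        self.library = library   # name -> factory producing a callable component
        self.cache = {}

    def call(self, name, inputs):
        # Selecting `name` is the analogue of putting attention weight on that
        # class definition; the component is only built the first time it's needed.
        if name not in self.cache:
            self.cache[name] = self.library[name]()
        return self.cache[name](inputs)

library = {"mean": lambda: (lambda xs: sum(xs) / len(xs))}
caller = LazyCaller(library)
print(caller.call("mean", [1.0, 2.0, 3.0]))   # 2.0
```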
Components that generalize over other components: Given modularity, the main way we can augment our program to accommodate new observations is to add more components (or tweak existing ones). This means that the main way to learn efficiently is to structure our current program such that we can accommodate new observations with as few additional components as possible
Since our program is made out of components, this means we want our existing components to adapt to new components in a generalizable way
Concretely, if we think of each “component” as a causal node, then each causal node A should define a mapping F_A from another causal node X to the causal edge F_A(X) = X→A. This basically allows each causal node to “generalize” over other causal nodes so that it can use information from them in the right ways
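In code the idea looks something like this (a sketch with made-up details; the point is that A carries F_A, so a brand-new node X can be wired in without A itself being redefined):

```python
from typing import Any, Callable

class CausalNode:
    def __init__(self, name: str,
                 edge_builder: Callable[["CausalNode"], Callable[[Any], Any]]):
        self.name = name
        self.edge_builder = edge_builder   # this is F_A

    def edge_from(self, other: "CausalNode") -> Callable[[Any], Any]:
        # F_A(X) = the mechanism along the edge X -> A
        return self.edge_builder(other)

# A rescales incoming info depending on which node it came from, so a new node
# can be incorporated without changing A's definition.
weights = {"temperature": 0.5, "pressure": 2.0}
A = CausalNode("A", edge_builder=lambda x: (lambda v: weights.get(x.name, 1.0) * v))
temperature = CausalNode("temperature", edge_builder=lambda x: (lambda v: v))
print(A.edge_from(temperature)(10.0))   # 5.0
```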
Closing the loop: On top of that, we can use part of our program to maintain a compressed encoding of candidate additional components (so that components that are more likely come earlier in the search ordering). Implementing the compressed encoding itself requires additional components, which changes the distribution of additional components, & we can augment the compressed encoding to account for that (but that introduces a further change in the distribution, and so on and so on...)
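A cartoon of that loop (purely illustrative: “kinds” of components and frequency counts are standing in for whatever the real compressed encoding would be):

```python
from collections import Counter

def fit_prior(components):
    """Compressed encoding as a simple frequency prior over component kinds."""
    counts = Counter(c["kind"] for c in components)
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def search_order(candidates, prior):
    """More probable kinds come earlier in the search ordering."""
    return sorted(candidates, key=lambda c: -prior.get(c["kind"], 0.0))

program = [{"kind": "adder"}, {"kind": "adder"}, {"kind": "gate"}]
for _ in range(3):   # a few rounds of the loop
    prior = fit_prior(program)
    # Implementing the prior itself costs components, which shifts the
    # distribution the prior is supposed to describe, hence the re-fit.
    program = program + [{"kind": "prior-encoder"}]
```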
Relevance to alignment (highly speculative): Accommodating new observations by adding new components while keeping existing structures intact might allow us to more easily preserve a particular ontology, so that even when the AI augments that ontology to accommodate new observations, we can still map back to the original one
I also have some intuition that some of these ideas could allow us to more naturally represent things like counterfactuals, boundaries, natural latents, and general-purpose search in the world model
Note: I haven’t settled on the best framing of these ideas yet, but hopefully I’ll come back with a better presentation at some point in the future
I think that this would make a very nice sequence, and despite all my discussion with you, I’d absolutely like to see this sequence carried out.
Thanks! :)