Trying to summarize your viewpoint, lmk if I’m missing something important:
Training self-organizing models on multi-modal input will lead to increased modularization, and in turn to greater interpretability
Existing interpretability techniques might more or less transfer to self-organizing systems
There is low-hanging fruit in applied interpretability that we could pick, should we need it, to understand self-organizing systems
(Not going into the specific proposals for sake of brevity and clarity)