First of all, I strongly agree that intelligence requires (or is exponentially easier to develop as) connectionist systems. However, I think that while big, inscrutable matrices may be unavoidable, there is plenty of room to make models more interpretable at an architectural level.
Well, I ask you—do you think any other ML model, trained over the domain of all human text, with sufficient success to reach GPT-4 level perplexity, would turn out to be simpler?
I have long thought that Transformer models are actually too general-purpose for their own good. By that I mean that the O(n²) all-to-all token comparisons they perform for self-attention are extreme overkill for what an LLM needs to do.
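To make the O(n²) point concrete, here is a toy sketch of single-head self-attention in plain NumPy (the names and shapes are illustrative, not taken from any particular implementation): every one of the n tokens gets scored against every other token, producing an n×n matrix whether or not the task actually needs that much pairwise machinery.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """X: (n, d) token embeddings; Wq/Wk/Wv: (d, d) learned projections."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # (n, n): the all-to-all comparison
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row
    return weights @ V                               # (n, d) mixed token representations

n, d = 8, 16
rng = np.random.default_rng(0)
X = rng.standard_normal((n, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)  # 8 tokens already means 64 pairwise scores
```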
Sure, you can use this architecture for moving tokens around, building implicit parse trees, semantic maps, and a bunch of other things, but all these functions are jumbled together in the same operations and are really hard to tease out. Recurrent models with well-partitioned internal states and disentangled token operations could probably do more with less. By analogy, you can build a computer in Conway’s Game of Life (which is Turing-complete), but a von Neumann architecture would be much easier to work with.
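As a purely hypothetical sketch of what I mean by "well-partitioned internal states" (the class name, partition labels, and update rule are all invented for illustration, not an existing architecture): a recurrent cell whose state is split into named partitions, so an interpretability probe knows where to look instead of having to untangle one undifferentiated residual stream.

```python
import numpy as np

class PartitionedRNNCell:
    """Hypothetical: one small recurrent block per named partition of the state."""
    def __init__(self, d_token, partitions=("syntax", "semantics", "plan"), d_part=32, seed=0):
        rng = np.random.default_rng(seed)
        self.partitions = partitions
        self.W_in  = {p: rng.standard_normal((d_token, d_part)) * 0.1 for p in partitions}
        self.W_rec = {p: rng.standard_normal((d_part, d_part)) * 0.1 for p in partitions}
        self.state = {p: np.zeros(d_part) for p in partitions}

    def step(self, token_vec):
        # each partition reads the same token but keeps its own recurrent state
        for p in self.partitions:
            self.state[p] = np.tanh(token_vec @ self.W_in[p] + self.state[p] @ self.W_rec[p])
        return self.state  # probe any partition by name, e.g. cell.state["plan"]
```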
Embedded within Transformer circuits, you can find implicit representations of world models, but you could do even better from an interpretability standpoint by making such maps explicit. Give an AI a mental scratchpad that it depends on for reasoning (DALL-E, Stable Diffusion, etc. sort of do this already, except that the mental scratchpad is the output of the model [an image] rather than an internal map of conceptual/planning space), and you can probe that directly to see what the AI is thinking about.
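Here is a rough, hypothetical sketch of that scratchpad idea (again, invented names; this is a cartoon rather than a proposal for a specific model): all of the model's reasoning is forced through an explicit bottleneck map, and a probe() method reads that map directly instead of decoding it out of attention weights.

```python
import numpy as np

class ScratchpadReasoner:
    """Hypothetical: reasoning must pass through an explicit, inspectable map."""
    def __init__(self, d_in, d_map=(16, 16), seed=0):
        rng = np.random.default_rng(seed)
        self.d_map = d_map
        self.encode = rng.standard_normal((d_in, d_map[0] * d_map[1])) * 0.1
        self.decode = rng.standard_normal((d_map[0] * d_map[1], d_in)) * 0.1
        self.scratchpad = np.zeros(d_map)  # explicit conceptual/planning map

    def forward(self, x):
        # all computation is routed through the scratchpad bottleneck
        self.scratchpad = np.tanh(x @ self.encode).reshape(self.d_map)
        return self.scratchpad.reshape(-1) @ self.decode

    def probe(self):
        # the interpretability hook: read the internal map directly
        return self.scratchpad.copy()
```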
Real brains tend to be highly modular, as Nathan Helm-Burger pointed out. The cortex maps out different spaces (visual, somatosensory, conceptual, etc.). The basal ganglia perform action selection and general information routing. The cerebellum fine-tunes top-down control signals. Various nuclei control global and local neuromodulation. And so on. I would argue that such modular constraints actually made it easier for evolution to explore the space of possible cognitive architectures.
I think the problem is that many things along the rough lines of what you’re describing have been attempted in the past and turned out to work not so well (TBF, they were attempted with older systems; I’m not even sure anyone has tried to build something like an expert system around a full-fledged transformer). The common wisdom the field has drawn from those experiences seems to be “stochastic gradient descent knows best, just throw your data into a function and let RNJesus sort it out”. Which is… not the best lesson, IMO. I think there may be merit in the things you suggest, and they are intuitively appealing to me too. But as it turns out, when an industry is driven by racing to the goal rather than by a genuine commitment to proper scientific understanding, it ends up taking all sorts of shortcuts. Who could have guessed.