GPT is likely highly modular itself. Most ML models that generalize well are.
I haven’t read the posts that you’re referencing, but I would assume that GPT would exhibit learned modularity—modules that reflect the underlying structure of its training data—rather than innately encoded modularity. E.g. CLIP also ends up having a “Spiderman neuron” that activates when it sees features associated with Spiderman, so you could kind of say that there’s a “Spiderman module”, but nobody ever sat down to specifically write code that would ensure the emergence of a Spiderman module in CLIP.
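To make “learned modularity” a bit more concrete, here is a toy sketch (in Python, not anything from the actual CLIP interpretability work) of how one might look for a concept-selective unit in a trained model’s activations. The arrays, the planted unit index, and the selectivity score are all made up for illustration:

```python
import numpy as np

# Toy illustration of "learned modularity": given activations from a trained
# network, find the unit that responds most selectively to one concept.
# acts_concept / acts_other stand in for activations on concept-related images
# (e.g. Spiderman pictures) vs. unrelated images; here they are synthetic.

rng = np.random.default_rng(0)
num_units = 512
acts_concept = rng.normal(0.0, 1.0, size=(100, num_units))
acts_concept[:, 42] += 3.0          # pretend unit 42 learned the concept
acts_other = rng.normal(0.0, 1.0, size=(100, num_units))

# Selectivity score: difference in mean activation, normalized by pooled std.
mean_diff = acts_concept.mean(axis=0) - acts_other.mean(axis=0)
pooled_std = np.sqrt(0.5 * (acts_concept.var(axis=0) + acts_other.var(axis=0)))
selectivity = mean_diff / (pooled_std + 1e-8)

best_unit = int(np.argmax(selectivity))
print(f"most concept-selective unit: {best_unit}, score {selectivity[best_unit]:.2f}")
```

Nothing in this probe was written into the model ahead of time; the “module” is just whatever structure the training data happened to carve out of the weights.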
Likewise, experimental results like the Wason Selection Task seem to me explainable as outcomes of within-lifetime learning that does end up creating a modular structure out of the data—without there needing to be any particular evolutionary hardwiring for it.
Specifying the dataset is one way to ensure that some collection of neurons ends up representing Spiderman specifically, even when nobody intended it. A drive like “pay attention to faces” sounds like it could be enough to fill our dataset with social information, and maybe enough to ensure that a cheating-detector module (most likely a distributed representation) emerges.
I think that’s a different topic.
We’re talking about the evolved-modularity-vs-universal-learning-machine debate.
Suppose the universal-learning-machine side of the debate is correct. Then the genome builds a big within-lifetime learning algorithm, and this learning algorithm does gradient descent (or whatever other learning rule) and thus gradually builds a trained model in the animal’s brain as it gets older and wiser. It’s possible that this trained model will turn out to be modular. It’s also possible that it won’t. I don’t know which will happen—it’s an interesting question. Maybe I could find out the answer by reading that sequence you linked. But whatever the answer is, this question is not related to the evolved-modularity-vs-universal-learning-machine debate. This whole paragraph is universal-learning-machine either way, by assumption.
By contrast, the evolved modularity side of the debate would NOT look like the genome building a big within-lifetime learning algorithm in the first place. Rather it would look like the genome building an “intuitive biology” algorithm, and an “intuitive physics” algorithm, and an “intuitive human social relations” algorithm, and a vision-processing algorithm, and various other things, with all those algorithms also incorporating learning (somehow—the details here tend to be glossed over IMO).
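If it helps, here is how I would caricature the difference in code. This is purely illustrative and not a claim about how brains actually work; the class names and learning rules are made up. The point is just where the modularity lives: in the universal-learning-machine picture it can only emerge inside the trained weights, while in the evolved-modularity picture it is already there in the “genome” (the code) before any learning happens.

```python
# Caricature of the two pictures; everything here is hypothetical and only
# meant to show where the modular structure gets specified.

# --- Universal-learning-machine picture -------------------------------------
# The "genome" specifies one big generic learner. If modules like
# "intuitive physics" show up, they show up inside the learned weights.
class UniversalLearner:
    def __init__(self, num_params: int = 1000):
        self.weights = [0.0] * num_params   # one undifferentiated blob of parameters

    def learn(self, experience: float) -> None:
        # placeholder for some generic learning rule (gradient descent, Hebbian, etc.)
        self.weights = [w + 0.001 * experience for w in self.weights]


# --- Evolved-modularity picture ----------------------------------------------
# The "genome" specifies separate special-purpose algorithms up front, each
# with its own (hand-waved) learning component.
class IntuitivePhysics:
    def learn(self, experience) -> None: ...

class IntuitiveBiology:
    def learn(self, experience) -> None: ...

class SocialRelations:
    def learn(self, experience) -> None: ...

class EvolvedModularBrain:
    def __init__(self):
        self.modules = [IntuitivePhysics(), IntuitiveBiology(), SocialRelations()]

    def learn(self, experience) -> None:
        for m in self.modules:          # each innate module learns in its own way
            m.learn(experience)
```

Whether `UniversalLearner`’s weights happen to organize themselves into something module-like after training is exactly the open question from the previous paragraph, and it is orthogonal to which of these two “genomes” you think evolution actually wrote.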