I think there’s a lot of cumulated evidence pointing against the view that LLMs are (very) alien and pointing towards their semantics being quite similar to those of humans (though of course not identical). E.g. have a look at papers (comparing brains to LLMs) from the labs of Ev Fedorenko, Uri Hasson, Jean-Remi King, Alex Huth (or twitter thread summaries).
Can you link to some specific papers here? I’ve looked into 1-2 papers of this genre in the last few months, and they seemed very weak to me, but you might have links to better papers, and I would be interested in checking them out.
Thanks for engaging. Can you say more about which papers you’ve looked at / in which ways they seemed very weak? This will help me adjust what papers I’ll send; otherwise, I’m happy to send a long list.
Also, to be clear, I don’t think any specific paper is definitive evidence, I’m mostly swayed by the cumulated evidence from all the work I’ve seen (dozens of papers), with varying methodologies, neuroimaging modalities, etc.
Alas, I can’t quickly find the one or two that I looked at. It came up in a recent Twitter conversation, I think with Quintin?
Can’t speak for Habryka, but I would be interested in just seeing the long list.
Here goes (I’ve probably still missed some papers, but the most important ones are probably all here):
Brains and algorithms partially converge in natural language processing
Shared computational principles for language processing in humans and deep language models
Deep language algorithms predict semantic comprehension from brain activity
The neural architecture of language: Integrative modeling converges on predictive processing (video summary); though maybe also see Predictive Coding or Just Feature Discovery? An Alternative Account of Why Language Models Fit Brain Data
Brain embeddings with shared geometry to artificial contextual embeddings, as a code for representing language in the human brain
Artificial neural network language models align neurally and behaviorally with humans even after a developmentally realistic amount of training
Correspondence between the layered structure of deep language models and temporal structure of natural language processing in the human brain
Reconstructing the cascade of language processing in the brain using the internal computations of a transformer-based language model
Linguistic brain-to-brain coupling in naturalistic conversation
Semantic reconstruction of continuous language from non-invasive brain recordings
Driving and suppressing the human language network using large language models
Lexical semantic content, not syntactic structure, is the main contributor to ANN-brain similarity of fMRI responses in the language network
Training language models for deeper understanding improves brain alignment
Natural language processing models reveal neural dynamics of human conversation
Semantic Representations during Language Comprehension Are Affected by Context
Unpublished: scaling laws for predicting brain data (larger LMs are better), potentially close to noise ceiling (90%) for some brain regions with the largest models
Twitter accounts of some of the major labs and researchers involved (especially useful for summaries):
https://twitter.com/HassonLab
https://twitter.com/JeanRemiKing
https://twitter.com/ev_fedorenko
https://twitter.com/alex_ander
https://twitter.com/martin_schrimpf
https://twitter.com/samnastase
https://twitter.com/mtoneva1
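For context on what these comparisons typically measure: as I understand it, most of these papers fit a linear encoding model from LLM activations to brain recordings (fMRI / ECoG) and score it by how well it predicts held-out data. Below is a rough, purely illustrative sketch of that kind of analysis using synthetic stand-in data (real studies use per-word model activations and recordings from people reading or listening to narratives):

```python
# Rough sketch of a linear encoding analysis: fit a ridge regression from
# LLM hidden states to brain responses, then score it by held-out correlation.
# All data here is synthetic stand-in data, for illustration only.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_words, n_features, n_voxels = 2000, 768, 50

# Stand-ins: per-word LLM activations and per-word brain responses.
llm_activations = rng.standard_normal((n_words, n_features))
true_map = 0.1 * rng.standard_normal((n_features, n_voxels))
brain_responses = llm_activations @ true_map + rng.standard_normal((n_words, n_voxels))

X_train, X_test, y_train, y_test = train_test_split(
    llm_activations, brain_responses, test_size=0.2, random_state=0
)

# Linear map from model activations to each voxel/electrode.
encoder = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(X_train, y_train)
pred = encoder.predict(X_test)

# "Brain score"-style metric: correlation between predicted and actual
# responses on held-out data, per voxel.
scores = [np.corrcoef(pred[:, v], y_test[:, v])[0, 1] for v in range(n_voxels)]
print(f"mean held-out correlation: {np.mean(scores):.2f}")
```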
These papers are interesting, thanks for compiling them!
Skimming through some of them, the sense I get is that they provide evidence for the claim that the structure and function of LLMs are similar to (and inspired by) the structure of particular components of human brains, namely the components which do language processing.
This is slightly different from the claim I am making, which is about how the cognition of LLMs compares to the cognition of human brains as a whole. My comparison is slightly unfair, since I’m comparing a single forward pass through an LLM to get a prediction of the next token, to a human tasked with writing down an explicit probability distribution on the next token, given time to think, research, etc. [1]
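To make the single-forward-pass side of that comparison concrete, here is a minimal sketch (using GPT-2 via the Hugging Face transformers library purely as an illustrative stand-in; the prompt is made up): one pass over the context already yields an explicit probability distribution over every possible next token.

```python
# Minimal sketch: one forward pass through a small causal LM (GPT-2 here,
# purely as an illustrative stand-in) yields a full probability distribution
# over the next token.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The cat sat on the"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

# The next-token distribution comes from the logits at the last position.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)

top_probs, top_ids = next_token_probs.topk(5)
for p, i in zip(top_probs, top_ids):
    print(f"{tokenizer.decode(int(i))!r}: {p.item():.3f}")
```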
Also, LLM capability at language processing / text generation is already far superhuman by some metrics. The architecture of LLMs may be simpler than that of the comparable parts of the brain in some ways, but the LLM version can run with far more precision / scale / speed than a human brain. Whether LLMs already exceed human brains on specific metrics is debatable, but they are not bottlenecked by biology when it comes to further scaling.
And this is to say nothing of all the other kinds of cognition that happen in the brain. I see these brain components as analogous to LangChain or AutoGPT, if LangChain or AutoGPT themselves were written as ANNs that interfaced “natively” with the transformers of an LLM, instead of as Python code.
Finally, similarity of structure doesn’t imply similarity of function. I elaborated a bit on this in a comment thread here.
You might be able to get better predictions from an LLM by giving it more “time to think”, using chain-of-thought prompting or other methods. But these are methods humans apply when using LLMs as a tool, rather than ideas that originate from within the LLM itself, so I don’t think it’s exactly fair to call them “LLM cognition” on their own.
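To illustrate where that scaffolding lives, here is a sketch of the contrast (again with GPT-2 as a stand-in; a model this small won’t actually reason well, and the prompts are made up; the point is only that the extra “thinking” is more generated text elicited by a human-written prompt, not something originating inside a single forward pass):

```python
# Illustrative contrast between a direct continuation and an externally
# scaffolded "time to think" (chain-of-thought) prompt. The scaffolding lives
# in the prompt, written by the human operator; the model just keeps
# predicting next tokens either way.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # illustrative stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2")

question = "If a train leaves at 3pm and the trip takes 2 hours, when does it arrive?"

# (a) Direct continuation: the answer is whatever the next tokens happen to be.
direct_prompt = f"Q: {question}\nA:"

# (b) Chain-of-thought: a human-written scaffold asks for intermediate
# reasoning tokens before the answer.
cot_prompt = f"Q: {question}\nA: Let's think step by step."

for prompt in (direct_prompt, cot_prompt):
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(
        **inputs,
        max_new_tokens=40,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Print only the newly generated continuation.
    print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:]))
```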
Re the superhuman next-token prediction ability: there’s an issue where the evaluations are distorted in ways that make humans look artificially worse at next-token prediction than they actually are; see here:
https://www.lesswrong.com/posts/htrZrxduciZ5QaCjw/language-models-seem-to-be-much-better-than-humans-at-next#wPwSND5mfQ7ncruWs
Thanks!
They’re somewhat alien, not highly alien, agreed.