LLMs in essence work very much like the linguistic cortex[1][2][3][4]: both are trained on similar data with the same objective, unsupervised prediction of future input. They are not submarines; they are simulations of biological brain modules. Because of the functional equivalence of circuits/programs, there is an infinite variety of circuits that all implement approximations of each other, with varying performance tradeoffs. If you train one NN to predict the outputs of another NN, then given sufficient architecture, data/time, and capacity, the simulation becomes an increasingly accurate copy of the original circuit. This is the same process by which human minds form, distilled from various other human minds.
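As a concrete illustration of the NN-predicting-NN point, here is a minimal distillation-style sketch, assuming PyTorch; the architectures, sizes, and training loop are illustrative placeholders rather than anything specified in the thread:

```python
# Minimal sketch: one network learns to copy another by predicting its outputs.
# Architectures, sizes, and hyperparameters are illustrative assumptions only.
import torch
import torch.nn as nn

teacher = nn.Sequential(nn.Linear(32, 64), nn.Tanh(), nn.Linear(64, 8))   # the "original circuit"
student = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 8))   # a different circuit entirely

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for step in range(10_000):
    x = torch.randn(128, 32)            # unlabeled inputs; the objective is purely predictive
    with torch.no_grad():
        target = teacher(x)             # outputs of the circuit being imitated
    loss = loss_fn(student(x), target)  # train the student to match those outputs
    opt.zero_grad()
    loss.backward()
    opt.step()

# As the loss falls, the student becomes an increasingly accurate functional copy
# of the teacher, despite its different internal architecture.
```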
ANNs partially (and increasingly) recapitulate human brains, and will increasingly think like us, as they are created in our image, as literal (partial, approximate) images of (parts of) human minds. This is why they look nothing like the cold rational thinkers most on LW expected, and instead have many of our weird quirks, irrationalities, and biases.
This idea that ANNs are unthinking “submarines” or “statistical parrots” is grossly incompatible with reality and, frankly, dangerous.
This is well established, but do not misinterpret it: I said linguistic cortex, not the entire brain.
“Brains and algorithms partially converge in natural language processing”
“The neural architecture of language: Integrative modeling converges on predictive processing”
“Correspondence between the layered structure of deep language models and temporal structure of natural language processing in the human brain”
Well, I was trying to argue against the “statistical parrot” idea, because I think that unfairly downplays the significance and potential of these systems. That’s part of the purpose of the “submarine” metaphor: a submarine is actually a very impressive and useful device, even if it doesn’t swim like a fish.
I agree that there is some similarity between ANNs and brains, but the differences seem pretty stark to me.
There are enormous differences between an AMD EPYC processor and an RTX 4090, and yet within some performance constraints they can run the same code, and there are nearly infinite ways they can instantiate programs that, although vastly different in encoding details, are ultimately very similar.
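To make the hardware analogy concrete, here is a tiny sketch, assuming PyTorch, of the same program computing effectively the same function on a CPU and a GPU; the model and tolerance are arbitrary choices for illustration:

```python
# Sketch: the same program, run on very different hardware, implements the same function.
# The model, input, and tolerance below are arbitrary illustrative choices.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 4))
x = torch.randn(8, 16)

with torch.no_grad():
    cpu_out = model(x)                                 # e.g. on an EPYC-class CPU
    if torch.cuda.is_available():                      # e.g. on an RTX 4090
        gpu_out = model.to("cuda")(x.to("cuda")).cpu()
        # Different silicon, different low-level encodings, near-identical results.
        print(torch.allclose(cpu_out, gpu_out, atol=1e-4))
```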
So obviously transformer-based ANNs running on GPUs are very different physical systems than bio brains, but that is mostly irrelevant. What matters is similarity of the resulting learned software, the mindware. If you train hard enough on token prediction of the internet, eventually reaching very low error, the ANN must learn to simulate human minds, and a sufficient simulation of a mind simply ... is ... a mind.
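For concreteness, the shared objective referred to throughout, next-token (future input) prediction, looks roughly like this; a toy sketch assuming PyTorch, where a trivial model that conditions only on the current token stands in for a transformer:

```python
# Toy sketch of the next-token prediction objective. The model below conditions only on
# the current token and merely stands in for a transformer; all sizes are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, d_model = 1000, 64
model = nn.Sequential(nn.Embedding(vocab_size, d_model), nn.Linear(d_model, vocab_size))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

tokens = torch.randint(0, vocab_size, (4, 128))   # stand-in for tokenized internet text
inputs, targets = tokens[:, :-1], tokens[:, 1:]   # predict each next token from the one before it

logits = model(inputs)                            # (batch, seq, vocab)
loss = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))
opt.zero_grad()
loss.backward()
opt.step()                                        # one illustrative gradient step

# Pushing this loss very low on human-generated text forces the model to reproduce the
# regularities of whatever produced that text, which is the sense of "simulation" above.
```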