I’ve been thinking about this comment every day since you made it 11 days ago. I love it. Maybe it’s silly of me, but I just hadn’t thought about the question in such a grounded empirical manner before.
I agree with you that it seems unlikely that current transformer-based LLMs are conscious. I also agree that we would need to find extra context-dependent computation in the stream of calculations in order to say that some consciousness-related computation was present.
I also agree that it is hard to imagine how consciousness would provide a clear benefit on the task of next-token-prediction on web text.
I disagree though on the extrapolation from the above points. Let me explain.
Assume, for this hypothetical, that we are analyzing a future model which has some things in common with transformer-based LLMs but also some extra components. We can get into the details of plausibly useful extra components if you like, but for now let’s just say that this is a diffusion-guided transformer as an example.
Now let’s also assume that this future model wasn’t trained on web text, but was instead trained in some moderately realistic simulation of surviving in the wild as an early hominid tribe member. The agent needs to track simulated hunger, hunting and gathering skills, and social relationships, and it carries a simulated state of health/homeostasis throughout training, delivered as an RL signal proportional to the intensity of the simulated need. So there is a constant combination of training pressure: next-token prediction plus satisficing homeostasis of the simulated state.
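To make that combined training pressure concrete, here is a minimal sketch of the kind of objective I have in mind, assuming a PyTorch-style setup. Every name in it (need_intensity, setpoint_error, lam) is an illustrative assumption about the hypothetical, not a description of any existing system.

```python
# Sketch of the hypothetical combined objective: next-token prediction plus a
# homeostatic penalty whose weight tracks the intensity of the simulated need.
import torch
import torch.nn.functional as F

def combined_loss(logits, target_tokens, need_intensity, setpoint_error, lam=1.0):
    # logits: (batch, seq, vocab); target_tokens: (batch, seq)
    # need_intensity: (batch,) intensity of the simulated need, in [0, 1]
    # setpoint_error: (batch,) distance of the simulated state from homeostasis
    next_token = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)), target_tokens.reshape(-1)
    )
    # RL-style homeostatic pressure, scaled by how intense the simulated need is.
    homeostatic = (need_intensity * setpoint_error).mean()
    return next_token + lam * homeostatic
```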
Now, in this hypothetical, it seems more fair to compare this model to an animal. Suppose the intuitive reading of a common feature of behavior across animal species (particularly mammals, marsupials, and birds) is correct: all of these animals are running some sort of computational process which could fairly be described as a form of ‘consciousness’. Why would this be a common computational process, evolved and maintained across many species, if it weren’t useful in some way? Neural computation is expensive, especially so for flighted birds. Yet some flighted birds, like corvids, seem both conscious and remarkably intelligent; they can reasonably be described as curious, playful, puzzle-solving, and possessing detailed, long-lasting memories.
Since consciousness seems useful for all these different species, in a convergent-evolution pattern even across very different brain architectures (mammals vs. birds), I believe we should expect it to be useful in our hominid-simulator-trained model. If so, we should be able to measure a difference relative to a next-token-predictor trained on an equivalent number of tokens from a dataset of, for instance, math problems. Do you agree? Am I missing something?
Sorry for the late response. I don’t really use this forum regularly. But to get back to it—the main reason neural networks generalize is that they find the simplest function that gets a given accuracy on the training data.
This holds true for all neural networks, regardless of how they are trained, what type of data they are trained on, or what the objective function is. It’s the whole point of why neural networks work. Functions with more high-frequency components are exponentially less likely. This holds for the randomly initialized prior (see arxiv.org/pdf/1907.10599) and throughout training, since the averaging effect of SGD lets lower-frequency components be learned faster than higher-frequency ones (see "On the Spectral Bias of Neural Networks", arXiv:1806.08734).
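To illustrate the spectral-bias claim, here is a toy experiment one could run (a sketch assuming PyTorch; the architecture, frequencies, and hyperparameters are arbitrary choices of mine, not taken from the papers above): fit a small MLP to the sum of a low- and a high-frequency sinusoid and watch which component gets fit first.

```python
# Fit y = sin(2*pi*x) + sin(2*pi*8*x) and track how much of each frequency
# component remains unfit. The low-frequency residual typically shrinks first.
import math
import torch

torch.manual_seed(0)
x = torch.linspace(0, 1, 256).unsqueeze(1)
low = torch.sin(2 * math.pi * x)       # low-frequency component
high = torch.sin(2 * math.pi * 8 * x)  # high-frequency component
y = low + high

net = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(5001):
    pred = net(x)
    loss = ((pred - y) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        with torch.no_grad():
            resid = pred - y
            # Project the residual onto each sinusoid: roughly the amplitude of
            # that frequency the network has not yet learned.
            low_left = 2 * (resid * low).mean().abs().item()
            high_left = 2 * (resid * high).mean().abs().item()
        print(f"step {step}: unfit low-freq ~{low_left:.2f}, unfit high-freq ~{high_left:.2f}")
```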
You can have any objective function you want; it doesn’t change this basic fact. If this basic fact didn’t hold, the neural network wouldn’t generalize and would be useless. There are many papers that formalize this and provide generalization bounds based on the complexity of the function learned by the neural network.
A “conscious” neural network doesn’t increase the accuracy over a neural network encoding the same function sans consciousness, but it does increase the complexity of the function. Therefore, it’s exponentially less likely.
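One way to make “exponentially less likely” concrete, under the kind of simplicity/description-length prior those generalization papers assume (this gloss is mine, not a quote from them): if P(f) ∝ 2^(−K(f)), where K(f) measures the complexity of the learned function, then for two functions with equal training accuracy,

P(f_conscious) / P(f_plain) = 2^(−ΔK),

where ΔK is the extra complexity the consciousness-related computation adds. Equal accuracy plus extra complexity means the conscious variant is down-weighted by a factor that shrinks exponentially in ΔK.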
I think biological systems are really different from silicon ones. The biggest difference is that biological systems are able to generate their own randomness. Silicon ones are not—they’re deterministic. If a NN is probabilistic, it’s because we are feeding it random samples as an input. I think consciousness is a precursor for free will, which can be valuable for inherently non-deterministic biological systems.
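A minimal sketch of what I mean, assuming PyTorch (the specific layer and sampling call are just illustrative): the forward pass is a deterministic function of weights and inputs, and any apparent randomness in outputs is supplied from outside the network.

```python
# The network itself is deterministic; "randomness" enters only through an
# explicit random input, here a seeded generator used for sampling.
import torch

torch.manual_seed(0)
net = torch.nn.Linear(4, 3)
x = torch.randn(1, 4)

# Same weights, same input -> bit-identical outputs, every time.
print(torch.equal(net(x), net(x)))  # True

probs = torch.softmax(net(x), dim=-1)
g1 = torch.Generator().manual_seed(123)
g2 = torch.Generator().manual_seed(123)
# Identical seeds yield identical "random" samples: the randomness is fed in,
# not generated by the network.
print(torch.multinomial(probs, 5, replacement=True, generator=g1))
print(torch.multinomial(probs, 5, replacement=True, generator=g2))
```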
In my original post, I had linked a recent paper that finds suggestive evidence that the brain is non-classical (e.g. undergoes quantum computation) but deleted it after someone told me to.
More generally, I feel that for folks concerned about AI safety, the first step is to develop a solid theoretical understanding of why neural networks generalize, the types of functions they are biased towards, how this bias is affected by the number of layers, etc.
I feel that most individuals on Less Wrong lack this knowledge because they exclusively consume content from individuals within the rationality/AI safety sphere. I think this leads to a lot of outlandish conjectures (e.g. conscious AI, paperclip maximizers, etc.) that don’t make sense.
Since consciousness seems useful for all these different species, in a convergent-evolution pattern even across very different brain architectures (mammals vs. birds), I believe we should expect it to be useful in our hominid-simulator-trained model. If so, we should be able to measure a difference relative to a next-token-predictor trained on an equivalent number of tokens from a dataset of, for instance, math problems.
What do you mean by difference here? Increase in performance due to consciousness? Or differences in functions?
I’m not sure we could measure this difference. It seems very likely to me that consciousness evolved before, say, language and complex agency. But complex language and complex agency might not require consciousness, and may capture all of the benefits that would be captured by consciousness, so consciousness wouldn’t result in greater performance.
However, it could be that:
1. humans do not consistently have complex language and complex agency, and humans with agency are fallible as agents, so consciousness in most humans is still useful to us as a species (or to our genes);
2. building complex language and complex agency on top of consciousness is the locally cheapest way to build them, so consciousness would still be useful to us; or
3. we reached a local maximum in terms of genetic fitness, or evolutionary pressures on us are now too weak, and it’s not really possible to evolve away consciousness while preserving complex language and complex agency. So consciousness isn’t useful to us, but can’t practically be gotten rid of without a loss in fitness.
Some other possibilities:
The adaptive value of consciousness is really just to give us certain motivations, e.g. finding our internal processing mysterious, nonphysical or interesting makes it seem special to us, and this makes us:
- value sensations for their own sake, so seek sensations and engage in sensory play, which may help us learn more about ourselves or the world (according to Nicholas Humphrey, as discussed here, here and here);
- value our lives more and work harder to prevent early death; and/or
- develop spiritual or moral beliefs and adaptive associated practices.
Consciousness is just the illusion of the phenomenality of what’s introspectively accessible to us. Furthermore, we might incorrectly believe in its phenomenality simply because much of the processing we have introspective access to is wired in and its causes are not introspectively accessible, but instead cognitively impenetrable. The full illusion could be a special case of humans incorrectly using supernatural explanations for unexplained but interesting and subjectively important or profound phenomena.