It can’t represent a subjective sense of yellow, because if it could, consciousness would have to be a linear function of the activations. That seems somewhat ridiculous, because I experience a story about a “dog” differently depending on the context.
Furthermore, LLMs scale “features” by how strongly they appear (e.g. the positive-sentiment vector is scaled up when the text is very positive). So the LLM’s conscious processing of positive sentiment would be linearly proportional to how positive the text is, which also seems ridiculous.
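To make that picture concrete, here is a minimal numpy sketch of the linear-representation view I’m describing; the “sentiment direction”, its dimensionality, and the strengths are made up for illustration, not taken from any real model.

```python
import numpy as np

# Minimal sketch of the linear-representation picture (illustrative only):
# a "feature" is a fixed direction in activation space, and its strength is
# just the scalar coefficient on that direction.

d_model = 16
rng = np.random.default_rng(0)

# Hypothetical "positive sentiment" direction (unit vector).
sentiment_dir = rng.normal(size=d_model)
sentiment_dir /= np.linalg.norm(sentiment_dir)

# Some baseline activation vector for a token.
base_activation = rng.normal(size=d_model)

def add_sentiment(activation, strength):
    """Scale the feature linearly: 'very positive' text just means a
    bigger coefficient on the same direction."""
    return activation + strength * sentiment_dir

mildly_positive = add_sentiment(base_activation, 1.0)
very_positive = add_sentiment(base_activation, 5.0)

# Reading the feature back out is a dot product, and it is exactly
# proportional to the strength we put in.
readout = lambda a: (a - base_activation) @ sentiment_dir
print(readout(mildly_positive), readout(very_positive))  # ~1.0 and ~5.0
```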
I don’t expect consciousness to have any useful computational properties. Say you have a deterministic function y = f(x). You can encode just y = f(x), or y = f(x) where f includes conscious representations in its intermediate layers. The latter does not improve training accuracy in the slightest. Neural networks also have a strong simplicity bias towards low-frequency functions (this has been proven mathematically), and f(x) without consciousness is much simpler (lower frequency) to encode than f(x) with consciousness.
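Here is a toy illustration of the first half of that argument, with made-up weights: the second network below inserts an invertible map and its inverse into the middle, so it carries extra intermediate representations, yet it computes exactly the same f(x) and therefore gets exactly the same training loss.

```python
import numpy as np

rng = np.random.default_rng(1)
d_in, d_hidden = 8, 32

W1 = rng.normal(size=(d_hidden, d_in))
W2 = rng.normal(size=(1, d_hidden))
relu = lambda z: np.maximum(z, 0.0)

def f_plain(x):
    # Plain network: y = W2 relu(W1 x)
    return W2 @ relu(W1 @ x)

# Insert an invertible map Q and its inverse between W1 and the ReLU.
# The intermediate representations change; the input-output function doesn't.
Q = rng.normal(size=(d_hidden, d_hidden))  # invertible with probability 1
Q_inv = np.linalg.inv(Q)

def f_extra(x):
    # "Richer" internals: y = W2 relu(Q^{-1} Q W1 x) == f_plain(x)
    return W2 @ relu(Q_inv @ (Q @ (W1 @ x)))

X = rng.normal(size=(d_in, 100))          # 100 random inputs
y = np.sin(X.sum(axis=0, keepdims=True))  # arbitrary targets

mse = lambda f: np.mean((f(X) - y) ** 2)
print(mse(f_plain), mse(f_extra))  # identical up to floating-point noise
```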
Sorry for the late response; I don’t really use this forum regularly. But to get back to it: the main reason neural networks generalize is that they find the simplest function that achieves a given accuracy on the training data.
This holds true for all neural networks, regardless of how they are trained, what kind of data they are trained on, or what the objective function is. It’s the whole reason neural networks work. Functions with more high-frequency components are exponentially less likely to be learned. This holds for the randomly initialized prior (see arxiv.org/pdf/1907.10599) and throughout training, as the averaging effect of SGD lets lower-frequency components be learned faster than higher-frequency ones (see arxiv.org/abs/1806.08734, “On the Spectral Bias of Neural Networks”).
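If it helps to see the effect concretely, here is a quick PyTorch sketch in the spirit of the toy experiments in the spectral-bias paper (my own architecture and hyperparameters, nothing taken from the paper itself): fit a sum of a low- and a high-frequency sinusoid and watch which component the residual loses first.

```python
import torch
import torch.nn as nn

# Rough spectral-bias demo in the spirit of arxiv.org/abs/1806.08734:
# fit y(x) = sin(2*pi*x) + sin(20*pi*x) with a small MLP and track how much
# of each frequency component remains in the residual over training.

torch.manual_seed(0)
x = torch.linspace(0, 1, 512).unsqueeze(1)
low, high = torch.sin(2 * torch.pi * x), torch.sin(20 * torch.pi * x)
y = low + high

net = nn.Sequential(nn.Linear(1, 256), nn.ReLU(),
                    nn.Linear(256, 256), nn.ReLU(),
                    nn.Linear(256, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def component_error(pred, comp):
    # How much of this frequency component is still missing from the fit,
    # measured by projecting the residual onto the component.
    resid = (y - pred).squeeze()
    return (resid @ comp.squeeze()).abs().item() / comp.numel()

for step in range(5001):
    pred = net(x)
    loss = ((pred - y) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        with torch.no_grad():
            p = net(x)
            print(step, round(component_error(p, low), 4),
                  round(component_error(p, high), 4))
# Typical behavior: the low-frequency residual collapses within a few hundred
# steps, while the high-frequency residual decays much more slowly.
```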
You can have any objective function you want; it doesn’t change this basic fact. If this basic fact didn’t hold, the neural network wouldn’t generalize and would be useless. There are many papers that formalize this and provide generalization bounds based on the complexity of the function learned by the neural network.
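As one concrete example of the shape such bounds take (this is the classical Occam-razor bound for a learner that fits the training data exactly; individual papers differ in the details), assuming a prior P over functions:

```latex
% With probability at least 1 - \delta over m i.i.d. training examples,
% every hypothesis f consistent with all of them satisfies
\mathrm{err}(f) \;\le\; \frac{\ln\big(1/P(f)\big) + \ln(1/\delta)}{m}
% A simplicity prior such as P(f) \propto 2^{-K(f)} makes the bound grow
% with the description length K(f) of the learned function.
```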
A “conscious” neural network doesn’t achieve higher accuracy than a network encoding the same function sans consciousness, but it does increase the complexity of the function. Therefore it’s exponentially less likely.
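Spelling out the “exponentially less likely” step under the simplicity-prior picture (my paraphrase, not a result from any specific paper): if both networks fit the training data D equally well, the likelihoods cancel, so

```latex
\frac{P(f_{\text{conscious}} \mid D)}{P(f_{\text{plain}} \mid D)}
  \;=\; \frac{P(f_{\text{conscious}})}{P(f_{\text{plain}})}
  \;\approx\; 2^{-\left(K(f_{\text{conscious}}) - K(f_{\text{plain}})\right)}
```

Under a prior that decays exponentially in description length, the posterior odds of the more complex, “conscious” version shrink exponentially in the extra complexity it carries.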
I think biological systems are really different from silicon ones. The biggest difference is that biological systems can generate their own randomness; silicon ones cannot, because they’re deterministic. If a neural network behaves probabilistically, it’s only because we feed it random samples as an input. I think consciousness is a precursor for free will, which can be valuable for inherently non-deterministic biological systems.
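Here is what I mean by the randomness being an input, as a toy sketch (fake logits, not a real model): the forward pass produces a distribution deterministically, and all of the apparent stochasticity comes from the random numbers we hand to the sampler. Fix that input and the “stochastic” model is fully reproducible.

```python
import numpy as np

# Toy "next-token" distribution from some deterministic forward pass.
logits = np.array([2.0, 0.5, -1.0, 0.1])
probs = np.exp(logits) / np.exp(logits).sum()

def sample_tokens(seed, n=10):
    # All of the apparent randomness comes from this externally supplied seed.
    rng = np.random.default_rng(seed)
    return rng.choice(len(probs), size=n, p=probs)

print(sample_tokens(seed=42))
print(sample_tokens(seed=42))  # identical: same random input, same output
print(sample_tokens(seed=7))   # different only because we changed the input
```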
In my original post, I had linked a recent paper that finds suggestive evidence that the brain is non-classical (e.g. undergoes quantum computation), but I deleted the link after someone told me to.
More generally, I feel that for folks concerned about AI safety, the first step is to develop a solid theoretical understanding of why neural networks generalize, what types of functions they are biased towards, how that bias depends on the number of layers, and so on.
I feel that most individuals on Less Wrong lack this knowledge because they exclusively consume content from people within the rationality/AI-safety sphere. I think this leads to a lot of outlandish conjectures (e.g. conscious AIs, paperclip maximizers) that don’t make sense.