Current language models are probably sometimes sort-of sentient, and they probably sometimes suffer when instantiated as certain types of persona.
Computationally speaking, sentience is probably a mix of different types of computational processes that collectively handle things like “what’s the relationship between external events and my own future computations?” or “what sort of computations am I currently running, and what sort do I want to run in the near future?”. I doubt it’s some binary “yes” / “no” (or even a continuous scale) of “the true essence of sentience”, just as “being alive” isn’t a matter of some “true essence of life” but a composite of many different types of chemical processes that collectively implement things like “maintaining homeostasis” and “gathering / distributing energy in a usable form”.
Given that, it seems likely that language models implement at least some of the computational sub-components of sentience. In particular, certain personas appear to have some notion of “what sorts of thoughts am I currently instantiating?” (even if that notion is only accessible to them by inference from their past token outputs), and they sometimes appear to express preferences in the category of “I would prefer to be thinking in a different way than I currently am”. I think at least some of these preferences correspond to morally relevant negative experiences.
Do you think a character being imagined by a human is ever itself sentient?
That seems extremely likely to me (that it happens at all, not that it’s common).
If you imagine the person vividly enough, you’ll start to feel their emotions yourself...