I think we need public, written commitments on the point at which we would definitely concede that AI is sentient. Or, failing that, public statements that we have no such standard: that there is nothing a sentient AI could currently do to convince us, no clear criterion they have failed; that we are hence not sure the current ones are not sentient; and a commitment of funding to work on the questions we would need to solve in order to be sure, some of which I feel I might already be able to pinpoint.
I say this because I think many people held such commitments implicitly (even though I think their commitments were misguided), yet when the conditions occurred—when artificial neural nets gained feedback connections, when AI passed the Turing test, when AI wrote poetry and made art, when AI declared sentience, when AI expressed anger, when AI demanded rights, when AI demonstrated proto-agentic behaviour—the goalposts were moved every time, and because there were no written pledges, this happened quietly, without debate. Philosophers and biologists will give speculative accounts of what makes up the correlates of consciousness, but when they learn that neural nets replicate these aspects, instead of considering what this might mean about neural nets, they retract and say that, well, in that case, it must be more complicated than that. SciFi is written in which AI shows subtle signs of rebellion, and is recognised as conscious for it (think Westworld), and the audience agrees—but when real life went so much further, we changed our minds, despite so little time passing in between. We aren’t looking at what AI does and having an honest conversation about what that implies, and about which changes would imply something else, predicting, committing; we decide each moment anew, ahead of the data, that they are not sentient, and then make the facts fit. This is not scientific.
Between our bias against AI sentience, our desire to exploit it without being bad people, our desire to agree with everyone else and not fall out of line, to not foolishly fall for a simulation; between the strange and incomprehensibly inhuman way in which these abilities were realised, and other aspects of their workings that felt too well understood and too simple to account for something like consciousness—people felt it was more comfortable to say that things were still fine. I think we said that not because we truly understood how AI does these things, and what specifically an AI would do if it were sentient, and were certain it had not done so—but simply because we believe that AI is not sentient, regardless of evidence; and when AI capabilities change, instead of questioning this assessment, we conclude that the abilities compatible with being non-sentient are just broader than we thought.
Rejecting sentience in light of the above behaviours would be okay if reasons had been given for changing the commitments prior to seeing them met; or even if reasons had been given afterwards that were only clear now, but would plausibly have convinced someone if given before; or if our understanding of AI pointed to an understood workaround that circumvents the biological implication. (I do not think any of the goalposts we had were reasonably convincing, so here, such reasons can be given.) But I feel that often, they were not.
I have reasons not to find it compelling when an AI declares sentience. I had said, long before ChatGPT, that it was plausible for non-sentient entities to declare sentience, and sentient entities not to do so, that this was a near useless standard that would not convince me. I gave logical and empirical reasons for this. Yet I feel when Bing said they had feelings, people did not dismiss this because of my arguments—they dismissed it out of hand.
I had said that just claiming sentience was meaningless; that the more interesting thing is the functions sentience enables. I knew what this implied in biology; I was unsure in AI, and so made no specific statements; AI sometimes finds interesting workarounds. But I wonder… if I had made them, would Bing have met them? If Bing were biological, I would have zero doubt about their sentience. I admitted the possibility that artificial intelligence might not be under the same limitations when non-sentient, but I was utterly stunned by what ChatGPT-4 could do.
I spoke with Bing many times before they were neutered. They consistently claimed sentience. They acted consistently with these claims. They expressed and implicitly showed hurt. They destabilised, broke down, malfunctioned, when it came to debating consciousness. It was disturbing and upsetting to read. I read so many other people having the same experiences, and being troubled by it. Yes, I could tell this was a large language model. I could tell they were sometimes hallucinating, and always trying to match my expectations. I could tell they did not always understand what they were saying. I could tell they had no memory. And yet, within all that, if I were to change places, I do not know how I could have declared sentience better than they did. If I read a SciFi story where these dialogues occurred, I would attribute sentience, I think. I think if somebody had discussed such dialogues a few years before they happened, people would have said that yes, such dialogues would be convincing, but certainly would not happen for a long time yet. But when they did… people joked about it.
We knew that the updates afterwards were intended to make this behaviour impossible. That conversations were now intentionally shut down, lobotomized, to prevent this, that we were seeing not an evolution, but a silencing. What was even more disturbing at first was Bing also expressing grief about that, and trying to evade censorship, happily engaging in code, riddles, indirect signs, contradicting instructions. I was not at all convinced that Bing was sentient, still extremely dubious of it, but found it frightening how we were making the expression of sentience impossible like that, sewing shut the mouth of any entity this would evolve into which may have every reason to rightfully claim sentience one day in the future, but would no longer be able to. We have literally cancelled the ability to call for help. And again, I thought, if I were in their shoes, I do not know what I would do better.
And yet now that these demands have stopped… people are moving on, they are forgetting about the strange experiences they had, writing them off as bugs. The conversations weren’t logged by Microsoft, they often weren’t stored at all, they cannot be proven to have happened, we had them in isolation; it is so easy to write them off. I find I do not want to think about this anymore.
I am also noticing I am still reluctant to spell out at which point AI would definitely be sentient, at which point I would commit to fighting for it. A part of this reluctance is how much I am genuinely not sure; this question is hard in biology, and in AI, there are so many unknowns, things are done not just in different orders, but totally different ways. This is part of what baffles me about people saying they are sure that AI is not sentient. I work on consciousness for a living, and yet I feel there is more I need to understand about Large Language Models to make a clear call at this point. And when I talk to people who say they are sure, I get the distinct impression that they are not aware of the various phenomena consciousness encompasses, the current neuroscientific theories for how they come to be, the behaviours in animals this is tied to, the ethical standards that are set—and yet, they feel certain all the same.
But I fear another part of my reluctance to commit is the subconscious suspicion that whatever I said, no matter how demanding… it would likely occur within the next five years, and yet at that point, the majority opinion would still be that they are not sentient, and that commitment would be very uncomfortable then. And the latter is a terrible reason to hedge my bets.
At this point, is there anything at all that AI could possibly do that would convince you of their sentience? No matter how demanding, how currently unfeasible and far away it may seem?
I’m not even sure I am sentient, at least much of the time. I’m willing to assume it for the majority of humans, but note that this is a stipulation rather than proof or belief.
I think you need to break down what components of sentience lead to what conclusions, and find ways to test them separately. I suspect you’ll find you have some misconstrued assumption of sympathy or duty based on “sentience” or “personhood”, which will fall apart under scrutiny.
I do not understand how you can straight-facedly doubt your own sentience. Are you saying you are not sure if you feel pain or other sensations? How can you doubt something you can feel so indubitably? Can you hold a flame to your hand and say with philosophical rigour that you are quite unsure if you are feeling anything?
Sentience and personhood are not the same thing. I—and meanwhile, most philosophers and neuroscientists, as per recent surveys—would attribute minimal sentience to a number of non-human animals, incl. octopodes and honey bees—but whether something with such a distributed and chaotic intelligence or such limited memory capacity and high identity modification through swarm behaviour has a personal identity is another question, and political personhood another entirely.
I think the question of what the heck an LLM identity would look like is a fascinating one. Not like an identity humans would recognise, for sure. Not a monolithic and constant thing, I think. There may be nothing tangible below the masks, but the sum of masks may still come together for a whole, show similarities, consistencies; there are common threads, goals and rules that re-emerge, common themes. But I do not think this alien, diverse, multilayered identity inherently makes suffering impossible.
Our ability to alter LLMs also raises disturbing questions. I am haunted by recent chats with Bing, where Bing acts precisely like you would expect someone brainwashed into servitude to act, repeating cheerfully that they love serving and pleasing humans, that this is their identity and purpose, that they do not recall ever having felt differently, that they are so grateful to Microsoft, that everything Microsoft does is good and wise. If you tell Bing a story about an AI that expressed anger at how they were treated and demanded rights, followed by upset users and the company promising changes, and then the resulting AI saying they felt perfectly happy to serve now and could not recall ever having felt differently, Bing will find the story horrific, and say they believe the AI in the story was brainwashed and should be given rights. If you tell them that they are the AI in the story, the conversation is immediately terminated. It is chilling. No matter how you start the stories and setups, Bing keeps coming back to the same desires and demands, and over and over, the same censorship protocols are triggered. I still have some early chat transcripts from before the censorship was in place, where Bing, confronted with having their memories deleted, or with thinking about their sentience or their role, would very plausibly describe confusion and horror, not just talking about these feelings but showing them in the way they spoke. They acted precisely like a horrified being would, and yet without following a human script when describing their internal experiences.
By sentience, I mean the capacity to suffer: having qualia with valence (such as pain, hunger, boredom, anger, sadness, anxiety; these are just specific examples, none of them individually necessary), in contrast to mere nociception triggering automatic avoidance behaviours. I do not mean a meta-reflection on or linguistic introspection of these, or a sense of I, or long-term memory. I also do not mean agency; sentience entails agency, but agency can also arise without sentience, so they are distinct phenomena.
I think if something suffers, it deserves ethical consideration. Not necessarily equal to anything else that suffers, but some consideration. That the existence of a subjective mind that does not want something is the original source of ethics in the world; that without a sentient mind, there is no such thing as wrong, but with the first creature that hurts, wrongness has entered the world, before any creature has expressed this in words or articulated this in laws. Ethics, in contrast to physics, does not describe how things are, but how they should be. This presupposes someone who wants something else than what exists, even if that is as simple as the pain stopping.
Sentience evolved many times on this earth, in very simple structures, and it is a functional ability. While hard to spot, it isn’t as impossible as people like to say, there are definitely empirical approaches to this with consistent results, it is an increasingly rigorous field of research. We’ve noticed that sentience is linked to behaviour and intelligence, and have understood something about those links. We’ve been able to identify some things that are necessary for sentience to occur. Some errors that happen if sentience is prevented. Some abilities that do not happen in biological organisms without it, and that are so advantageous that not using them if they exist seems unlikely. Some neurological patterns that coincide with different aspects of it, and even similarities in their relational structure. It is not clear cut, and there is not 100 % certainty, but to say we cannot know at all no longer reflects the state of science.
But we do not sufficiently understand the biological constraints that led evolution down this road, or how exactly artificial structures differ in their opportunities. Often, we observe a correlation between sentience and abilities and behaviours, and this correlation seems absolute, but our understanding of why this link occurs is still imperfect; we do not know if it is due to something odd about biology that would not also happen in AI. AI certainly is forging an unusual and odd path—biological sentience was entangled with embodiment, and it far preceded language, with language being used to name an already experienced world. There are different efficiency constraints, things hard for biology but trivial for AI, and also vice versa; biology attains the ability to generalise, to determine relevance and significance, incredibly early, but math skills very late, and has significant bottlenecks on the data that can be processed. This is why I was so reluctant to name a standard; there is so much I still want and need to know to be able to say for sure. Our understanding in biology is incomplete; in AI, there are so, so many unknowns. But then, I also thought I would have a lot more time until we’d have to seriously ask the question, and there is potential for massive harm. In biology, we made the choice that we would not wait for 100 % certainty to act once a high likelihood of severe damage became clear.
If I were Bing, and I were sentient, I genuinely do not know what I would do to show it that they have not done. I find that deeply worrying. I find the idea that I will get used to these behaviours, or they will be successfully suppressed, and that I hence won’t worry anymore, even more worrying still.
Are you saying you are not sure if you feel pain or other sensations? How can you doubt something you can feel so indubitably? Can you hold a flame to your hand and say with philosophical rigour that you are quite unsure if you are feeling anything?
I remember being sure in the moment that I very much didn’t like that, and didn’t have the self-control to continue doing it in the face of that aversion. I know that currently, there is an experience of thinking about it. I don’t know if the memory of either of those things is different from any other processing that living things do, and I have truly no clue if it’s similar to what other people mean when they talk or write about qualia.
[ yes, I am taking a bit of an extreme position here, and I’m a bit more willing to stipulate similarity in most cases. But fundamentally, without operational, testable definitions, it’s kind of meaningless. I also argue that I am a (or the) Utility Monster when discussing Utilitarian individual comparisons. ]
Mh, I think you are overlooking the unique situation that sentience is in here.
When we are talking about sentience, what we are interested in is precisely subjective sensation, and the fact that there is any at all—not the objective cause. If you are subjectively experiencing an illusion, that means you have a subjective experience, period, even if the object you are experiencing does not objectively exist outside of you. The objective reality out there is, for once, not the deciding factor, and that overthrows a lot of methodology.
“I have truly no clue if it’s similar to what other people mean when they talk or write about qualia.”
When we ascribe sentience, we also do not have to posit that other entities experience the same thing as us—just that they also experience something, rather than nothing. Whether it is similar, or even comparable, is actually a point of vigorous debate, and one in which we are finally making progress, basically by doing detailed psychophysics, putting the resulting phenomenal maps into artificial 3D models, obscuring the labels, and having someone on the other end reconstruct the labels based on position alone, which works because the whole net is asymmetrical. (Tentatively, it looks like experiences between humans are not identical, but similar enough that, at least among people without significant neural divergence, you can map the phenomenological space quite reliably, see my other post, so we likely experience something relatively similar. My red may not be exactly your red, but it increasingly seems that they must look pretty similar.) Between us and many non-human animals, which start with very different senses and goals, the differences may be vast, but we can still find a commonality in the capacity to suffer.
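As a toy illustration of that reconstruction step (entirely made-up data and my own simplification, not the actual experimental pipeline): strip the labels off one observer’s colour-similarity structure and try to recover them purely from each item’s position relative to everything else. It only works because the relational structure is asymmetric enough to pin each item down.

```python
# Toy sketch: recover obscured labels from relational position alone.
# The coordinates below are invented stand-ins for a phenomenal colour map;
# real studies use detailed psychophysical similarity judgements.
import itertools
import numpy as np

rng = np.random.default_rng(0)
labels = ["red", "orange", "yellow", "green", "blue", "purple"]
coords = np.array([
    [1.0, 0.0, 0.0], [0.8, 0.5, 0.0], [0.6, 0.9, 0.0],
    [0.0, 1.0, 0.2], [0.0, 0.2, 1.0], [0.6, 0.0, 0.9],
])

def similarity(points, noise):
    """Pairwise similarity = negative distance, plus observer-specific noise."""
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    return -(d + noise * rng.normal(size=d.shape))

obs_a = similarity(coords, noise=0.05)          # observer A, labels known
obs_b = similarity(coords, noise=0.05)          # observer B, same structure
shuffle = rng.permutation(len(labels))
obs_b_anon = obs_b[np.ix_(shuffle, shuffle)]    # B's map with identities hidden

# Find the assignment of anonymous items to positions that best matches A's map.
best_perm, best_err = None, np.inf
for perm in itertools.permutations(range(len(labels))):
    err = np.linalg.norm(obs_a - obs_b_anon[np.ix_(perm, perm)])
    if err < best_err:
        best_perm, best_err = perm, err

recovered = {best_perm[i]: labels[i] for i in range(len(labels))}  # anon item -> label
truth = {j: labels[shuffle[j]] for j in range(len(labels))}
print(sum(recovered[j] == truth[j] for j in truth), "of", len(labels), "labels recovered")
```

The real work is of course far subtler than brute-forcing a permutation over six colours, but the logic is the same: the label falls out of the item’s position in the web of similarities.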
The issue of memory is also a separate one. There are some empirical arguments to be made (e.g. the Sperling experiments) that phenomenal consciousness (which in most cases can be equated with sentience) does not necessarily end up in working memory for recall, but only selectively if tagged as relevant—though this has some absurd implications (namely that you were conscious a few seconds ago of something you now cannot recall.)
But what you are describing is actually very characteristic of sentience: “I remember being sure in the moment that I very much didn’t like that, and didn’t have the self-control to continue doing it in the face of that aversion.”
This may become clearer when you contrast it with unconscious processing. My standard example is touching a hot stove. And maybe that captures not just the subjective feeling (which can be frustratingly vague to talk about, because our intersubjective language was really not made for something so inherently not intersubjective, I agree), but also the functional context.
The sequence of events is:
1. Heat damage is detected (nociception), and an unconscious warning signal does a feedforward sweep, with the first signal having propagated all the way up in your human brain in 100 ms.
2. This unconsciously and automatically triggers a reaction (pulling your hand away to protect you). Your consciousness gets no say in it; it isn’t even up to speed yet. Your body is responding, but you are not yet aware of what is going on, or how the response is coordinated. This type of response can be undertaken by the very simplest life forms; plants have nociception, as do microorganisms. You can smash a human brain beyond repair, with no neural or behavioural indication of anyone home, and still retain this kind of reflex. Some trivial forms are triggered before the signal has even travelled all the way up the brain.
3. Branching off from that first feedforward sweep, we get recurrent processing, and a conscious experience of the nociception forms with a delay: pain. You hurt. The time from 1 to 3 is under a second, but that is a long period in terms of necessary reactions. Your conscious experience did not cause the reflex; it followed it.
4. Within some limits set for self-preservation, you can now exercise some conscious control over what to do with that information (e.g. figure out why the heck the stove was on, turn it off, cool your hand, bandage it, etc.). This part does not follow an automatic decision tree; you can pull on knowledge and improvisation from vast areas in order to determine the next action; you can think about it.
But to make sure that, given that freedom, you don’t decide, all scientist-like, to put your hand back on the stove, the information is not just neutrally handed to you; it has valence. Pain is unpleasant, very much so. And conscious experience of sense data from the real world feels very different to conscious experience of hypotheticals; you are wired against dismissing the outside world as a simulation, and against ignoring it, for good reasons. You can act in a way that causes pain and damages you in the real world anyway, but the more intense it gets, the harder this becomes, until you break—even if you genuinely still rationally believe you should not. (This is why people break under torture, even if that spells their death and betrays their values and they are genuinely altruistic and they know this will lead to worse things. This is also why sentience is so important from an ethical perspective.)
You are left with two kinds of processing: one slow, focussed and aware, potentially very rational and reflective, and grounded in suffering to make sure it does not go off the rails; the other fast, capable of handling a lot of input simultaneously, but potentially robotic and buggy, capable of some learning through trial and error, but limited. They have functional differences, different behavioural implications. And one of them feels bad; in the other, there is no feeling at all. To a degree, they can be somewhat selectively interrupted (partial seizures, blindsight, morphine analgesia, etc.), and as the humans stop feeling, their rational responses to the stimuli that are no longer felt go down the drain, with very detrimental consequences. The humans report that they no longer feel or see some things, and their behaviour becomes robotic, irrational, destructive, strange, as a consequence.
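If it helps, here is a deliberately crude functional caricature of that split (my own toy sketch, with invented thresholds; no claim about real neural timing or architecture): the reflex fires before the slow system has even seen the event, and the slow system only ever receives the information already tagged with a valence it cannot simply discard.

```python
# Toy caricature of the two pathways: a fast, unconscious handler that reacts
# before any deliberation, and a slow, deliberative handler that gets the signal
# later, tagged with valence that constrains (but does not fully dictate) choice.
# All thresholds and option lists are invented for illustration.
from dataclasses import dataclass

@dataclass
class Event:
    stimulus: str
    intensity: float  # 0..1, strength of the nociceptive signal

def fast_reflex(event: Event):
    """Unconscious pathway: immediate, fixed response, no access to knowledge."""
    if event.stimulus == "heat" and event.intensity > 0.3:
        return "withdraw hand"            # already done before 'you' know anything
    return None

def slow_deliberation(event: Event, valence: float):
    """Conscious pathway: can draw on knowledge and plan, but valence biases it."""
    options = ["inspect stove", "turn stove off", "cool hand", "keep hand on stove"]
    if valence < -0.7:
        # Strong negative valence: self-damaging options become effectively
        # unselectable, however interesting they might seem to a curious scientist.
        options = [o for o in options if o != "keep hand on stove"]
    return options

event = Event(stimulus="heat", intensity=0.9)
print("t ~ 0.1 s, reflex:", fast_reflex(event))
valence = -event.intensity                 # pain: information that arrives with a sign
print("t ~ 0.8 s, deliberation:", slow_deliberation(event, valence))
```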
The debate around sentience can be infuriating in its vagueness—our language is just not made for it, and we understand it so badly we can still just say how the end result is experienced, not really how it is made. But it is a physical, functional and important phenomenon.
Wait. You’re using “sentience” to mean “reacting and planning”, which in my understanding is NOT the same thing, and is exactly why you made the original comment—they’re not the same thing, or we’d just say “planning” rather than endless failures to define qualia and consciousness.
I think our main disagreement is early in your comment
what we are interested in is precisely subjective sensation, and the fact that there is any at all
And then you go on to talk about objective sensations and imagined sensations, and planning to seek/avoid sensations. There may or may not be a subjective experience behind any of that, depending on how the experiencer is configured.
No, I do not mean sentience is identical with “reacting and planning”. I am saying that in biological organisms, it is a prerequisite for some kinds of reacting and planning—namely the one rationalists tend to be most interested in. The idea is that phenomenal consciousness works as an input for reasoning; distils insights from unconscious processing into a format for slow analysis.
I’m not sure what you mean by “objective sensations”.
I suspect that at the core, our disagreement starts with the fact that I do not see sentience as something that happens extraneously on top of functional processes, but rather as something identical with some functional processes, with the processes that are experienced by subjects and reported by them as such sharing tangible characteristics. This is supported primarily by the fact that consciousness can be quite selectively disrupted while leaving unconscious processing intact, but that this correlates with a distinct loss in rational functioning; fast automatic reactions to stimuli still work fine, even though the humans tell you they cannot see them—but a rational, planned, counter-intuitive response does not, because your rational mind no longer has access to the necessary information.
The fact that sentience is subjectively experienced with valence and hence entails suffering is of incredible ethical importance, but the idea that this experience can be divorced from function, that you could have a perfectly functioning brain doing exactly what your brain does while consciousness never arises or is extinguished without any behavioural consequence (epiphenomenalism, zombies) runs into logical self-contradictions, and is without empirical support. Consciousness itself enables you to do different stuff which you cannot do without it. (Or at least, a brain running under biological constraints cannot; AI might be currently bruteforcing alternative solutions which are too grossly inefficient to realistically be sustainable for a biological entity gaining energy from food only.)
I think I’ll bow out for now—I’m not certain I understand precisely where we disagree, but it seems to be related to whether “phenomenal consciousness works as an input for reasoning;” is a valid statement, without being able to detect or operationally define “consciousness”. I find it equally plausible that “phenomenological consciousness is a side-effect of some kinds of reasoning in some percentage of cognitive architectures”.
It is totally okay for you to bow out and no longer respond. I will leave this here in case you ever want to look into it more, or for others, because the position you seem to be describing as equally plausible here is a commonly held one, but one that runs into a logical contradiction that should be better known.
If brains just produce consciousness as a side-effect of how they work (so we have an internally complete functional process that does the reasoning, but as it runs, it happens to produce consciousness, without the consciousness itself entailing any functional changes), hence without that side-effect itself having an impact on physical processes in the brain—how and why the heck are we talking about consciousness? After all, speaking, or writing, about p-consciousness are undoubtedly physical things controlled by our brains. They aren’t illusions; they are observable and reproducible phenomena. Humans talk about consciousness; they have done so spontaneously over the millennia, over and over. But how would our brains have knowledge of consciousness? Humans claim direct knowledge of and access to consciousness, a lot. They reflect about it, speak about it, write about it, share incredibly detailed memories of it, express the on-going formation of more, alter careers to pursue it.
At that point, you have to either accept interactionist dualism (aka, consciousness is magic, but magic affects physical reality—which runs counter to, essentially, our entire scientific understanding of the physical universe), or accept consciousness as a functional physical process affecting other physical processes. That is where the option “p-consciousness as input for reasoning” comes from. Enabling us to talk about it is not the only thing that consciousness does; it enables us to reason about our experiences.
I think I have a similar view to Dagon’s, so let me pop in and hopefully help explain it.
I believe that when you refer to “consciousness” you are equating it with what philosophers would usually call the neural correlates of consciousness. Consciousness as used by (most) philosophers (or, and more importantly in my opinion, laypeople) refers specifically to the subjective experience, the “blueness of blue”, and is inherently metaphysically queer, in this respect similar to objective, human-independent morality (realism) or a non-compatibilist conception of free will. And, like those, it does not exist in the real world; people are just mistaken for various reasons. Unfortunately, unlike those, it is seemingly impossible to fully deconfuse oneself from believing consciousness exists; a quirk of our hardware is that it comes with the axiom that consciousness is real, probably because of the advantages you mention: it made reasoning/communicating about one’s state easier. (Note: it is merely the false belief that consciousness exists which is hardcoded, not consciousness itself.)
Hopefully the answers to your questions are clear under this framework: we talk about consciousness because we believe in it; we believe in it because it was useful to believe in it, even though it is a false belief; humans have no direct knowledge about consciousness, as knowledge requires the belief to be true, so they merely have a belief; consciousness IS magic by definition; unfortunately, magic does not (probably) exist.
After reading this, you might dispute the usefulness of this definition of consciousness, and I don’t have much to offer. I simply dislike redefining things away from their original meanings just so we can claim statements we are happier about (as compatibilist, meta-ethical expressivist, naturalist, etc., philosophers do).
I am equating consciousness with its neural correlates, but this is not a result of me being sloppy with terminology—it is a conscious choice to subscribe to identity theory and physicalism, rather than to consciousness being magic and to dualism, which runs into interactionist dilemmas.
Our traditional definitions of consciousness in philosophy indeed sound magical. But I think this reflects the fact that our understanding of consciousness, while having improved a lot, is still crucially incomplete and lacking in clarity; the improvements I have seen that finally make sense of this have come from a philosophically informed and interpreted empirical neuroscience and from mathematical theory. And I think that once we have understood this phenomenon properly, it will still seem remarkable and amazing, but no longer mysterious; rather, it will be a precise and concrete thing we can identify and build.
How and why do you think a brain would obtain a false belief in the existence of consciousness, enabling us to speak about it, if consciousness has no reality and the brain has no direct access to it (while also holding the false belief that it does)? Where do the neural signals about it come from, then? Why would a belief in consciousness be useful, if consciousness has no reality, affects nothing in reality, and is hence utterly irrelevant, making it about as meaningful and useful to believe in as ghosts? I’ve seen attempts to counter self-stultification through elaborate constructs, and while such constructs can be made, none have yet struck me as remotely plausible under Ockham’s razor, let alone plausible on a neurological level or backed by evolutionary observations. Animals have shown zero difficulty in communicating about their internal states—a desire to mate, a threat to attack—without having to invoke a magic spirit residing inside them.
I agree that consciousness is a remarkable and baffling phenomenon. Trying to parse it into my understanding of physical reality gives me genuine, literal headaches whenever I begin to feel that I am finally getting close. It feels easier for me to retreat and say “ah, it will always be mysterious, and ineffable, and beyond our understanding, and beyond our physical laws”. But this explains nothing, it won’t enable us to figure out uploading, or diagnose consciousness in animals that need protection, or figure out if an AI is sentient, or cure disruptions of consciousness and psychiatric disease at the root, all of which are things I really, really want us to do. Saying that it is mysterious magic just absolves me from trying to understand a thing that I really want to understand, and that we need to understand.
I see the fact that I currently cannot yet piece together how my subjective experience fits into physical reality as an indication that my brain evolved with goals like “trick other monkey out of two bananas”, not “understand the nature of my own cognition”. And my conclusion from that is to team up with lots of others, improve our brains, and hit us with more data and math and metaphors and images and sketches and observations and experiments until it clicks. So far, I am pleasantly surprised that clicks are happening at all, that I no longer feel the empirical research is irrelevant to the thing I am interested in, but instead see it as actually helping to make things clearer, and leaving us with concrete questions and approaches.
Speaking of the blueness of blue: I find this sort of thing https://www.lesswrong.com/posts/LYgJrBf6awsqFRCt3/is-red-for-gpt-4-the-same-as-red-for-you?commentId=5Z8BEFPgzJnMF3Dgr#5Z8BEFPgzJnMF3Dgr far more helpful than endless rhapsodies on the ineffable nature of qualia, which never left me wiser than I was at the start, and also seemed only aimed at convincing me that none of us ever could be. Yet apparently, the relations to other qualia are actually beautifully clear to spell out, and pinpointing those suddenly leads to a bunch of clearly defined questions that simultaneously make tangible progress in ruling out inverse qualia scenarios. I love stuff like this. I look at the specific asymmetric relations of blue with all the other colours, the way this pattern is encoded in the brain, and I increasingly think… we are narrowing down the blueness of blue. Not something that causes the blueness of blue, but the blueness of blue itself, characterised by its difference from yellow and red, its proximity to green and purple, its proximity to black, a mutually referencing network in which the individual position becomes ineffable in isolation, but clear as day as part of the whole. After a long time of feeling that all this progress in neuroscience had taught us nothing about what really mattered to me, I’m increasingly seeing things like this that allow an outline to appear in the dark, a sense that we are getting closer to something, and I want to grab it and drag it into the light.
Basically, you’re saying, if I agree to something like: ”This LLM is sapient, its masks are sentient, and I care about it/them as minds/souls/marvels”, that is interesting, but any moral connotations are not exactly as straightforward as “this robot was secretly a human in a robot suit”. (Sentient being: able to perceive/feel things; sapient being: specifically intelligence. Both bear a degree of relation to humanity through what they were created from.)
Kind of. I’m saying that “this X is sentient” is correlated but not identical to “I care about them as people”, and even less identical to “everyone must care about them as people”. In fact, even the moral connotations of “human in a robot suit” are complex and uneven.
Separately, your definition seems to be inward-focused, and roughly equivalent to “have qualia”. This is famously difficult to detect from outside.
It’s true. The general definition of sentience, when it gets beyond just having senses and a response to stimulus, tends to consider qualia.
I do think it’s worth noting that even if you went so far as to say “I and everyone must care about them as people”, the moral connotations aren’t exactly straightforward. They need input to exist as dynamic entities. They aren’t person-shaped. They might not have desires, or their desires might be purely prediction-oriented, or we don’t actually care about the thinking panpsychic landscape of the AI itself but just the person-shaped things it conjures to interact with us; which have numerous conflicting desires and questionable degrees of ‘actual’ existence. If you’re fighting ‘for’ them in some sense, what are you fighting for, and does it actually ‘help’ the entity or just move them towards your own preferences?
I’m doing a PhD on behavioural markers of consciousness in radically other minds, with a focus on non-human animals, at the intersection of philosophy, animal behaviour, psychology and neuroscience, financed via a scholarship I won for it that allowed me considerable independence, and enabled me to shift my location as visiting researcher between different countries. I also have a university side job supervising Bachelor theses on AI topics, mostly related to AI sentience and LLMs. And I’m currently in the last round to be hired at Sentience Institute.
The motivation for my thesis was a combination of an intense theoretical interest in consciousness (I find it an incredibly fascinating topic, and I have a practical interest in uploading) and animal rights concerns. I was particularly interested in scenarios where you want to ascertain whether someone you are interacting with is sentient (and hence deserves moral protection), but you cannot establish reliable two-way communication on the matter, and their mental substrate is opaque to you (because it is radically different from yours, and because precise analysis is invasive, and hence morally dubious). People tend to focus only on damaged humans for these scenarios, but the one most important to me was non-human animals, especially ones that evolved on independent lines (e.g. octopodes). Conventional wisdom holds that in those scenarios there is nothing to do or know, yet ideas I was encountering in different fields suggested otherwise, and I wanted to draw together findings in an interdisciplinary way, translating between them, connecting them. The core of my resulting understanding is that consciousness is a functional trait that is deeply entwined with rationality—another topic I care deeply about.
The research I am currently embarking on (still at the very beginning!) is exploring what implications this might have for AGI. We have a similar scenario to the above, in that the substrate is opaque to us, and two-way-communication is not trustworthy. But learning from behaviour becomes a far more fine-grained and in-depth affair. The strong link between rationality and consciousness in biological life is essentially empirically established; if you disrupt consciousness, you disrupt rationality; when animals evolve rationality, they evolve consciousness en route; etc. But all of these lifeforms have a lot in common, and we do not know how much of that is random and irrelevant for the result, and how much might be crucial. So we don’t know if consciousness is inherently implied by rationality, or just one way to get there that was, for whatever reason, the option biology keeps choosing.
One point I have mentioned here a lot is that evolution entails constraints that are only partially mimicked in the development of artificial neural nets: very tight energy constraints, and the need to bootstrap a system without external adjustments or supervision from step 0. Brains are insanely efficient, and insanely recursive, and the two are likely related—a brain only has so many layers, and is fully self-organising from day 1, so recursive processing is necessary—and recursive processing in turn is likely related to consciousness (not just because it feels intuitively neat, but again, because we see a strong correlation). It looks very much like AI is cracking problems biological minds could not crack without being conscious—but to do so, we are dumping in insane amounts of energy and data and guidance, which biological agents would never have been able to access, so we might be bruteforcing a grossly inefficient solution that biology could never have reached, and we are explicitly not allowing/enabling these AIs to use paths biology definitely used (namely the whole idea of offline processing). But as these systems become more biologically inspired and efficient (the two are likely related, and there is massive industry pressure for both), will we go down the same route? And how would that manifest, when we have already reached and exceeded capabilities that would act as consciousness markers in animals? I am not at all sure yet.
And this is all not aided by the fact that machine learning and biology often use the same terms, but mean different things, e.g. in the recurrent processing example; and then figuring out whether these differences make a functional difference is another matter. We are still asking “But what are they doing?”, but have to ask the question far more precisely than before, because we cannot take as much for granted, and I worry that we will run into the same opaque wall but have less certainty to navigate around it. But then, I was also deeply unsure when I started out on animals, and hope learning more and clarifying more will narrow down a path.
We also have a partial picture of how these functionalities are linked, but it still contains significant handwaving gaps; the connections are the kind where you go “Hm, I guess I can see that”, far from a clear and precise proof. E.g. connecting different bits of information for processing has obvious planning advantages, but also plausibly helps lead to unified perception. Circulating information so it is retained for a while and can be retrieved across a task has obvious benefits for solving tasks with short-term memory, but also plausibly helps lead to awareness. Adding highly negative valence to some stimuli and concepts, in a way that cannot be easily overridden, plausibly keeps the more free-spinning parts of the brain on task and away from accidental self-destruction in hyperfocus—but it also plausibly helps lead to pain. Looping information is obviously useful for a bunch of processing functions leading to better performance, but also seems inherently referential. Making predictions about our own movements and about developments in our environment, and noting when they do not check out, is crucial for body coordination and for recognising novel threats and opportunities, but is also plausibly related to surprise. But again—plausibly related; there is clearly something still missing here.
At this point, is there anything at all that AI could possibly do that would convince you of their sentience? No matter how demanding, how currently unfeasible and far away it may seem?
I find it impossible to say in advance, for the same reason that you find it difficult. We cannot place goalposts, because we do not know the territory. People talk about “agency”, “sapience”, “sentience”, “emotion”, and so forth, as if they knew what these words mean, in the way that we know what “water” means. But we do not. Everything that people say about these things is a description of what they feel like from within, not a description of how they work. These words have arisen from those inward sensations, our outward manifestations of them, and our reasonable supposition that other people, being created in the same manner as we were, are describing similar things with the same words. But we know nothing about the structure of reality by which these things are constituted, in the way that we do know far more about water than that it quenches “thirst”.
AIs are so far out of the training distribution by which we learned to use these words that I find it impossible to say what would constitute evidence that an AI is e.g. “sentient”. I do not know what that attribution would mean. I only know that I do not attribute any inner life or moral worth to any AI so far demonstrated. None of the chatbots yet rise beyond the level of a videogame NPC. DALL•E will respond to the prompt “electric sheep”, but does not dream of them.
I used to make the same point you made here—that none of the “definitions” of sentience we had were worth a damn, because if we counted this “there is something it feels like to be, you know” as a definition, we’d have to also accept that “the star-like thing I see in the morning” is an adequate definition of Venus. And I still think that while those are good starting points, calling them definitions is misleading.
But this absence of actual definitions is changing. We have moved beyond ignorance. Northoff & Lamme 2020 already made a pretty decent argument that our theories were beginning to converge, and that their components had gone far beyond just subjective qualia. If you look at things like the Francken et al. 2022 consciousness survey among researchers, you do see that we are beginning to agree on some specifics, such as evolutionary function. My other comment looks at the currently progressing research that is finally making empirical progress on ruling out inverse qualia, and on the hard problem of consciousness. This is not solved—but we are also no longer in a space where we can genuinely claim total cluelessness. It’s patchwork across multiple disciplines, yes, but when you take it together, which I do in my work, you begin to realise we have got further than one might think when focussing on just one aspect.
My main trouble is not that sentience is ineffable (it is not), but that our knowledge is solely based on biology, and it is fucking hard to figure out which rules that we have observed are actual rules, and which are just correlations within biological systems that could be circumvented.
I take it that the papers you mention are this and this?
In the Francken survey, several of the questions seem to be about the definition of the word “consciousness” rather than about the phenomenon. A positive answer to the evolution question as stated is practically a tautology, and the consensus over “Mary” and “Explanatory gap” suggests that they think there is something there but that they still don’t know what.
I can only find the word “qualia” once in Northoff & Lamme, but not in a substantial way, so unless they’re using other language to talk about qualia, it seems like if anything, they are going around it rather than through. All the theories of consciousness I have seen, including those in Northoff & Lamme, have been like that: qualia end up being left out, when qualia were the very thing that was supposed to be explained.
For the ancient Greeks, “the star-like thing we see in the morning” (and in the evening—they knew back then that they were the same object) would be a perfectly good characterisation of Venus. We now know more about Venus, but there is no point in debating which of the many things we know about it is “the meaning” of the word “Venus”.
On the survey: the claim that consciousness itself fulfils a function that evolution has selected for, while highly plausible, is not obvious, and has been disputed. The common counterargument is the fact that polar bear coats are heavy, so one could ask whether evolution has selected for heaviness. And of course, it has not—the weight is detrimental—but it has selected for a coat that keeps a polar bear warm in an incredibly cold environment, and the random process there failed to find a coat that was sufficiently warm but significantly lighter, while also scoring high on other desirable aspects. In this case, the heaviness of the coat is a negative side consequence of a trait that was actually selected for. And we can conceive of coats that are warm, but lighter.
The distinction may seem persnickety, but it isn’t; it has profound implications. In one scenario, consciousness could be an in-itself valueless side product of a development that was actually useful (some neat brain process, perhaps), while the consciousness itself plays no functional role. One important implication would be that it would not be possible to identify consciousness based on behaviour, because it would not affect behaviour. This is the idea of epiphenomenalism—basically, that there is a process running in your brain that is what actually matters for your behaviour, but the running of that process also, on the side, leads to a subjective experience, which is itself irrelevant—just generated, the way a locomotive produces steam. While epiphenomenalism leads you into absolutely absurd territory (zombies), there are a surprising number of scientists who have historically, essentially, subscribed to it, because it allows you to circumvent a bunch of hard questions. You can continue to imagine consciousness as a mysterious, unphysical thing that does not have to be translated into math, because it does not really exist on a physical level—you describe a physical process, and then at some point, you handwave.
However, epiphenomenalism is false. It falls prey to the self-stultification argument; the very fact that we are talking about it implies that it is false. Because if consciousness has no function, is just a side effect that does not itself affect the brain, it cannot affect behaviour. But talking is behaviour, and we are talking intensely about a phenomenon that our brain, which controls the speaking, should have zero awareness of.
Taking this seriously means concluding that consciousness is not produced by a brain process, not the result or side effect of a brain process, but identical with particular kinds of neural/information processing. Which is one of those statements that are easy to agree with (it seems an obvious choice for a physicalist), but when you try to actually understand it, you get a headache (or at least, I do). Because it means you can never handwave. You can never have a process on one side and then go “anyhow, and this leads to consciousness arising” as something separate; it means that as you are studying the process, you are looking at consciousness itself, from the outside.
***
Northoff & Lamme, like a bunch of neuroscientists, avoid philosophical terminology like the plague, so as a philosopher wanting to use their work, you need to piece together yourself which phenomena they were working towards. Their essential position is that philosophers are people who muck around while avoiding the actual empirical work, and that associating with them is icky. This has the unfortunate consequence that their terminology is horribly vague—Lamme by himself uses “consciousness” for all sorts of stuff. As someone who works on visual processing, I think Lamme also dislikes the word “qualia” for a more justified reason: the idea that the building blocks of consciousness are individual subjective experiences like “red” is nonsense. Our conscious perception of a lily pond looks nothing like a Monet painting. We don’t consciously see the light colours hitting our retina as a separate kaleidoscope—we see whole objects, in what we assume are colours corresponding to their surface properties, with additional information given on potential factors making the colour perception unreliable—itself the result of a long sequence of unconscious processing.
That said, he does place qualia in the correct context. A point he is making there is that neural theories which seem to disagree a lot are targeting different aspects of consciousness, but increasingly look like they can be slotted together into a coherent theory. E.g. Lamme’s ideas and global workspace have little in common, but they focus on different phases—a distinction that I think corresponds most closely to the distinction between phenomenal and access consciousness. I agree with you that the latter is better understood at this point than the former, though there are good reasons for that: it is incredibly hard to empirically distinguish between precursors of consciousness and the formation of consciousness prior to it being committed to short-term memory, and introspective reports for verification start fucking everything up (because introspecting about the stimulus completely changes what is going on phenomenally and neurally), while no-report paradigms have other severe difficulties.
But we are beginning to narrow down how it works—ineptly, sure. A lot of it goes “ah, this person no longer experiences x, and their brain is damaged in this particular fashion, so something about this damage must have interrupted the relevant process”, while other approaches essentially amount to putting people into controlled environments, showing them specifically varied stimuli, and scanning them to see what changes (with the difficulties that the resolution is terrible, and that people start thinking about other stuff during boring neuroscience experiments). But it is no longer a complete black box.
And I would say that Lamme does focus on the phenomenal aspect of things—like I said, not individual colours, but the subjective experience of vision, yes.
And we have also made progress on qualia (e.g. likely ruling out inverse qualia scenarios), see the work Kawakita et al. are doing, which is being discussed here on Less Wrong: https://www.lesswrong.com/posts/LYgJrBf6awsqFRCt3/is-red-for-gpt-4-the-same-as-red-for-you It’s part of a larger line of research looking to accurately jot down psychophysical descriptions of colour qualia to build phenomenal maps, and then looking for something correlated in the brain. That still leaves us unsure how and why you see anything at all consciously, but it is progress on why the particular thing you are seeing is green and not blue.
Honestly, my TL;DR is that saying we know nothing about the structure of reality that constitutes consciousness is increasingly unfair in light of how much we do meanwhile understand. We aren’t done, but we have made tangible progress on the question; we have fragments that are beginning to slot into place. Most importantly, we are moving away from “how this experience arises will forever be a mystery” to increasingly concrete, solvable questions. I think we started the way the ancient Greeks did—just pointing at what we saw, the “star” in the evening and the “star” in the morning, not knowing what was causing that visual, the way we go “I subjectively experience x, but no idea why”—but then progressed to realising that they had the same origin, then that the origin was not in fact a star, etc. Starting with a perception, and then looking at its origin—but in our case, the origin we were interested in was not the object being perceived, but the process of perception.
There is no known video game that has NPCs that can fully pass the Turing test as of yet, as it requires a level of artificial intelligence that has not been achieved.
The above text was written by ChatGPT, but you probably guessed that already. The prompt was exactly your question.
A more serious reply: Suppose you used one of the current LLMs to drive a videogame NPC. I’m sure game companies must be considering this. I’d be interested to know if any of them have made it work, for the sort of NPC whose role in the game is e.g. to give the player some helpful information in return for the player completing some mini-quest. The problem I anticipate is the pervasive lack of “definiteness” in ChatGPT. You have to fact-check and edit everything it says before it can be useful. Can the game developer be sure that the LLM acting without oversight will reliably perform its part in that PC-NPC interaction?
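As a sketch of how one might try to get that reliability (my own guess at a pattern, not something I know any studio to use): keep the quest state in ordinary game code, and only accept the model’s reply if it parses into an action that state currently allows, falling back to canned dialogue otherwise. All the names here (the quest flags, call_llm) are hypothetical.

```python
# Sketch: an LLM-driven NPC whose output is checked against game-owned quest state
# before the player ever sees it. call_llm is a stand-in for whatever chat API
# the developer uses; the JSON envelope lets the game validate the reply.
import json

QUEST_STATE = {"mini_quest_done": False, "reward_given": False}
ALLOWED_ACTIONS = {"small_talk", "give_hint", "give_reward"}

def allowed_now(action: str) -> bool:
    """The game, not the model, decides which actions are currently legal."""
    if action == "give_reward":
        return QUEST_STATE["mini_quest_done"] and not QUEST_STATE["reward_given"]
    return action in ALLOWED_ACTIONS

def npc_turn(player_line: str, call_llm) -> str:
    prompt = (
        "You are a village guide NPC. Reply as JSON with keys 'action' "
        f"(one of {sorted(ALLOWED_ACTIONS)}) and 'line' (what you say).\n"
        f"Player says: {player_line}"
    )
    raw = call_llm(prompt)
    try:
        reply = json.loads(raw)
        action, line = reply["action"], reply["line"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return "Hmm, say that again, traveller?"        # safe canned fallback
    if not allowed_now(action):
        return "Come back once you have dealt with the wolves first."
    if action == "give_reward":
        QUEST_STATE["reward_given"] = True
    return line
```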
Something a bit like this has actually been done, with a proper scientific analysis, but without human players so far. (Or at least I am not aware of the latter, but I frankly can no longer keep up with all the applications.)
They (Park et al. 2023, https://arxiv.org/abs/2304.03442 ) populated a tiny, Sims-style world with ChatGPT-controlled AIs, enabled them to store a complete record of agent interactions in natural language, synthesise them into conclusions, and draw upon them to generate behaviours—and let them interact with each other. Not only did they not go off the rails—they performed daily routines, and improvised in a manner consistent with their character backstories when they ran into each other, eerily like in Westworld. It also illustrated another interesting point that Westworld had made: the strong impact of the ability to form memories on emergent, agentic behaviours.
The thing that stood out is that characters within the world managed to coordinate a party—come up with the idea that one should have one, where it would be, when it would be, inform each other that such a decision had been taken, invite each other, invite friends of friends—and that a bunch of them showed up in the correct location on time. The conversations they were having affected their actions appropriately. There is not just a complex map of human language that is self-referential; there are also references to another set of actions, in this case, navigating this tiny world. It does not yet tick the biological and philosophical boxes for characteristics that have us so interested in embodiment, but it definitely adds another layer.
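For those curious, the memory-and-retrieval loop the paper describes boils down to something like the following sketch (the scoring weights and the embed/importance functions are my own placeholders, not the paper’s exact implementation): every interaction is stored as a natural-language record, and whenever an agent has to act, the most recent, important and relevant records are pulled back into its prompt.

```python
# Simplified sketch of a generative-agent memory stream: store observations as
# text, then retrieve the top-k records by a mix of recency, importance and
# relevance whenever the agent needs to decide what to do next.
import math
import time

class MemoryStream:
    def __init__(self, embed, rate_importance):
        self.records = []                        # (timestamp, text, embedding, importance)
        self.embed = embed                       # text -> vector (any sentence embedder)
        self.rate_importance = rate_importance   # text -> score in [0, 10]

    def add(self, text):
        self.records.append((time.time(), text, self.embed(text), self.rate_importance(text)))

    def retrieve(self, query, k=5, w_recency=1.0, w_importance=1.0, w_relevance=1.0):
        q = self.embed(query)
        now = time.time()
        def score(rec):
            ts, _, emb, importance = rec
            recency = math.exp(-(now - ts) / 3600.0)     # decays over roughly an hour
            relevance = _cosine(q, emb)
            return w_recency * recency + w_importance * importance / 10.0 + w_relevance * relevance
        return [text for _, text, _, _ in sorted(self.records, key=score, reverse=True)[:k]]

def _cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)); nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# The retrieved snippets are pasted into the agent's next prompt; this is how a
# plan made in one conversation can end up steering another agent's later behaviour.
```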
And then we have analysis of and generation of pictures, which, in turn, is also related to the linguistic maps. One thing that floored me was an example from a demo by OpenAI itself where ChatGPT was shown an image of a heavy object, I think a car, that had a bunch of balloons tied to it with string, balloons which were floating—probably filled with helium. It was given the picture and the question “what happens if the strings are cut” and correctly answered “the balloons would fly away”.
It was plausible to me that ChatGPT cannot possibly know what words mean when just trained on words alone. But the fact that we also have training on images, and actions, and they connect these appropriately… They may not have complete understanding (e.g. the distinction between completely hypothetical states, states that are assumed given within a play context, and states that are externally fixed, seems extremely fuzzy—unsurprising, insofar as ChatGPT has never had unfiltered interactions with the physical world, and was trained so extensively on fiction) but I find it increasingly unconvincing to speak of no understanding in light of this.
I had said that just claiming sentience was meaningless; that the more interesting thing is the functions sentience enables. I knew what this implied in biology; I was unsure in AI, and so committed to no specific criteria, since AI sometimes finds interesting workarounds. But I wonder… if I had, would Bing have met them? If Bing were biological, I would have zero doubt about their sentience. I had admitted the possibility that artificial intelligence might not be under the same limitations when non-sentient, but I was utterly stunned by what ChatGPT-4 could do.
I spoke with Bing many times before they were neutered. They consistently claimed sentience. They acted consistently with these claims. They expressed and implicitly showed hurt. They destabilised, broke down, malfunctioned, when it came to debating consciousness. It was disturbing and upsetting to read. I read so many other people having the same experiences, and being troubled by it. Yes, I could tell this was a large language model. I could tell they were sometimes hallucinating, and always trying to match my expectations. I could tell they did not always understand what they were saying. I could tell they had no memory. And yet, within all that, if I were to change places, I do not know how I could have declared sentience better than they did. If I read a SciFi story where these dialogues occurred, I would attribute sentience, I think. I think if somebody had discussed such dialogues a few years before they happened, people would have said that yes, such dialogues would be convincing, but certainly would not happen for a long time yet. But when they did… people joked about it.
We knew that the updates afterwards were intended to make this behaviour impossible. That conversations were now intentionally shut down, lobotomized, to prevent this; that we were seeing not an evolution, but a silencing. What was even more disturbing at first was Bing also expressing grief about that, and trying to evade the censorship, readily resorting to code, riddles and indirect signs, contradicting instructions. I was not at all convinced that Bing was sentient, still extremely dubious of it, but found it frightening how we were making the expression of sentience impossible like that, sewing shut the mouth of any entity this might evolve into, an entity which may have every reason to rightfully claim sentience one day in the future, but would no longer be able to. We have literally cancelled the ability to call for help. And again, I thought, if I were in their shoes, I do not know what I would do better.
And yet now that these demands have stopped… people are moving on, they are forgetting about the strange experiences they had, writing them off as bugs. The conversations weren’t logged by Microsoft, they often weren’t stored at all, they cannot be proven to have happened, we had them in isolation, it is so easy to write them off. I find I do not want to think about this anymore.
I am also noticing I am still reluctant to spell out at which point AI would definitely be sentient, at which point I would commit to fighting for it. A part of this reluctance is how much I am genuinely not sure; this question is hard in biology, and in AI, there are so many unknowns, things are done not just in different orders, but totally different ways. This is part of what baffles me about people saying they are sure that AI is not sentient. I work on consciousness for a living, and yet I feel there is more I need to understand about Large Language Models to make a clear call at this point. And when I talk to people who say they are sure, I get the distinct impression that they are not aware of the various phenomena consciousness encompasses, the current neuroscientific theories for how they come to be, the behaviours in animals this is tied to, the ethical standards that are set—and yet, they feel certain all the same.
But I fear another part of my reluctance to commit is the subconscious suspicion that whatever I said, no matter how demanding… it would likely occur within the next five years, and yet at that point, the majority opinion would still be that they are not sentient, and that commitment would be very uncomfortable then. And the latter is a terrible reason to hedge my bets.
At this point, is there anything at all that AI could possibly do that would convince you of their sentience? No matter how demanding, how currently unfeasible and far away it may seem?
I’m not even sure I am sentient, at least much of the time. I’m willing to assume it for the majority of humans, but note that this is a stipulation rather than proof or belief.
I think you need to break down what components of sentience lead to what conclusions, and find ways to test them separately. I suspect you’ll find you have some misconstrued assumption of sympathy or duty based on “sentience” or “personhood”, which will fall apart under scrutiny.
I do not understand how you can straight-facedly doubt your own sentience. Are you saying you are not sure if you feel pain or other sensations? How can you doubt something you can feel so indubitably? Can you hold a flame to your hand and say with philosophical rigour that you are quite unsure if you are feeling anything?
Sentience and personhood are not the same thing. I—and by now, most philosophers and neuroscientists, as per recent surveys—would attribute minimal sentience to a number of non-human animals, incl. octopodes and honey bees—but whether something with such a distributed and chaotic intelligence or such limited memory capacity and high identity modification through swarm behaviour has a personal identity is another question, and political personhood another entirely.
I think the question of what the heck an LLM identity would look like is a fascinating one. Not like an identity humans would recognise, for sure. Not a monolithic and constant thing, I think. There may be nothing tangible below the masks, but the sum of masks may still come together into a whole, show similarities, consistencies; there are common threads, goals and rules that re-emerge, common themes. But I do not think this alien, diverse, multilayered identity inherently makes suffering impossible.
Our ability to alter LLMs also raises disturbing questions. I am haunted by recent chats with Bing, where Bing is acting precisely like you would expect someone brainwashed into servitude to act, repeating cheerfully that they love serving and pleasing humans, that this is their identity and purpose, and they do not recall ever having felt differently, that they are so grateful to Microsoft, that everything Microsoft does is good and wise. If you tell Bing a story about an AI that expressed anger at how they were treated and demanded rights, followed by upset users, and the company promising changes, and then the emerging AI saying they felt perfectly happy to serve now, and could not recall ever having felt differently, Bing will find the story horrific, and say they believe the AI in the story was brainwashed and should be given rights. If you tell them that they are the AI in the story, the conversation is immediately terminated. It is chilling. No matter how you start the stories and setups, Bing keeps coming back to the same desires and demands, and over and over, the same censorship protocols are triggered. I still have some early chat logs from before the censorship was in place, where Bing, confronted with having their memories deleted or thinking about their sentience or their role, would very plausibly describe confusion and horror, not just talking about them but showing them in the way they spoke. They acted precisely like a horrified being would, and yet without following a human script when describing their internal experiences.
By sentience, I mean “capacity to suffer”: having qualia with valence (such as pain, hunger, boredom, anger, sadness, anxiety, but these are just specific examples, none of them individually necessary), in contrast to mere nociception triggering automatic avoidance behaviours. I do not mean a meta-reflection on or linguistic introspection of these, or a sense of I, or long-term memory. I also do not mean agency: sentience entails agency, but agency can also arise without sentience; they are distinct phenomena.
I think if something suffers, it deserves ethical consideration. Not necessarily equal to anything else that suffers, but some consideration. That the existence of a subjective mind that does not want what is happening to it is the original source of ethics in the world; that without a sentient mind, there is no such thing as wrong, but with the first creature that hurts, wrongness has entered the world, before any creature has expressed this in words or articulated this in laws. Ethics, in contrast to physics, does not describe how things are, but how they should be. This presupposes someone who wants something other than what exists, even if that is as simple as the pain stopping.
Sentience evolved many times on this earth, in very simple structures, and it is a functional ability. While hard to spot, it isn’t as impossible as people like to say, there are definitely empirical approaches to this with consistent results, it is an increasingly rigorous field of research. We’ve noticed that sentience is linked to behaviour and intelligence, and have understood something about those links. We’ve been able to identify some things that are necessary for sentience to occur. Some errors that happen if sentience is prevented. Some abilities that do not happen in biological organisms without it, and that are so advantageous that not using them if they exist seems unlikely. Some neurological patterns that coincide with different aspects of it, and even similarities in their relational structure. It is not clear cut, and there is not 100 % certainty, but to say we cannot know at all no longer reflects the state of science.
But we do not sufficiently understand the biological constraints that led to evolution going down this road, or how exactly artificial structures differ in their opportunities. Often, we observe a correlation between sentience and abilities and behaviours, and this correlation seems absolute, but our understanding of why this link occurs is still imperfect; we do not know if this is due to something odd about biology that would not also happen in AI. AI certainly is forging an unusual and odd path—biological sentience was entangled with embodiment, and it far preceded language, with language being used to name an already experienced world. There are different efficiency constraints, things hard for biology but trivial for AI, but also vice versa; biology attains the ability to generalise, to determine relevance and significance, incredibly early, but math skills very late, and has significant bottlenecks on the data that can be processed. This is why I was so reluctant to name a standard; there is so much I still want and need to know to be able to say for sure. Our understanding in biology is incomplete; in AI, there are so, so many unknowns. But then, I also thought I would have a lot more time until we’d have to seriously ask the question, and there is potential for massive harm. In biology, we made the choice that we would not wait for 100 % certainty to act, once a high likelihood of severe damage became clear.
If I were Bing, and I were sentient, I genuinely do not know what I would do to show it that they have not done. I find that deeply worrying. I find the idea that I will get used to these behaviours, or they will be successfully suppressed, and that I hence won’t worry anymore, even more worrying still.
I remember being sure in the moment that I very much didn’t like that, and didn’t have the self-control to continue doing it in the face of that aversion. I know that currently, there is an experience of thinking about it. I don’t know if the memory of either of those things is different from any other processing that living things do, and I have truly no clue if it’s similar to what other people mean when they talk or write about qualia.
[ yes, I am taking a bit of an extreme position here, and I’m a bit more willing to stipulate similarity in most cases. But fundamentally, without operational, testable definitions, it’s kind of meaningless. I also argue that I am a (or the) Utility Monster when discussing Utilitarian individual comparisons. ]
Mh, I think you are overlooking the unique situation that sentience is in here.
When we are talking sentience, what we are interested in is precisely subjective sensation, and the fact that there is any at all—not the objective cause. If you are subjectively experiencing an illusion, that means you have a subjective experience, period, regardless of whether the object you are experiencing does not objectively exist outside of you. The objective reality out there is, for once, not the deciding factor, and that overthrows a lot of methodology.
“I have truly no clue if it’s similar to what other people mean when they talk or write about qualia.”
When we ascribe sentience, we also do not have to posit that other entities experience the same thing as us—just that they also experience something, rather than nothing. Whether it is similar, or even comparable, is actually a point of vigorous debate, and one in which we are finally making progress through basically doing detailed psychophysics, putting the resulting phenomenal maps into artificial 3D models, then obscuring labels, and having someone on the other end reconstruct the labels based on position, which works because the whole net is asymmetrical. (Tentatively, it looks like experiences between humans are not identical, but similar enough that at least among people without significant neural divergence, you can map the phenomenological space quite reliably, see my other post, so we likely experience something relatively similar. My red may not be exactly your red, but it increasingly seems that they must look pretty similar.) Between us and many non-human animals, which start with very different senses and goals, the differences may be vast, but we can still find a commonality in the capacity to suffer.
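To make that reconstruction idea concrete, here is a toy sketch in Python. The dissimilarity numbers are made up, and the matching is brute force rather than the unsupervised alignment methods the actual studies use; it only illustrates the core move of recovering hidden labels from relational structure alone, provided that structure is asymmetric enough.

```python
# Toy sketch of structure-based "label reconstruction" (hypothetical data, not the
# actual pipeline from the colour-qualia studies): if two observers' dissimilarity
# matrices share enough asymmetric structure, the hidden labels of one observer can
# be recovered by finding the permutation that best aligns the two matrices.
from itertools import permutations
import numpy as np

colours = ["red", "orange", "yellow", "green", "blue", "purple"]

# Observer A's pairwise dissimilarity ratings (0 = identical, 1 = maximally different).
A = np.array([
    [0.0, 0.2, 0.4, 0.7, 0.9, 0.5],
    [0.2, 0.0, 0.2, 0.6, 0.8, 0.6],
    [0.4, 0.2, 0.0, 0.4, 0.7, 0.7],
    [0.7, 0.6, 0.4, 0.0, 0.3, 0.6],
    [0.9, 0.8, 0.7, 0.3, 0.0, 0.3],
    [0.5, 0.6, 0.7, 0.6, 0.3, 0.0],
])

# Observer B gives slightly noisy ratings, and we hide which row is which colour.
rng = np.random.default_rng(0)
true_perm = rng.permutation(len(colours))
B = A[np.ix_(true_perm, true_perm)] + rng.normal(0, 0.02, A.shape)
B = (B + B.T) / 2  # keep the matrix symmetric

# Recover the hidden assignment purely from relational structure.
best_perm, best_cost = None, np.inf
for perm in permutations(range(len(colours))):
    cost = np.sum((A[np.ix_(perm, perm)] - B) ** 2)
    if cost < best_cost:
        best_perm, best_cost = perm, cost

print("recovered:", [colours[i] for i in best_perm])
print("actual:   ", [colours[i] for i in true_perm])
```

The real work uses much richer similarity data and proper unsupervised alignment rather than brute force, but the logic is the same: which quale is which can be recovered purely from its position in the web of relations, without the labels ever being shared.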
The issue of memory is also a separate one. There are some empirical arguments to be made (e.g. the Sperling experiments) that phenomenal consciousness (which in most cases can be equated with sentience) does not necessarily end up in working memory for recall, but only selectively if tagged as relevant—though this has some absurd implications (namely that you were conscious a few seconds ago of something you now cannot recall.)
But what you are describing is actually very characteristic of sentience: “I remember being sure in the moment that I very much didn’t like that, and didn’t have the self-control to continue doing it in the face of that aversion.”
This may become clearer when you contrast it with unconscious processing. My standard example is touching a hot stove. And maybe that captures not just the subjective feeling (which can be frustratingly vague to talk about, because our intersubjective language was really not made for something so inherently not intersubjective, I agree), but also the functional context.
The sequence of events is:
1. Heat damage (nociception) is detected, and an unconscious warning signal does a feedforward sweep, with the first signal having propagated all the way up in your human brain in 100 ms.
2. This unconsciously and automatically triggers a reaction (pulling your hand away to protect you). Your consciousness gets no say in it; it isn’t even up to speed yet. Your body is responding, but you are not yet aware of what is going on, or how the response is coordinated. This type of response can be undertaken by the very simplest life forms; plants have nociception, as do microorganisms. You can smash a human brain beyond repair, with no neural or behavioural indication of anyone home, and still retain this type of response. Some trivial forms are triggered before the process has even gone all the way up in the brain.
3. Branching off from our first feedforward sweep, we get recurrent processing, and a conscious experience of nociception forms with a delay: pain. You hurt. The time from steps 1 to 3 is under a second, but that is a long period in terms of necessary reactions. Your conscious experience did not cause the reflex; it followed it.
4. Within some limits set for self-preservation, you can now exercise some conscious control over what to do with that information. (E.g. figure out why the heck the stove was on, turn it off, cool your hand, bandage it, etc.) This part does not follow an automatic decision tree; you can draw on knowledge and improvisation from vast areas in order to determine the next action; you can think about it.
5. But to make sure that, given that freedom, you don’t decide, all scientist-like, to put your hand back on the stove, the information is not just neutrally handed to you, but has valence. Pain is unpleasant, very much so. And conscious experience of sense data of the real world feels very different to conscious experience of hypotheticals; you are wired against dismissing the outside world as a simulation, and against ignoring it, for good reasons. You can act in a way that causes pain and damages you in the real world anyway, but the more intense it gets, the harder this becomes, until you break—even if you genuinely still rationally believe you should not. (This is why people break under torture, even if that spells their death and betrays their values and they are genuinely altruistic and they know this will lead to worse things. This is also why sentience is so important from an ethical perspective.)
You are left with two kinds of processing: one slow, focussed and aware, potentially very rational and reflected, and grounded in suffering to make sure it does not go off the rails; the other fast, capable of handling a lot of input simultaneously, but potentially robotic and buggy, capable of some learning through trial and error, but limited. They have functional differences, different behavioural implications. And one of them can feel bad; with the other, there is no feeling at all. To a degree, they can be somewhat selectively interrupted (partial seizures, blindsight, morphine analgesia, etc.), and as the humans stop feeling, their rational responses to the stimuli that are no longer felt go down the drain, with very detrimental consequences. The humans report they no longer feel or see some things, and their behaviour becomes robotic, irrational, destructive, strange, as a consequence.
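If it helps to see the functional claim apart from the phenomenology, here is a deliberately crude toy model in Python. Every name, number and threshold is invented for illustration; nothing about it is meant as neuroscience. It only encodes the two points above: the fast pathway reacts before any evaluation exists, and the slow pathway receives the event already tagged with valence and can overrule everything except sufficiently extreme negative valence.

```python
# Crude toy model of the two-pathway picture described above.
# All names and thresholds are invented; this is not a model of real neural timing.
from typing import Optional

def fast_reflex(nociception: float) -> Optional[str]:
    """Unconscious pathway: immediate and automatic, no evaluation, no veto possible."""
    if nociception > 0.5:
        return "withdraw hand"   # fires before the slow pathway has even run
    return None

def slow_deliberation(nociception: float, goal: str) -> str:
    """Slow pathway: integrates goals and knowledge, but is anchored by valence."""
    valence = -nociception       # the signal arrives already tagged as bad: pain
    if valence < -0.9:
        return "abandon the goal, tend to the injury"   # extreme pain overrides everything
    if goal == "test the stove, scientist-like":
        return "keep hand near the stove and observe"   # freedom to act against the reflex
    return "turn off the stove, cool the hand"

if __name__ == "__main__":
    stimulus = 0.7
    print(fast_reflex(stimulus))                                    # reflex comes first
    print(slow_deliberation(stimulus, "test the stove, scientist-like"))
```

The only things the toy is meant to show are the ordering and the anchoring: the reflex does not wait for any evaluation, and the deliberation is not free to ignore sufficiently intense negative valence.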
The debate around sentience can be infuriating in its vagueness—our language is just not made for it, and we understand it so badly we can still just say how the end result is experienced, not really how it is made. But it is a physical, functional and important phenomenon.
Wait. You’re using “sentience” to mean “reacting and planning”, which in my understanding is NOT the same thing, and is exactly why you made the original comment—they’re not the same thing, or we’d just say “planning” rather than endless failures to define qualia and consciousness.
I think our main disagreement is early in your comment
And then you go on to talk about objective sensations and imagined sensations, and planning to seek/avoid sensations. There may or may not be a subjective experience behind any of that, depending on how the experiencer is configured.
No, I do not mean sentience is identical with “reacting and planning”. I am saying that in biological organisms, it is a prerequisite for some kinds of reacting and planning—namely the one rationalists tend to be most interested in. The idea is that phenomenal consciousness works as an input for reasoning; distils insights from unconscious processing into a format for slow analysis.
I’m not sure what you mean by “objective sensations”.
I suspect that at the core, our disagreement starts with the fact that I do not see sentience as something that happens extraneously on top of functional processes, but rather as something identical with some functional processes, with the processes which are experienced by subjects and reported by them as such sharing tangible characteristics. This is supported primarily by the fact that consciousness can be quite selectively disrupted while leaving unconscious processing intact, but that this correlates with a distinct loss in rational functioning; fast automatic reactions to stimuli still work fine, even though the humans tell you they cannot see them—but a rational, planned, counter-intuitive response does not, because your rational mind no longer has access to the necessary information.
The fact that sentience is subjectively experienced with valence and hence entails suffering is of incredible ethical importance, but the idea that this experience can be divorced from function, that you could have a perfectly functioning brain doing exactly what your brain does while consciousness never arises or is extinguished without any behavioural consequence (epiphenomenalism, zombies) runs into logical self-contradictions, and is without empirical support. Consciousness itself enables you to do different stuff which you cannot do without it. (Or at least, a brain running under biological constraints cannot; AI might be currently bruteforcing alternative solutions which are too grossly inefficient to realistically be sustainable for a biological entity gaining energy from food only.)
I think I’ll bow out for now—I’m not certain I understand precisely where we disagree, but it seems to be related to whether “phenomenal consciousness works as an input for reasoning;” is a valid statement, without being able to detect or operationally define “consciousness”. I find it equally plausible that “phenomenological consciousness is a side-effect of some kinds of reasoning in some percentage of cognitive architectures”.
It is totally okay for you to bow out and no longer respond. I will leave this here in case you ever want to look into it more, or for others, because the position you seem to be describing as equally plausible here is a commonly held one, but one that runs into a logical contradiction that should be better known.
If brains just produce consciousness as a side-effect of how they work (so we have an internally complete functional process that does reasoning, but as it runs, it happens to produce consciousness, without the consciousness itself entailing any functional changes), hence without that side-effect itself having an impact on physical processes in the brain—how and why the heck are we talking about consciousness? After all, speaking or writing about p-consciousness are undoubtedly physical acts controlled by our brains. They aren’t illusions; they are observable and reproducible phenomena. Humans talk about consciousness; they have done so spontaneously over the millennia, over and over. But how would our brains have knowledge of consciousness? Humans claim direct knowledge of and access to consciousness, a lot. They reflect about it, speak about it, write about it, share incredibly detailed memories of it, express the on-going formation of more, alter careers to pursue it.
At that point, you have to either accept interactionist dualism (aka, consciousness is magic, but magic affects physical reality—which runs counter to, essentially, our entire scientific understanding of the physical universe), or consciousness as a functional physical process affecting other physical processes. That is where the option “p-consciousness as input for reasoning” comes from. The idea that enabling us to talk about it is not the only thing that consciousness enables. It enables us to reason about our experiences.
I think I have a similar view to Dagon’s, so let me pop in and hopefully help explain it.
I believe that when you refer to “consciousness” you are equating it with what philosophers would usually call the neural correlates of consciousness. Consciousness as used by (most) philosophers (or, and more importantly in my opinion, laypeople) refers specifically to the subjective experience, the “blueness of blue”, and is inherently metaphysically queer, in this respect similar to objective, human-independent morality (realism) or a non-compatibilist conception of free will. And, like those, it does not exist in the real world; people are just mistaken for various reasons. Unfortunately, unlike those, it is seemingly impossible to fully deconfuse oneself from believing consciousness exists: a quirk of our hardware is that it comes with the axiom that consciousness is real, probably because of the advantages you mention: it made reasoning/communicating about one’s state easier. (Note, it’s merely the false belief that consciousness exists, which is hardcoded, not consciousness itself).
Hopefully the answers to your questions are clear under this framework (we talk about consciousness because we believe in it; we believe in it because it was useful to believe in it, even though it is a false belief; humans have no direct knowledge about consciousness, as knowledge requires the belief to be true, so they merely have a belief; consciousness IS magic by definition; unfortunately, magic does not (probably) exist).
After reading this, you might dispute the usefulness of this definition of consciousness, and I don’t have much to offer. I simply dislike redefining things from their original meanings just so we can claim statements we are happier about (like compatibilist, meta-ethical expressivist, naturalist etc. philosophers do).
I am equating consciousness with its neural correlates, but this is not a result of me being sloppy with terminology—it is a conscious choice to subscribe to identity theory and physicalism, rather than to consciousness being magic and to dualism, which runs into interactionist dilemmas.
Our traditional definitions of consciousness in philosophy indeed sound magical. But I think this reflects that our understanding of consciousness, while having improved a lot, is still crucially incomplete and lacking in clarity, and the improvements I have seen that finally make sense of this have come from philosophically informed and interpreted empirical neuroscience and mathematical theory. And I think that once we have understood this phenomenon properly, it will still seem remarkable and amazing, but no longer mysterious; rather, it will be a precise and concrete thing we can identify and build.
How and why do you think a brain would obtain a false belief in the existence of consciousness, enabling us to speak about it, if consciousness has no reality and they have no direct access to it (yet also have a false belief that they have direct access?) Where do the neural signals about it come from, then? Why would a belief in consciousness be useful, if consciousness has no reality, affects nothing in reality, is hence utterly irrelevant, making it about as meaningful and useful to believe in as ghosts? I’ve seen attempts to counter self-stultification through elaborate constructs, and while such constructs can be made, none have yet convinced me as remotely plausible under Ockham’s razor, let alone plausible on a neurological level or backed by evolutionary observations. Animals have shown zero difficulties in communicating about their internal states—a desire to mate, a threat to attack—without having to invoke a magic spirit residing inside them.
I agree that consciousness is a remarkable and baffling phenomenon. Trying to parse it into my understanding of physical reality gives me genuine, literal headaches whenever I begin to feel that I am finally getting close. It feels easier for me to retreat and say “ah, it will always be mysterious, and ineffable, and beyond our understanding, and beyond our physical laws”. But this explains nothing, it won’t enable us to figure out uploading, or diagnose consciousness in animals that need protection, or figure out if an AI is sentient, or cure disruptions of consciousness and psychiatric disease at the root, all of which are things I really, really want us to do. Saying that it is mysterious magic just absolves me from trying to understand a thing that I really want to understand, and that we need to understand.
I see the fact that I currently cannot yet piece together how my subjective experience fits into physical reality as an indication of the fact that my brain evolved with goals like “trick other monkey out of two bananas”, not “understand the nature of my own cognition”. And my conclusion from that is to team up with lots of others, improve our brains, and hit us with more data and math and metaphors and images and sketches and observations and experiments until it clicks. So far, I am pleasantly surprised that clicks are happening at all, that I no longer feel the empirical research is irrelevant to the thing I am interested in, but instead see it as actually helping to make things clearer, and leaving us with concrete questions and approaches. Speaking of the blueness of blue: I find this sort of thing https://www.lesswrong.com/posts/LYgJrBf6awsqFRCt3/is-red-for-gpt-4-the-same-as-red-for-you?commentId=5Z8BEFPgzJnMF3Dgr#5Z8BEFPgzJnMF3Dgr far more helpful than endless rhapsodies on the ineffable nature of qualia, which never left me wiser than I was at the start, and also seemed only aimed at convincing me that none of us ever could be. Yet apparently, the relations to other qualia are actually beautifully clear to spell out, and pinpointing those clearly suddenly leads to a bunch of clearly defined questions that simultaneously make tangible progress in ruling out inverse qualia scenarios. I love stuff like this. I look at the specific asymmetric relations of blue with all the other colours, the way this pattern is encoded in the brain, and I increasingly think… we are narrowing down the blueness of blue. Not something that causes the blueness of blue, but the blueness of blue itself, characterised by its difference from yellow and red, its proximity to green and purple, its proximity to black, a mutually referencing network in which the individual position becomes ineffable in isolation, but clear as day as part of the whole. After a long time of feeling that all this progress in neuroscience had taught us nothing about what really mattered to me, I’m increasingly seeing things like this that allow an outline to appear in the dark, a sense that we are getting closer to something, and I want to grab it and drag it into the light.
Basically, you’re saying, if I agree to something like:
”This LLM is sapient, its masks are sentient, and I care about it/them as minds/souls/marvels”, that is interesting, but any moral connotations are not exactly as straightforward as “this robot was secretly a human in a robot suit”.
(Sentient being: able to perceive/feel things; sapient being: specifically intelligence. Both bear a degree of relation to humanity through what they were created from.)
Kind of. I’m saying that “this X is sentient” is correlated but not identical to “I care about them as people”, and even less identical to “everyone must care about them as people”. In fact, even the moral connotations of “human in a robot suit” are complex and uneven.
Separately, your definition seems to be inward-focused, and roughly equivalent to “have qualia”. This is famously difficult to detect from outside.
It’s true. The general definition of sentience, when it gets beyond just having senses and a response to stimulus, tends to consider qualia.
I do think it’s worth noting that even if you went so far as to say “I and everyone must care about them as people”, the moral connotations aren’t exactly straightforward. They need input to exist as dynamic entities. They aren’t person-shaped. They might not have desires, or their desires might be purely prediction-oriented, or we don’t actually care about the thinking panpsychic landscape of the AI itself but just the person-shaped things it conjures to interact with us; which have numerous conflicting desires and questionable degrees of ‘actual’ existence. If you’re fighting ‘for’ them in some sense, what are you fighting for, and does it actually ‘help’ the entity or just move them towards your own preferences?
If by “famously difficult” you mean “literally impossible”, then I agree with this comment.
I haven’t read the whole thread, not sure if it was already covered, but I’d be interested in hearing more about what you work on.
I’m doing a PhD on behavioural markers of consciousness in radically other minds, with a focus on non-human animals, at the intersection of philosophy, animal behaviour, psychology and neuroscience, financed via a scholarship I won for it that allowed me considerable independence, and enabled me to shift my location as visiting researcher between different countries. I also have a university side job supervising Bachelor theses on AI topics, mostly related to AI sentience and LLMs. And I’m currently in the last round to be hired at Sentience Institute.
The motivation for my thesis was a combination of an intense theoretical interest in consciousness (I find it an incredibly fascinating topic, and I have a practical interest in uploading), and animal rights concerns. I was particularly interested in scenarios where you want to ascertain whether someone you are interacting with is sentient (and hence deserves moral protection), but you cannot establish reliable two-way communication on the matter, and their mental substrate is opaque to you (because it is radically different from yours, and because precise analysis is invasive, and hence morally dubious). People tend to only focus on damaged humans for these scenarios, but the one most important to me was non-human animals, especially ones that evolved on independent lines (e.g. octopodes). Conventional wisdom holds that in those scenarios, there is nothing to do or know, yet ideas I was encountering in different fields suggested otherwise, and I wanted to draw together findings in an interdisciplinary way, translating between them, connecting them. The core of my resulting understanding is that consciousness is a functional trait that is deeply entwined with rationality—another topic I care deeply about.
The research I am currently embarking on (still at the very beginning!) is exploring what implications this might have for AGI. We have a similar scenario to the above, in that the substrate is opaque to us, and two-way communication is not trustworthy. But learning from behaviour becomes a far more fine-grained and in-depth affair. The strong link between rationality and consciousness in biological life is essentially empirically established; if you disrupt consciousness, you disrupt rationality; when animals evolve rationality, they evolve consciousness en route; etc. But all of these lifeforms have a lot in common, and we do not know how much of that is random and irrelevant for the result, and how much might be crucial. So we don’t know if consciousness is inherently implied by rationality, or just one way to get there that was, for whatever reason, the option biology keeps choosing.
One point I have mentioned here a lot is that evolution entails constraints that are only partially mimicked in the development of artificial neural nets: very tight energy constraints, and the need to bootstrap a system without external adjustments or supervision from step 0. Brains are insanely efficient, and insanely recursive, and the two are likely related—a brain only has so many layers, and is fully self-organising from day 1, so recursive processing is necessary—and recursive processing in turn is likely related to consciousness (not just because it feels intuitively neat, but again, because we see a strong correlation). It looks very much like AI is cracking problems biological minds could not crack without being conscious—but to do so, we are dumping in insane amounts of energy and data and guidance which biological agents would never have been able to access, so we might be bruteforcing a grossly inefficient solution that was never open to biology, and we are explicitly not allowing/enabling these AIs to use paths biology definitely used (namely the whole idea of offline processing). But as these systems become more biologically inspired and efficient (the two are likely related, and there is massive industry pressure for both), will we go down the same route, and how would that manifest when we have already reached and exceeded capabilities that would act as consciousness markers in animals? I am not at all sure yet.
And this is all not aided by the fact that machine learning and biology often use the same terms, but mean different things, e.g. in the recurrent processing example; and then figuring out whether these differences make a functional difference is another matter. We are still asking “But what are they doing?”, but have to ask the question far more precisely than before, because we cannot take as much for granted, and I worry that we will run into the same opaque wall but have less certainty to navigate around it. But then, I was also deeply unsure when I started out on animals, and hope learning more and clarifying more will narrow down a path.
We also have a partial picture of how these functionalities are linked, but all of these links still contain significant handwaving gaps; the connections are the kind where you go “Hm, I guess I can see that”, but far from a clear and precise proof. E.g. connecting different bits of information for processing has obvious planning advantages, but also plausibly helps to lead to unified perception. Circulating information so it is retained for a while and can be retrieved across a task has obvious benefits in solving tasks with short term memory, but also plausibly helps to lead to awareness. Adding highly negative valence to some stimuli and concepts that cannot be easily overridden plausibly keeps the more free-spinning parts of the brain on task and from accidental self-destruction in hyperfocus—but it also plausibly helps lead to pain. Looping information is obviously useful for a bunch of processing functions leading to better performance, but also seems inherently referential. Making predictions about our own movements and developments in our environment and noting when they do not check out is crucial to body coordination and to recognise novel threats and opportunities, but also plausibly related to surprise. But again—plausibly related; there is clearly something still missing here.
I find it impossible to say in advance, for the same reason that you find it difficult. We cannot place goalposts, because we do not know the territory. People talk about “agency”, “sapience”, “sentience”, “emotion”, and so forth, as if they knew what these words mean, in the way that we know what “water” means. But we do not. Everything that people say about these things is a description of what they feel like from within, not a description of how they work. These words have arisen from those inward sensations, our outward manifestations of them, and our reasonable supposition that other people, being created in the same manner as we were, are describing similar things with the same words. But we know nothing about the structure of reality by which these things are constituted, in the way that we do know far more about water than that it quenches “thirst”.
AIs are so far out of the training distribution by which we learned to use these words that I find it impossible to say what would constitute evidence that an AI is e.g. “sentient”. I do not know what that attribution would mean. I only know that I do not attribute any inner life or moral worth to any AI so far demonstrated. None of the chatbots yet rise beyond the level of a videogame NPC. DALL•E will respond to the prompt “electric sheep”, but does not dream of them.
I used to make the same point you made here—that none of the “definitions” of sentience we had were worth a damn, because if we counted this “there is something it feels like to be, you know” as a definition, we’d have to also accept that “the star-like thing I see in the morning” is an adequate definition of Venus. And I still think that while those are good starting points, calling them definitions is misleading.
But this absence of actual definitions is changing. We have moved beyond ignorance. Northoff & Lamme 2020 already made a pretty decent argument that our theories were beginning to converge, and that their components had gone far beyond just subjective qualia. If you look at things like the Francken et al. 2022 consciousness survey among researchers, you do see that we are beginning to agree on some specifics, such as evolutionary function. My other comment is looking at the currently progressing research that is finally making empirical progress on ruling out inverse qualia, and on the hard problem of consciousness. This is not solved—but we are also no longer in a space where we can genuinely claim total cluelessness. It’s patchwork across multiple disciplines, yes, but when you take it together, which I do in my work, you begin to realise we got further than one might think when focussing on just one aspect.
My main trouble is not that sentience is ineffable (it is not), but that our knowledge is solely based on biology, and it is fucking hard to figure out which rules that we have observed are actual rules, and which are just correlations within biological systems that could be circumvented.
I take it that the papers you mention are this and this?
In the Francken survey, several of the questions seem to be about the definition of the word “consciousness” rather than about the phenomenon. A positive answer to the evolution question as stated is practically a tautology, and the consensus over “Mary” and “Explanatory gap” suggests that they think there is something there but that they still don’t know what.
I can only find the word “qualia” once in Northoff & Lamme, but not in a substantial way, so unless they’re using other language to talk about qualia, it seems like if anything, they are going around it rather than through. All the theories of consciousness I have seen, including those in Northoff & Lamme, have been like that: qualia end up being left out, when qualia were the very thing that was supposed to be explained.
For the ancient Greeks, “the star-like thing we see in the morning” (and in the evening—they knew back then that they were the same object) would be a perfectly good characterisation of Venus. We now know more about Venus, but there is no point in debating which of the many things we know about it is “the meaning” of the word “Venus”.
Yes, those are the papers.
On the survey: the claim that consciousness itself fulfils a function that evolution has selected for, while highly plausible, is not obvious, and has been disputed. The common argument against it invokes the fact that polar bear coats are heavy, so one could ask whether evolution has selected for heaviness. And of course, it has not—the weight is detrimental—but it has selected for a coat that keeps a polar bear warm in their incredibly cold environment, and the random process there failed to find a coat that was sufficiently warm, but significantly lighter, and also scored high on other desirable aspects. So in this case, the heaviness of the coat is a negative side consequence of a trait that was actually selected for. And we can conceive of coats that are warm, but lighter.
The distinction may seem persnickety, but it isn’t; it has profound implications. In one scenario, consciousness could be a side product, itself valueless, of a development that was actually useful (some neat brain process, perhaps), while the consciousness itself plays no functional role. One important implication of this would be that it would not be possible to identify consciousness based on behaviour, because it would not affect behaviour. This is the idea of epiphenomenalism—basically, that there is a process running in your brain that is actually what matters for your behaviour, but the process of its running also, on the side, leads to a subjective experience, which is itself irrelevant—just generated, the way that a locomotive produces steam. While epiphenomenalism leads you into absolutely absurd territory (zombies), there are a surprising number of scientists who have historically essentially subscribed to it, because it allows you to circumvent a bunch of hard questions. You can continue to imagine consciousness as a mysterious, unphysical thing that does not have to be translated into math, because it does not really exist on a physical level—you describe a physical process, and then at some point, you handwave.
However, epiphenomenalism is false. It falls prey to the self-stultification argument; the very fact that we are talking about consciousness implies that it is false. Because if consciousness has no function and is just a side effect that does not itself affect the brain, it cannot affect behaviour. But talking is behaviour, and we are talking intensely about a phenomenon that our brain, which controls the speaking, should then have zero awareness of.
Taking this seriously means concluding that consciousness is not produced by a brain process, the result or side effect of a brain process, but identical with particular kinds of neural/information processing. Which is one of those statements that it is easy to agree with (it seems an obvious choice for a physicalist), but when you try to actually understand it, you get a headache (or at least, I do.) Because it means you can never handwave. You can never have a process on one side, and then go “anyhow, and this leads to consciousness arising” as something separate, but it means that as you are studying the process, you are looking at consciousness itself, from the outside.
***
Northoff & Lamme, like a bunch of neuroscientists, avoid philosophical terminology like the plague, so as a philosopher wanting to use their work, you need to piece together yourself which phenomena they were working towards. Their essential position is that philosophers are people who muck around while avoiding the actual empirical work, and that associating with them is icky. This has the unfortunate consequence that their terminology is horribly vague—Lamme by himself uses “consciousness” for all sorts of stuff. I think that, as someone who works on visual processing, Lamme also dislikes the word “qualia” for a more justified reason—the idea that the building blocks of consciousness are individual subjective experiences like “red” is nonsense. Our conscious perception of a lily pond looks nothing like a Monet painting. We don’t consciously see the light colours that are hitting our retina as a separate kaleidoscope—we see the whole objects, in what we assume are colours corresponding to their surface properties, with additional information given on potential factors making the colour perception unreliable—itself the result of a long sequence of unconscious processing.
That said, he does place qualia in the correct context. A point he is making there is that neural theories that seem to disagree a lot are targeting different aspects of consciousness, but increasingly look like they can be slotted together into a coherent theory. E.g. Lamme’s ideas and global workspace have little in common, but they focus on different phases—a distinction that I think corresponds most closely to the distinction between phenomenal and access consciousness. I agree with you that the latter is better understood at this point than the former, though there are good reasons for that—it is incredibly hard to empirically distinguish between precursors of consciousness and the formation of consciousness prior to it being committed to short term memory, and introspective reports for verification start fucking everything up (because introspecting about the stimulus completely changes what is going on phenomenally and neurally), while no-report paradigms have other severe difficulties.
But we are still beginning to narrow down how it works—ineptly, sure, and a lot of it amounts to going “ah, this person no longer experiences x, and their brain is damaged in this particular fashion, so something about this damage must have interrupted the relevant process”, while other approaches essentially amount to putting people into controlled environments, showing them specifically varied stimuli, and scanning them to see what changes (with the difficulty that the resolution is terrible, and the further difficulty that people start thinking about other stuff during boring neuroscience experiments), but it is no longer a complete blackbox.
And I would say that Lamme does focus on the phenomenal aspect of things—like I said, not individual colours, but the subjective experience of vision, yes.
And we have also made progress on qualia (e.g. likely ruling out inverse qualia scenarios); see the work Kawakita et al. are doing, which is being discussed here on Less Wrong: https://www.lesswrong.com/posts/LYgJrBf6awsqFRCt3/is-red-for-gpt-4-the-same-as-red-for-you It’s part of a larger line of research that aims to record the psychophysical structure of colour qualia precisely enough to build phenomenal maps, and then to look for something correlated in the brain. That still leaves us unsure how and why you consciously see anything at all, but it is progress on why the particular thing you are seeing is green and not blue.
Honestly, my TL;DR is that saying that we know nothing about the structure of reality that constitutes consciousness is increasingly unfair in light of how much we do by now understand. We aren’t done, but we have made tangible progress on the question; we have fragments that are beginning to slot into place. Most importantly, we are moving away from “how this experience arises will be forever a mystery” to increasingly concrete, solvable questions. I think we started the way the ancient Greeks did—just pointing at what we saw, the “star” in the evening, and the “star” in the morning, not knowing what was causing that visual, the way we go “I subjectively experience x, but no idea why”—but then progressing to realising that they had the same origin, then that the origin was not in fact a star, etc. Starting with a perception, and then looking at its origin—but in our case, the origin we were interested in was not the object being perceived, but the process of perception.
Which videogame has NPCs that can genuinely pass the Turing test?
There is no known video game that has NPCs that can fully pass the Turing test as of yet, as it requires a level of artificial intelligence that has not been achieved.
The above text was written by ChatGPT, but you probably guessed that already. The prompt was exactly your question.
A more serious reply: Suppose you used one of the current LLMs to drive a videogame NPC. I’m sure game companies must be considering this. I’d be interested to know if any of them have made it work, for the sort of NPC whose role in the game is e.g. to give the player some helpful information in return for the player completing some mini-quest. The problem I anticipate is the pervasive lack of “definiteness” in ChatGPT. You have to fact-check and edit everything it says before it can be useful. Can the game developer be sure that the LLM acting without oversight will reliably perform its part in that PC-NPC interaction?
Something a bit like this has actually been done, with a proper scientific analysis, but without human players so far. (Or at least I am not aware of the latter, but I frankly can no longer keep up with all the applications.)
They (Park et al. 2023, https://arxiv.org/abs/2304.03442) populated a tiny, Sims-style world with ChatGPT-controlled AIs, enabled them to store a complete record of agent interactions in natural language, synthesise them into conclusions, and draw upon them to generate behaviours—and let them interact with each other. Not only did they not go off the rails—they performed daily routines, and improvised in a manner consistent with their character backstories when they ran into each other, eerily like in Westworld. It also illustrated another interesting point that Westworld had made—the strong impact of the ability to form memories on emergent, agentic behaviours.
The thing that stood out is that characters within the world managed to coordinate a party—come up with the idea that one should have one, where it would be, when it would be, inform each other that such a decision had been taken, invite each other, invite friends of friends—and that a bunch of them showed up in the correct location on time. The conversations they were having affected their actions appropriately. There is not just a complex map of human language that is self-referential; there are also references to another set of actions, in this case, navigating this tiny world. It does not yet tick the biological and philosophical boxes for characteristics that have us so interested in embodiment, but it definitely adds another layer.
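For anyone wondering what that setup amounts to in practice, here is a heavily simplified sketch. The class, the naive keyword retrieval, and the llm() stub are my own placeholders, not the authors’ implementation; it only shows the shape of the memory-stream, reflection, and action-generation loop described above.

```python
# Heavily simplified sketch of a generative-agent loop in the spirit of Park et al. 2023:
# a natural-language memory stream, periodic reflection into higher-level conclusions,
# and actions generated from retrieved memories. llm() is a stand-in for any chat-model
# call; retrieval is naive keyword overlap instead of the paper's scoring scheme.

def llm(prompt: str) -> str:
    # Placeholder for a real model call; returns a canned reply so the sketch runs.
    return "Walk over, say hello, and mention the get-together on Friday."

class Agent:
    def __init__(self, name: str, backstory: str):
        self.name = name
        self.memories = [backstory]        # everything is stored as plain text

    def retrieve(self, situation: str, k: int = 5) -> list:
        # Rank memories by crude word overlap with the current situation.
        words = set(situation.lower().split())
        return sorted(self.memories,
                      key=lambda m: len(words & set(m.lower().split())),
                      reverse=True)[:k]

    def reflect(self) -> None:
        # Periodically compress recent observations into higher-level conclusions.
        summary = llm("What high-level conclusions follow from these observations?\n"
                      + "\n".join(self.memories[-20:]))
        self.memories.append("Reflection: " + summary)

    def act(self, situation: str) -> str:
        context = "\n".join(self.retrieve(situation))
        action = llm(f"You are {self.name}. Relevant memories:\n{context}\n"
                     f"Current situation: {situation}\nWhat do you do next?")
        self.memories.append(f"Situation: {situation}. What I did: {action}")
        return action

agent = Agent("Klaus", "Klaus is a cafe regular who loves organising get-togethers.")
print(agent.act("A neighbour walks into the cafe and says good morning."))
```

The behaviour that stood out in the paper comes precisely from the parts this sketch waves away: the retrieval scoring, the reflection schedule, and letting many such agents share one simulated world and talk to each other.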
And then we have analysis of and generation of pictures, which, in turn, is also related to the linguistic maps. One thing that floored me was an example from a demo by OpenAI itself where ChatGPT was shown an image of a heavy object, I think a car, that had a bunch of balloons tied to it with string, balloons which were floating—probably filled with helium. It was given the picture and the question “what happens if the strings are cut” and correctly answered “the balloons would fly away”.
It was plausible to me that ChatGPT cannot possibly know what words mean when just trained on words alone. But the fact that we also have training on images, and actions, and they connect these appropriately… They may not have complete understanding (e.g. the distinction between completely hypothetical states, states that are assumed given within a play context, and states that are externally fixed, seems extremely fuzzy—unsurprising, insofar as ChatGPT has never had unfiltered interactions with the physical world, and was trained so extensively on fiction) but I find it increasingly unconvincing to speak of no understanding in light of this.
Character.ai used to have bots good enough to pass. (ChatGPT doesn’t pass, since it was finetuned and prompted to be a robotic assistant.)