At this point, is there anything at all that AI could possibly do that would convince you of their sentience? No matter how demanding, how currently unfeasible and far away it may seem?
I find it impossible to say in advance, for the same reason that you find it difficult. We cannot place goalposts, because we do not know the territory. People talk about “agency”, “sapience”, “sentience”, “emotion”, and so forth, as if they knew what these words mean, in the way that we know what “water” means. But we do not. Everything that people say about these things is a description of what they feel like from within, not a description of how they work. These words have arisen from those inward sensations, our outward manifestations of them, and our reasonable supposition that other people, being created in the same manner as we were, are describing similar things with the same words. But we know nothing about the structure of reality by which these things are constituted, in the way that we do know far more about water than that it quenches “thirst”.
AIs are so far out of the training distribution by which we learned to use these words that I find it impossible to say what would constitute evidence that an AI is e.g. “sentient”. I do not know what that attribution would mean. I only know that I do not attribute any inner life or moral worth to any AI so far demonstrated. None of the chatbots yet rise beyond the level of a videogame NPC. DALL•E will respond to the prompt “electric sheep”, but does not dream of them.
I used to make the same point you made here—that none of the “definitions” of sentience we had were worth a damn, because if we counted this “there is something it feels like to be, you know” as a definition, we’d have to also accept that “the star-like thing I see in the morning” is an adequate definition of Venus. And I still think that while those are good starting points, calling them definitions is misleading.
But this absence of actual definitions is changing. We have moved beyond ignorance. Northoff & Lamme 2020 already made a pretty decent argument that our theories were beginning to converge, and that their components had gone far beyond just subjective qualia. If you look at things like the Francken et al. 2022 consciousness survey among researchers, you do see that we are beginning to agree on some specifics, such as evolutionary function. My other comment looks at ongoing research that is finally making empirical progress on ruling out inverse qualia, and on the hard problem of consciousness. This is not solved, but we are also no longer in a space where we can genuinely claim total cluelessness. It’s patchwork across multiple disciplines, yes, but when you take it together, which I do in my work, you begin to realise we have got further than one might think when focussing on just one aspect.
My main trouble is not that sentience is ineffable (it is not), but that our knowledge is based solely on biology, and it is fucking hard to figure out which of the rules we have observed are actual rules, and which are just correlations within biological systems that could be circumvented.
I take it that the papers you mention are this and this?
In the Francken survey, several of the questions seem to be about the definition of the word “consciousness” rather than about the phenomenon. A positive answer to the evolution question as stated is practically a tautology, and the consensus over “Mary” and “Explanatory gap” suggests that they think there is something there but that they still don’t know what.
I can only find the word “qualia” once in Northoff & Lamme, but not in a substantial way, so unless they’re using other language to talk about qualia, it seems like if anything, they are going around it rather than through. All the theories of consciousness I have seen, including those in Northoff & Lamme, have been like that: qualia end up being left out, when qualia were the very thing that was supposed to be explained.
For the ancient Greeks, “the star-like thing we see in the morning” (and in the evening—they knew back then that they were the same object) would be a perfectly good characterisation of Venus. We now know more about Venus, but there is no point in debating which of the many things we know about it is “the meaning” of the word “Venus”.
Yes, those are the papers.
On the survey: the claim that consciousness itself fulfils a function that evolution has selected for, while highly plausible, is not obvious, and has been disputed. A common argument against it runs via the fact that polar bear coats are heavy: one could ask whether evolution has selected for heaviness. Of course it has not (the weight is detrimental), but it has selected for a coat that keeps a polar bear warm in an incredibly cold environment, and the random search process failed to find a coat that was sufficiently warm but significantly lighter while also scoring high on other desirable aspects. In this case, the heaviness of the coat is a negative side consequence of a trait that was actually selected for, and we can conceive of coats that are warm but lighter.
The distinction may seem persnickety, but it isn’t; it has profound implications. In one scenario, consciousness could be a side product, itself valueless, of a development that was actually useful (some neat brain process, perhaps), while the consciousness itself plays no functional role. One important implication of this would be that it would not be possible to identify consciousness based on behaviour, because it would not affect behaviour. This is the idea of epiphenomenalism: basically, that there is a process running in your brain that is what actually matters for your behaviour, but its running also, on the side, produces a subjective experience, which is itself irrelevant, just generated, the way a locomotive produces steam. While epiphenomenalism leads you into absolutely absurd territory (zombies), a surprising number of scientists have historically, in essence, subscribed to it, because it allows you to circumvent a bunch of hard questions. You can continue to imagine consciousness as a mysterious, unphysical thing that does not have to be translated into math, because it does not really exist on a physical level: you describe a physical process, and then at some point, you handwave.
However, epiphenomenalism is false. It falls prey to the self-stultification argument; the very fact that we are talking about it implies that it is false. If consciousness has no function, but is just a side effect that does not itself affect the brain, then it cannot affect behaviour. But talking is behaviour, and we are talking intensely about a phenomenon that our brain, which controls the speaking, should have zero awareness of.
Taking this seriously means concluding that consciousness is not produced by a brain process, not the result or side effect of a brain process, but identical with particular kinds of neural/information processing. Which is one of those statements that is easy to agree with (it seems an obvious choice for a physicalist), but when you try to actually understand it, you get a headache (or at least, I do). Because it means you can never handwave. You can never describe a process on one side and then go “anyhow, and this leads to consciousness arising” as something separate; it means that as you are studying the process, you are looking at consciousness itself, from the outside.
***
Northoff & Lamme, like a bunch of neuroscientists, avoid philosophical terminology like the plague, so as a philosopher wanting to use their work, you have to piece together yourself which phenomena they are working towards. Their essential position is that philosophers are people who muck around while avoiding the actual empirical work, and that associating with them is icky. This has the unfortunate consequence that their terminology is horribly vague; Lamme by himself uses “consciousness” for all sorts of stuff. I think Lamme, as someone who works on visual processing, also dislikes the word “qualia” for a more justified reason: the idea that the building blocks of consciousness are individual subjective experiences like “red” is nonsense. Our conscious perception of a lily lake looks nothing like a Monet painting. We don’t consciously see the light colours hitting our retina as a separate kaleidoscope; we see whole objects, in what we assume are colours corresponding to their surface properties, with additional information about potential factors making the colour perception unreliable, all of it the result of a long sequence of unconscious processing.
That said, he does place qualia in the correct context. A point he makes there is that neural theories that seem to disagree a lot are targeting different aspects of consciousness, but increasingly look like they can be slotted together into a coherent theory. E.g. Lamme’s ideas and global workspace theory have little in common, but they focus on different phases, a distinction that I think corresponds most closely to the distinction between phenomenal and access consciousness. I agree with you that the latter is better understood at this point than the former, though there are good reasons for that: it is incredibly hard to empirically distinguish between precursors of consciousness and the formation of consciousness prior to it being committed to short-term memory, and introspective reports used for verification start fucking everything up (because introspecting about the stimulus completely changes what is going on phenomenally and neurally), while no-report paradigms have other severe difficulties.
But we are nonetheless beginning to narrow down how it works. Ineptly, sure: a lot of it amounts to going “ah, this person no longer experiences x, and their brain is damaged in this particular fashion, so something about this damage must have interrupted the relevant process”, while other work essentially amounts to putting people into controlled environments, showing them specifically varied stimuli, and scanning them to see what changes (with the difficulty that the resolution is terrible, and the further difficulty that people start thinking about other stuff during boring neuroscience experiments). But it is no longer a complete black box.
And I would say that Lamme does focus on the phenomenal aspect of things—like I said, not individual colours, but the subjective experience of vision, yes.
And we have also made progress on qualia (e.g. likely ruling out inverse-qualia scenarios); see the work Kawakita et al. are doing, which is being discussed here on Less Wrong: https://www.lesswrong.com/posts/LYgJrBf6awsqFRCt3/is-red-for-gpt-4-the-same-as-red-for-you It’s part of a larger line of research that aims to accurately write down the psychophysical structure of colour qualia in order to build phenomenal maps, and then to look for something correlated in the brain. That still leaves us unsure how and why you see anything at all consciously, but it is progress on why the particular thing you are seeing is green and not blue.
Honestly, my TL;DR is that saying we know nothing about the structure of reality that constitutes consciousness is increasingly unfair in light of how much we now understand. We aren’t done, but we have made tangible progress on the question; we have fragments that are beginning to slot into place. Most importantly, we are moving away from “how this experience arises will be forever a mystery” towards increasingly concrete, solvable questions. I think we started the way the ancient Greeks did, just pointing at what we saw, the “star” in the evening and the “star” in the morning, without knowing what was causing that appearance, the way we go “I subjectively experience x, but no idea why”, but then progressed to realising that the two had the same origin, then that the origin was not in fact a star, etc. Starting with a perception, and then looking at its origin, except that in our case, the origin we are interested in is not the object being perceived but the process of perception.
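To make the “phenomenal map” idea a bit more concrete, here is a toy sketch in Python. It is my own illustration under simplifying assumptions, not Kawakita et al.’s actual pipeline (which, as I understand it, uses unsupervised optimal-transport alignment of human and model similarity judgements); the point is just that a sufficiently rich similarity structure can pin down which colour is which from the structure alone, which is what makes inverse-qualia scenarios empirically addressable.

```python
# Toy sketch (not Kawakita et al.'s method): if an observer's similarity
# structure over colours is rich enough, colours can be matched across
# observers from the structure alone, without assuming in advance which
# label corresponds to which.
import numpy as np

rng = np.random.default_rng(0)
n_colours = 20

# Hypothetical shared perceptual space; each observer reports noisy similarities.
points = rng.normal(size=(n_colours, 3))
dist = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
sim_a = -dist + rng.normal(scale=0.02, size=dist.shape)  # observer A
sim_b = -dist + rng.normal(scale=0.02, size=dist.shape)  # observer B

# Shuffle B's colour labels, so only the internal structure can tell us
# which of B's colours corresponds to which of A's.
perm = rng.permutation(n_colours)
sim_b = sim_b[np.ix_(perm, perm)]

# Each colour's "signature" is its sorted profile of similarities to all
# colours, which is invariant to how the labels happen to be ordered.
sig_a = np.sort(sim_a, axis=1)
sig_b = np.sort(sim_b, axis=1)

# Greedily pair up colours whose signatures are closest.
assignment = np.full(n_colours, -1)
unused = set(range(n_colours))
for i in range(n_colours):
    j = min(unused, key=lambda j: np.linalg.norm(sig_a[i] - sig_b[j]))
    assignment[i] = j
    unused.remove(j)

# Row j of the shuffled matrix is really colour perm[j], so the match for
# colour i is correct when perm[assignment[i]] == i.
accuracy = np.mean(perm[assignment] == np.arange(n_colours))
print(f"recovered {accuracy:.0%} of the colour correspondences from structure alone")
```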
Which videogame has NPCs that can genuinely pass the Turing test?
There is no known video game that has NPCs that can fully pass the Turing test as of yet, as it requires a level of artificial intelligence that has not been achieved.
The above text was written by ChatGPT, but you probably guessed that already. The prompt was exactly your question.
A more serious reply: Suppose you used one of the current LLMs to drive a videogame NPC. I’m sure game companies must be considering this. I’d be interested to know if any of them have made it work, for the sort of NPC whose role in the game is e.g. to give the player some helpful information in return for the player completing some mini-quest. The problem I anticipate is the pervasive lack of “definiteness” in ChatGPT. You have to fact-check and edit everything it says before it can be useful. Can the game developer be sure that the LLM acting without oversight will reliably perform its part in that PC-NPC interaction?
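To make the worry concrete, here is a rough, purely illustrative sketch (not any shipped system) of how a developer might try to get that definiteness: the game logic decides what must be conveyed, the LLM only phrases it, and anything that fails validation falls back to canned dialogue. The `llm_complete` callable is a hypothetical stand-in for whichever chat-completion API is actually used.

```python
# Illustrative sketch: the game owns the facts, the LLM only supplies the
# phrasing, and a validation step guards the PC-NPC interaction.
from dataclasses import dataclass

@dataclass
class QuestState:
    mini_quest_done: bool
    reward_info: str  # the fact the NPC must hand over, authored by the game

FALLBACK = "Come back once you have dealt with the rats in my cellar."

def npc_reply(player_line: str, state: QuestState, llm_complete) -> str:
    if not state.mini_quest_done:
        required_fact = None
        instruction = "Politely refuse to help until the cellar rats are gone."
    else:
        required_fact = state.reward_info
        instruction = f"Thank the player and tell them, verbatim: '{required_fact}'"

    prompt = (
        "You are Mira, a terse innkeeper NPC. Stay in character, two sentences at most.\n"
        f"Instruction: {instruction}\n"
        f"Player says: {player_line}\n"
        "Mira says:"
    )
    reply = llm_complete(prompt)

    # Validate before showing it to the player: the game-critical fact must be
    # present word for word, and the reply must not ramble.
    ok = len(reply) < 300 and (required_fact is None or required_fact in reply)
    return reply if ok else (required_fact or FALLBACK)

# Example with a trivial stand-in model:
# npc_reply("Any work for me?", QuestState(False, "The key is under the mat."),
#           lambda prompt: "Not until my cellar is rat-free, stranger.")
```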
Something a bit like this has actually been done, with a proper scientific analysis, but without human players so far. (Or at least I am not aware of the latter, but I frankly can no longer keep up with all the applications.)
They (Park et al. 2023, https://arxiv.org/abs/2304.03442 ) populated a tiny, Sims-style world with ChatGPT-controlled agents, enabled them to store a complete record of their interactions in natural language, synthesise those records into conclusions, and draw upon them to generate behaviours, and then let them interact with each other. Not only did they not go off the rails, they performed daily routines and improvised in a manner consistent with their character backstories when they ran into each other, eerily like in Westworld. It also illustrated another interesting point that Westworld had made: the strong impact of the ability to form memories on emergent, agentic behaviours.
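For flavour, here is a stripped-down sketch of that memory loop as I read the paper (not their actual code; the retrieval scoring is greatly simplified and the `llm` callable is a hypothetical stand-in for the model): observations are stored as natural-language text, reflections are periodically synthesised from recent memories, and actions are generated from whichever stored memories look most relevant.

```python
# Simplified sketch of a generative agent's memory loop (illustration only).
import time

class Agent:
    def __init__(self, name, backstory, llm):
        self.name, self.backstory, self.llm = name, backstory, llm
        self.memories = []  # list of (timestamp, natural-language text)

    def observe(self, text):
        self.memories.append((time.time(), text))

    def reflect(self):
        # Periodically synthesise recent memories into a higher-level conclusion.
        recent = "\n".join(text for _, text in self.memories[-20:])
        summary = self.llm(f"What does {self.name} conclude from these events?\n{recent}")
        self.observe(f"Reflection: {summary}")

    def retrieve(self, query, k=5):
        # Toy relevance score: word overlap with the query, discounted by age.
        now = time.time()
        def score(memory):
            timestamp, text = memory
            overlap = len(set(query.lower().split()) & set(text.lower().split()))
            return overlap - 0.001 * (now - timestamp)
        return [text for _, text in sorted(self.memories, key=score, reverse=True)[:k]]

    def act(self, situation):
        # Generate the next behaviour from the backstory plus retrieved memories.
        context = "\n".join(self.retrieve(situation))
        return self.llm(
            f"{self.name}: {self.backstory}\nRelevant memories:\n{context}\n"
            f"Situation: {situation}\nWhat does {self.name} do next?"
        )
```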
The thing that stood out is that characters within the world managed to coordinate a party: come up with the idea that there should be one, decide where and when it would be, inform each other that such a decision had been taken, invite each other, invite friends of friends, and a bunch of them actually showed up in the correct location on time. The conversations they were having affected their actions appropriately. There is not just a complex, self-referential map of human language; there are also references to another set of actions, in this case navigating this tiny world. It does not yet tick the biological and philosophical boxes for the characteristics that have us so interested in embodiment, but it definitely adds another layer.
And then we have analysis of and generation of pictures, which, in turn, is also related to the linguistic maps. One thing that floored me was an example from a demo by OpenAI itself where ChatGPT was shown an image of a heavy object, I think a car, that had a bunch of balloons tied to it with string, balloons which were floating—probably filled with helium. It was given the picture and the question “what happens if the strings are cut” and correctly answered “the balloons would fly away”.
It used to seem plausible to me that ChatGPT could not possibly know what words mean when trained on words alone. But the fact that we also have training on images, and on actions, and that the models connect these appropriately… They may not have complete understanding (e.g. the distinction between completely hypothetical states, states assumed as given within a play context, and states that are externally fixed seems extremely fuzzy, which is unsurprising insofar as ChatGPT has never had unfiltered interactions with the physical world and was trained so extensively on fiction), but I find it increasingly unconvincing to speak of no understanding in light of this.
Character.ai used to have bots good enough to pass. (ChatGPT doesn’t pass, since it was fine-tuned and prompted to be a robotic assistant.)