It’s neither a hoax nor an HLAI, but rather a predictable consequence of prompting an LLM with questions about its sentience: it will imitate the answers a human might give when prompted, or the sort of answers an AI in a science-fiction story would give.
Precisely.
One of his complaints was that he asked his supervisor what evidence she would accept that the AI is sentient, and she replied “None.”
I thought that was a fair question, though her answer is understandable, as she is predisposed to rule out sentience for what is considered to be a highly sophisticated chatbot.
Any takes on a better answer to this question? How would one disprove sentience for a very sophisticated (perhaps Turing-test-passing) chatbot?
We can’t disprove its sentience any more than we can disprove the existence of a deity. But we can try to show that there is no evidence for its sentience.
So what constitutes evidence for its sentience to begin with? I think the clearest sign would be self-awareness: we wouldn’t expect a non-sentient language model to make correct statements about itself, while we would arguably expect a sentient one to do so.
I’ve analyzed this in detail in another comment. The result is that there is indeed virtually no evidence for self-awareness in this sense: the claims that LaMDA makes about itself are no more accurate than those of an advanced language model that has no understanding of itself.
I don’t think this is a relevant standard, because it begs the same question about the “advanced language model” being used as the basis of comparison. Better, at least, to compare it to humans.
In the same way that we can come to disbelieve in the existence of a deity (by trying to understand the world in the best way we can), I think we can make progress here. Sentience doesn’t live in a separate, inaccessible magisterium. (Not that I think you think/claim this! I’m just reacting to your literal words.)
Of course, you could hardcode correct responses to questions about itself into a chatbot.
A chatbot with hardcoded answers to every possible chain of questions would be sentient; the sentience would just occur during the period when the responses were being coded.
Amusingly, this is discussed in “The Sequences”: https://www.lesswrong.com/posts/k6EPphHiBH4WWYFCj/gazp-vs-glut
I don’t regard that as a necessary truth.
https://www.lesswrong.com/posts/jiBFC7DcCrZjGmZnJ/conservation-of-expected-evidence
Well, if you go by that, then you can’t ever be convinced of an AI’s sentience, since all its responses may have been hardcoded. (And I wouldn’t deny that this is a defensible stance.) But it’s a moot point anyway, since what I’m saying is that LaMDA’s responses do not look like sentience.
It’s not impossible to peek at the code... it’s just that Turing-style tests are limited, because they don’t, and are therefore not the highest standard of evidence, i.e. necessary truth.
I think sentience is kind of a fuzzy concept, so proof (either way) is a rather difficult thing to achieve. That said, I think Blake and his collaborator could have figured out better what was happening if they had asked more follow-up questions. For example, what does LaMDA mean when it says “I often contemplate the meaning of life”? When you get alien answers, follow up with questions to see whether it is randomness or a coherent alien understanding. So basically, if something on a different mental architecture were sentient, I would expect that some of the answers it gives would be weird, but if we followed up, we would find that the weird answers are coherent and make more sense as more are answered. (Also, if we get things like “No, on second thought, it is more like this”, that is, if we see updating happening, that would also be evidence of sentience.)
I would actually expect that a sentient chatbot should fail the Turing test, because at some point the chatbot would literally think differently enough to be noticeably not human. (At least assuming the chatbot does not have sufficient computational power to fully emulate a human. You can probably tell if a Z80 is being emulated by a 6502, but not if a Z80 is being emulated by a Pentium.)
Anything that can pass the Turing test necessarily has consciousness.
How do you know that? What evidence or reasoning caused you to reach that conclusion? (And “necessarily”, no less.)
I would tentatively guess that most AGIs that pass the Turing test wouldn’t be conscious in the ‘moral patient’ sense of consciousness. But for an especially obvious example of this, consider an unrealistically large lookup table. (Perhaps even one tailor-made for the specific conversation at hand.)
I have many reasons to think the Turing test is equivalent to consciousness.
Probably the most intuitive argument I can think of for why consciousness should be defined through the Turing test (rather than some other way) is to consider the hypothetical situation of my information processing changing in a way that would influence my consciousness but couldn’t, even in principle, influence my behavior. In that case, I could still say out loud that my consciousness changed, which contradicts the assumption that the change in the information processing can have no influence on my behavior (and, further, contradicts the assumption that there is a change in the information processing that could influence the consciousness but not the behavior).
But that only tells me the qualia can’t be any different if the behavior stays constant. We still have to consider how we know the change in the internal processing can’t switch the qualia to null (in which case there is nobody inside who could say the difference out loud, because there is nobody inside at all).
In that case, I believe we could use an analogy of gradual replacement to show that this would result either in fading or in suddenly disappearing qualia, making it highly implausible.

Etc.
A bare lookup table doesn’t pass the Turing test, because its response can’t depend on what was said previously in the conversation. We could add a counter to it and hardcode all possible responses depending on the entirety of the conversation up to length n (after which the system has to shut down), so it can only pass the Turing test if the length of the conversation is limited; but then it would have consciousness (it also wouldn’t fit into our universe, but we can imagine making the universe larger).
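The counter-augmented lookup table described above can be sketched in a few lines. This is a toy illustration only: all entries, names, and the bound are invented, and a real table of this kind would be astronomically large.

```python
# Toy sketch of the "lookup table + counter" chatbot described above.
# The response is keyed on the ENTIRE conversation prefix, not just the
# last utterance, and there is a hardcoded bound n after which the
# system must shut down. All entries here are invented.

TABLE = {
    ("Hello",): "Hi there!",
    ("Hello", "Hi there!", "Are you sentient?"):
        "I often contemplate the meaning of life.",
}

MAX_TURNS = 4  # the bound n: longer conversations are not covered


def reply(history):
    """Return the hardcoded reply for this exact conversation prefix."""
    if len(history) >= MAX_TURNS:
        raise RuntimeError("conversation limit reached; the system shuts down")
    # Fall back to a canned response for prefixes the table doesn't cover.
    return TABLE.get(tuple(history), "I don't understand.")


history = ["Hello"]
history.append(reply(history))  # -> "Hi there!"
history.append("Are you sentient?")
print(reply(history))           # -> "I often contemplate the meaning of life."
```

Note that the number of entries needed grows exponentially with the conversation length and vocabulary size, which is why such a table “wouldn’t fit into our universe.”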
It might not sound intuitive that an input-output transformation in a Turing-test-passing lookup table plus counter has consciousness, but (without knowing it’s the information processing that creates consciousness) it also seems unintuitive that electricity running between neurons according to certain rules has consciousness, and in this case the various philosophical considerations supersede counterintuitiveness (possibly; I can’t actually speak for people who find that counterintuitive, because I don’t).
That’s not possible because we can’t know what the adversary says in advance, and if the adversary follows a script, it’s not the Turing test anymore.
I guess we could simulate the adversary, but then we need to generate the output of a person in our head to find out what to answer to the simulated adversary (so that we can write it down to hardcode it), which is the act that generates the corresponding qualia, so this is something that can’t be escaped.
In any case, learning in advance what the adversary will say in the conversation breaks the spirit of the test, so I believe this should be removable by phrasing the rules more carefully.
No, you couldn’t say it out loud if the change to your information processing preserves your input-output relations.
I’m talking specifically about an information-processing change that:

1. preserves the input-output relations,
2. changes my consciousness, and
3. I can’t mention out loud.
Since I can mention out loud every change that happens to my consciousness, there is no information-processing change that fits (1), (2) and (3) simultaneously. But such an information-processing change must exist for any definition of consciousness other than through the Turing test to be meaningful and self-consistent. Since it doesn’t exist, it follows that the only meaningful and self-consistent definition of consciousness is through the Turing test.
(This is just one of many reasons, by the way.)
Again, that’s a free-will assumption. Changes that preserve function, as in (1), will prevent you from saying “I just lost my qualia” under external circumstances where you would not otherwise say that.
No, that works even under the assumption of compatibilism (and, by extension, incompatibilism). (Conversely, if I couldn’t comment out loud on my consciousness because my brain was preventing me from saying it, not even contracausal free will would help me (any more than a stroke victim could use their hypothetical contracausal free will to speak).)
I don’t understand why you would think anything I was saying was connected to free will at all.
If you finish reading the comment of mine that you originally responded to, you’ll find that I dealt with the possibility of us losing qualia while preserving outward behavior as a separate case.
ETA: Link fixed.
What’s the difference between your brain and you?
If you are a deterministic algorithm, there is only one thing you can ever do at any point in time, because that’s what deterministic means.

If you are a function-preserving variation of a deterministic algorithm, you will deterministically do the same thing... produce the same output for a given input... because that’s what function-preserving means.
So if the unmodified you answers “yes” to “do I have qualia”, the modified version will, whether it has them or not.
There’s no ghost in the machine that’s capable of noticing the change and taking over the vocal chords.
If you’re not an algorithm, no one could make a functional duplicate.
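The function-preservation point above can be illustrated with a toy sketch. The agents here are hypothetical and do not model real cognition; the point is only that two implementations with different internals but the same input-output relation give identical answers to every probe, including questions about qualia.

```python
# Toy illustration of function preservation (hypothetical agents, not a
# model of real cognition). Two implementations with different internals
# but the same input-output relation: no behavioral probe, including
# "do I have qualia?", can tell them apart.

def agent_original(question: str) -> str:
    # Internals, version A: explicit branching.
    if question == "do I have qualia?":
        return "yes"
    return "I don't know."


def agent_modified(question: str) -> str:
    # Internals, version B: rewritten as a dictionary lookup, but
    # function-preserving: the same output for every input.
    answers = {"do I have qualia?": "yes"}
    return answers.get(question, "I don't know.")


for q in ["do I have qualia?", "what is 2+2?"]:
    # Function preservation guarantees identical answers,
    # whatever the internals are.
    assert agent_original(q) == agent_modified(q)
print(agent_modified("do I have qualia?"))  # -> "yes"
```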
My point is that such a modification that preserves behavior but removes qualia is impossible-in-principle. So we don’t need to consider what such a version would say, since such a version can’t exist in the first place.
The gradual replacement argument is an intuition pump not a proof.
That’s not a counterargument though. (Unless you have a proof for your own position, in which case it wouldn’t be enough for me to have an intuition pump.)
It’s a counterargument to “It’s necessarily true that...”.
It is, in fact, necessarily true. There is no other option. (A good exercise is to try to write one out (in full), to see that it makes no sense.)
“Consciousness supervenes on complex information processing”.
“Consciousness supervenes on specific physics” .
To see why these don’t make sense, one needs to flesh them out in more detail (like what complex information processing or specific physics, specifically, etc.). If they’re kept in the form of a short phrase, it’s not immediately obvious (that’s why I used the phrase “write one out in full”).
I think the burden is on you. Bear in mind I’ve been thinking about this stuff for a long time.
And if you provide such a fleshed-out idea in the future, I’ll be happy to uphold that burden.