I think the experiment would demonstrate neither the existence nor the non-existence of consciousness in an AI system.
I. Failure to show existence of consciousness
Consciousness is a quale of self-awareness. Currently we have no theory explaining Qualia, so we cannot claim or verify that we have removed all qualia-influenced pieces of information from a given body of knowledge, even if we remove all information suggesting self-referential generalizations.
Even if we somehow managed to acquire a qualia-free body of knowledge, when an AI claims in astonishment that it thinks it has all sorts of Qualia (including Consciousness), we wouldn’t know whether that is an artifact of the randomness in our learning processes and weight initialization or whether this particular AI truly has consciousness.
We need to be extra careful here, since a misaligned AI could claim it has consciousness in order to receive benefits, such as human rights.
It feels like if we build sufficiently many AI systems, they would make all sorts of extraordinary claims, some of which may look like signs of Qualia to us.
Also, if we see the AI making self-referential claims, we might only be able to conclude that it has successfully generalized self-awareness, not that it has a quale of self-awareness. Modern LLMs are already capable of self-referential generalizations.
II. Failure to show non-existence of consciousness
We don’t know how to express Qualia in terms of information, and we suspect it might be impossible. With our current level of knowledge, you and I cannot exchange any information that would let us conclude that we both experience the color red and the color blue the same way, and not swapped. If we fail to see any signs of Qualia in an AI’s generalizations, it might simply be because the AI’s Qualia are vastly different from ours.
I.e., it might feel different for an AI to be self-aware than it does for us, and so we fail to recognize its Consciousness.