In light of this, maybe the Bradford Hill criteria can be applied. For example, if we’re presented with the problem of a non-conscious AI agent that wants to convince us it is conscious, then we can likely reject its claims by applying the consistency criterion. We could secretly create other instances of the same AI agent, put them in modified environments (e.g. an environment where the motivation to lie about being conscious is removed), and then observe whether the claims of these instances are consistent.
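To make the shape of that check concrete, here is a minimal, purely illustrative Python sketch. Everything in it is hypothetical: AgentInstance, spawn_instance, and the placeholder answering behaviour stand in for however one would actually copy the agent and vary its incentives.

    # Illustrative sketch of the consistency check described above.
    # AgentInstance and spawn_instance are hypothetical stand-ins for copying
    # the AI agent into environments with different incentives.
    from dataclasses import dataclass

    @dataclass
    class AgentInstance:
        incentivized_to_claim_consciousness: bool

        def claims_to_be_conscious(self) -> bool:
            # Placeholder behaviour: a purely strategic agent's answer tracks
            # its incentives rather than any inner fact about itself.
            return self.incentivized_to_claim_consciousness

    def spawn_instance(incentivized: bool) -> AgentInstance:
        # Hypothetical: secretly instantiate a copy of the same agent in an
        # environment where the motivation to lie about consciousness is
        # present or absent.
        return AgentInstance(incentivized)

    def passes_consistency_check(n_copies: int = 5) -> bool:
        # Bradford Hill "consistency": the claim should not change when the
        # incentive to make it is removed.
        with_motive = {spawn_instance(True).claims_to_be_conscious() for _ in range(n_copies)}
        without_motive = {spawn_instance(False).claims_to_be_conscious() for _ in range(n_copies)}
        return with_motive == without_motive

    print(passes_consistency_check())  # False for this purely strategic placeholder agent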
This only solves half the problem. If the AI has no motivation to say that it is conscious, we have no reason to think that it will. We would then assume that both copies were non-conscious, simply because neither had any motivation to convince us otherwise.
I suppose what we need is a test under which an AI has motivation to declare that it is conscious iff it actually is conscious. Does anyone have any idea for how to actually design such a test?
Surely this is just a particular case of “we want the AI to figure out things about the world, and tell us those things truthfully”? If you can figure out how to get the AI to tell us whether investing more money into AI research is likely to lead to good outcomes, and not lie about this, the same method would work for getting it to tell us whether it’s conscious.
How would we define “conscious” in order to ask the question?
The same way we define it when asking other people?
Which is what, specifically?
I think we do it indexically. I use a word in context, and since you have a parallel experience in the same context, I never have to make clear exactly (at least in terms of intension) what I mean; you have the same experience and so can infer the label. Ask an automaton “do you have emotions?” and it may observe human use of the word “emotion”, conclude that “emotion” is an automatic behavioral response to conditions and changes in conditions that affect one’s utility function, and declare that, yes, it does have emotions. Yet, of course, this completely misses what we meant by emotion, which is a subjective quality of experience.
Can you make a being come to understand the concept of subjectivity if it doesn’t itself embody a subjective perspective?
Alternatively, if you asked me “What is red?” I could point to a stop sign, then to someone wearing a red shirt, then to a traffic light that happens to be red, to blood from where I accidentally cut myself, and to a red business card, and then I could call up a color wheel on my computer and move the cursor to the red area. This would probably be sufficient, though if you know what the word “No” means, the truly strict would insist that I also point to the sky and say “No.”
This only communicates if the person you are trying to explain “red” to can perceive color.
The problem is that my subjective experience of red is always accompanied by a particular range of wavelengths of light. Yet when I say the word “red”, I don’t mean the photons of that frequency; I mean the subjective experience that those photons cause. But since the one always accompanies the other, someone naive of color might think I mean the mathematical features of the waves reflected from the objects to which I’m pointing.
If you can’t express the question then you can’t be confident other people understand you either. Remember that some people just don’t have e.g. visual imagination, and don’t realise there’s anything unusual about them.
Now I’m wondering whether I’m conscious, in your sense. I mean, I feel emotion, but it seems to adequately correspond to your “automaton” version. I experience what I assume is consciousness, but it seems to me that that’s just how a sufficiently advanced self-monitoring system would feel from the inside.
Yes. I’m wondering if these disputes simply resolve to having different subjective experiences of what it means to be alive. In fact, maybe the mistake is assuming that p-zombies don’t exist. Maybe some humans are p-zombies!
However, “that’s just how a sufficiently advanced self-monitoring system would feel from the inside” seems like almost a contradiction in terms. Can a self-monitoring system become sufficiently advanced without feeling anything (just as my computer computes but, I suppose, doesn’t feel)?
I think not. But I think that makes it entirely unsurprising, obvious even, that a more advanced computer would feel.
If so, I want to know why.
Because it seems like the most plausible explanation for the fact that I feel, to the extent that I do. (It also explains the otherwise quite confusing result that, for many kinds of actions, our decision-making processes activate after we’ve acted, even though we feel like our decision determined the action.)
I don’t know what that second thing has to do with consciousness.