“Qualia” is pretty ill-defined, if you try to define it you get things like “compressing sense-data” or “doing meta-cognition” or “having lots of integrated knowledge” or something similar, and these are convergent instrumental goals.
If you try to define qualia without having any darned idea of what they are, you’ll take wild stabs into the dark, and hit simple targets that are convergently instrumental; and if you are at all sensible of your confusion, you will contemplate these simple-sounding definitions and find that none of them particularly make you feel less confused about the mysterious redness of red, unless you bully your brain into thinking that it’s less confused or just don’t know what it would feel like to be less confused. You should in this case trust your sense, if you can find it, that you’re still confused, and not believe that any of these instrumentally convergent things are qualia.
I don’t know how everyone else on LessWrong feels but I at least am getting really tired of you smugly dismissing others’ attempts at moral reductionism wrt qualia by claiming deep philosophical insight you’ve given outside observers very little reason to believe you have. In particular, I suspect if you’d spent half the energy on writing up these insights that you’ve spent using the claim to them as a cudgel you would have at least published enough of a teaser for your claims to be credible.
But here Yudkowsky gave a specific model for how qualia, and other things in the reference class “stuff that’s pointing at something but we’re confused about what”, are mistaken for convergently instrumental stuff. (Namely: pointers point both to what they’re really trying to point to, but also somewhat point at simple things, and simple things tend to be convergently instrumental.) It’s not a reduction of qualia, and a successful reduction of qualia would be much better evidence that an unsuccessful reduction of qualia is unsuccessful, but it’s still a logically relevant argument and a useful model.
I’d love to read an EY-writeup of his model of consciousness, but I don’t see Eliezer invoking ‘I have a secret model of intelligence’ in this particular comment. I don’t feel like I have a gears-level understanding of what consciousness is, but in response to ‘qualia must be convergently instrumental because they probably involve one or more of (Jessica’s list)’, these strike me as perfectly good rejoinders even if I assume that neither I nor anyone else in the conversation has a model of consciousness:
Positing that qualia involve those things doesn’t get rid of the confusion re qualia.
Positing that qualia involve only simple mechanisms that solve simple problems (hence more likely to be convergently instrumental) is a predictable bias of early wrong guesses about the nature of qualia, because the simple ideas are likely to come to mind first, and will seem more appealing when less of our map (with the attendant messiness and convolutedness of reality) is filled in.
E.g., maybe humans have qualia because of something specific about how we evolved to model other minds. In that case, I wouldn’t start with a strong prior that qualia are convergently instrumental (even among mind designs developed under selection pressure to understand humans). Because there are lots of idiosyncratic things about how humans do other-mind-modeling and reflection (e.g., the tendency to feel sad yourself when you think about a sad person) that are unlikely to be mirrored in superintelligent AI.
Eliezer clearly is implying he has a ‘secret model of qualia’ in another comment:
I am just plain skeptical that there is a real values difference that would survive their learning what I know about how minds and qualia work. I of course fully expect that these people will loudly proclaim that I could not possibly know anything they don’t, despite their own confusion about these matters that they lack the skill to reflect on as confusion, and for them to exchange some wise smiles about those silly people who think that people disagree because of mistakes rather than values differences.
Regarding the rejoinders, although I agree Jessica’s comment doesn’t give us convincing proof that qualia are instrumentally convergent, I think it does give us reason to assign non-negligible probability to that being the case, absent convincing counterarguments. Like, just intuitively—we have e.g. feelings of pleasure and pain, and we also have evolved drives leading us to avoid or seek certain things, and it sure feels like those feelings of pleasure/pain are key components of the avoidance/seeking system. Yes, this could be defeated by a convincing theory of consciousness, but none has been offered, so I think it’s rational to continue assigning a reasonably high probability to qualia being convergent. Generally speaking this point seems like a huge gap in the “AI has likely expected value 0” argument so it would be great if Eliezer could write up his thoughts here.
Eliezer has said tons of times that he has a model of qualia he hasn’t written up. That’s why I said:
I’d love to read an EY-writeup of his model of consciousness, but I don’t see Eliezer invoking ‘I have a secret model of intelligence’ in this particular comment.
The model is real, but I found it weird to reply to that specific comment asking for it, because I don’t think the arguments in that comment rely at all on having a reductive model of qualia.
I think it does give us reason to assign non-negligible probability to that being the case, absent convincing counterarguments.
I started writing a reply to this, but then I realized I’m confused about what Eliezer meant by “Not sure there’s anybody there to see it. Definitely nobody there to be happy about it or appreciate it. I don’t consider that particularly worthwhile.”
He’s written a decent amount about ensuring AI is nonsentient as a research goal, so I guess he’s mapping “sentience” on to “anybody there to see it” (which he thinks is at least plausible for random AGIs, but not a big source of value on its own), and mapping “anybody there to be happy about it or appreciate it” on to human emotions (which he thinks are definitely not going to spontaneously emerge in random AGIs).
I agree that it’s not so-unlikely-as-to-be-negligible that a random AGI might have positively morally valenced (relative to human values) reactions to a lot of the things it computes, even if the positively-morally-valenced thingies aren’t “pleasure”, “curiosity”, etc. in a human sense.
Though I think the reason I believe that doesn’t route through your or Jessica’s arguments; it’s just a simple ‘humans have property X, and I don’t understand what X is or why it showed up in humans, so it’s hard to reach extreme confidence that it won’t show up in AGIs’.
I expect the qualia a paperclip maximizer has, if it has any, to be different enough from human qualia that they don’t capture what I value particularly well.
“Qualia” is pretty ill-defined, if you try to define it you get things like “compressing sense-data” or “doing meta-cognition” or “having lots of integrated knowledge” or something similar, and these are convergent instrumental goals.
None of those are definitions of qualia with any currency. Some of them sound like extant theories of consciousness (not necessarily phenomenal consciousness).
“Qualia” lacks a functional definition, but there is no reason why it should have one, since functionalism in all things is not an a priori necessary truth. Indeed, the existence of stubbornly non-functional thingies could be taken as a disproof of functionalism, if you have a taste for basing theories on evidence.
Are you saying it has a non-functional definition? What might that be, and would it allow for zombies? If it doesn’t have a definition, how is it semantically meaningful?
It has a standard definition which you can look up in standard reference works.
It’s unreasonable to expect a definition to answer every possible question by itself.