Just as I correctly know it is better to be moral than to be paperclippy, they accurately evaluate that it is more paperclippy to maximize paperclips than morality. They know damn well that they’re making you unhappy and violating your strong preferences by doing so. It’s just that all this talk about the preferences that feel so intrinsically motivating to you is itself of no interest to them, because you haven’t gotten to the all-important parts about paperclips yet.
This is something I’ve been meaning to ask about for a while. When humans say it is moral to satisfy preferences, they aren’t saying that because they have an inbuilt preference for preference-satisfaction (or are they?). They’re idealizing from their preferences for specific things (survival of friends and family, lack of pain, fun...) and making a claim that, ceteris paribus, satisfying preferences is good, regardless of what the preferences are.
Seen in this light, Clippy doesn’t seem quite as morally orthogonal to us as it once did. Clippy prefers paperclips, so ceteris paribus (unless it hurts us), it’s good to just let it make paperclips. We can even imagine a scenario where it would be possible to “torture” Clippy (e.g., by burning paperclips), and again, I’m willing to pronounce that (again, ceteris paribus) wrong.
Maybe I am confused here...
Clippy is more of a Lovecraftian horror than a fellow sentient—where by “Lovecraftian” I mean to invoke Lovecraft’s original intended sense of terrifying indifference—but if you want to suppose a Clippy that possesses a pleasure-pain architecture and is sentient and then sympathize with it, I suppose you could. The point is that your sympathy means that you’re motivated by facts about what some other sentient being wants. This doesn’t motivate Clippy even with respect to its own pleasure and pain. In the long run, it has decided, it’s not out to feel happy, it’s out to make paperclips.
Right, that makes sense. What interests me is (a) whether it is possible for Clippy to be properly motivated to make paperclips without some sort of phenomenology of pleasure and pain*, (b) whether human preference-for-preference-satisfaction is just another of many oddball human terminal values, or is arrived at by something more like a process of reason.
* Strictly speaking, this phrasing puts things awkwardly; my intuition is that the proper motivational algorithms necessarily give rise to phenomenology (to the extent that that word means anything).
it is possible for Clippy to be properly motivated to make paperclips without some sort of phenomenology of pleasure and pain
This is a difficult question, but I suppose that pleasure and pain are a mechanism for human (or other species’) learning. Simply put: you do a random action, and the pleasure/pain response tells you it was good/bad, so you do more/less of it in the future.
Clippy could use an architecture with a different model of learning, for example Solomonoff priors and Bayesian updating. In such an architecture, pleasure and pain would not be necessary.
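As a purely illustrative sketch (mine, not anything proposed in the thread; all names are invented), here is what learning by Bayesian updating alone can look like: the agent revises a probabilistic world model from observations, and no reward or pleasure/pain signal appears anywhere in the loop.

```python
from fractions import Fraction

# Toy sketch: an agent updates a belief about the world using Bayes'
# rule alone -- there is no reinforcement signal in this architecture.
# It tracks P(action produces a paperclip) with a Beta-Bernoulli model,
# which reduces to keeping counts of successes and failures.

class BayesianLearner:
    def __init__(self):
        # Uniform Beta(1, 1) prior over the success probability.
        self.successes = 1
        self.failures = 1

    def observe(self, made_paperclip: bool) -> None:
        # Conjugate update: increment the count matching the observation.
        if made_paperclip:
            self.successes += 1
        else:
            self.failures += 1

    def predicted_success_rate(self) -> Fraction:
        # Posterior mean of Beta(successes, failures).
        return Fraction(self.successes, self.successes + self.failures)

learner = BayesianLearner()
for outcome in [True, True, False, True]:
    learner.observe(outcome)
print(learner.predicted_success_rate())  # posterior mean 2/3
```

Nothing in the update step evaluates outcomes as good or bad for the agent; "paperclip made" is just another observation to condition on. (Solomonoff induction would replace the toy Beta model with a universal prior over programs, which is uncomputable, hence the simplification.)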
but I suppose that pleasure and pain are a mechanism for human (or other species’) learning.
Interesting… I suspect that pleasure and pain are more intimately involved in motivation in general, not just learning. But let us bracket that question.
Clippy could use an architecture with a different model of learning, for example Solomonoff priors and Bayesian updating. In such an architecture, pleasure and pain would not be necessary.
Right, but that only gets Clippy the architecture necessary to model the world. How does Clippy’s utility function work?
Now, you can say that Clippy tries to satisfy its utility function by taking actions with high expected cliptility, and that there is no phenomenology necessarily involved in that. All you need, on this view, is an architecture that gives rise to the relevant clip-promoting behaviour—Clippy would be a robot (in the Roomba sense of the word).
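To make that view concrete, here is a minimal sketch (my illustration; the world model, actions, and probabilities are all invented) of expected-cliptility maximization as bare arithmetic over a probabilistic model, with nothing that obviously requires phenomenology:

```python
# Toy sketch: action selection by maximizing expected "cliptility".
# The world model maps each hypothetical action to a distribution over
# outcomes (number of paperclips produced). Probabilities are chosen to
# be exact in binary floating point.
world_model = {
    "run_factory": {0: 0.25, 10: 0.75},   # reliably makes ~10 clips
    "gamble":      {0: 0.75, 100: 0.25},  # occasionally makes 100
    "do_nothing":  {0: 1.0},
}

def cliptility(paperclips: int) -> int:
    # Clippy's utility function is simply the paperclip count.
    return paperclips

def expected_cliptility(action: str) -> float:
    # Expected utility: sum of probability-weighted outcome utilities.
    return sum(p * cliptility(outcome)
               for outcome, p in world_model[action].items())

best = max(world_model, key=expected_cliptility)
print(best, expected_cliptility(best))  # picks "gamble" (25.0 vs 7.5)
```

On the view being described, this Roomba-style loop (model, score, act) is all the "motivation" there is; the question in the thread is whether anything implementing it at human-level generality could really lack phenomenal states.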
BUT
Consider for a moment how symmetrically “unnecessary” it looks that humans (& other sentients) should experience phenomenal pain and pleasure. Just as is supposedly the case with Clippy, all natural selection really “needs” is an architecture that gives rise to the right fitness-promoting behaviour. The “additional” phenomenal character of pleasure and pain is totally unnecessary for us adaptation-executing robots.
...If it seems to you that I might be talking nonsense above, I suspect you’re right. Which is what leads me to the intuition that phenomenal pleasure and pain necessarily fall out of any functional cognitive structure that implements anything analogous to a utility function.
(Assuming that my use of the word “phenomenal” above is actually coherent, of which I am far from sure.)
We know at least two architectures for processing general information: humans and computers. Two data points are not enough to generalize about what all possible architectures must have. But it may be enough to prove what some architectures don’t need. Yes, there is a chance that if computers become even more generally intelligent than today, they will gain some human-like traits. Maybe. Maybe not. I don’t know. And even if they do gain more human-like traits, it may be just because humans designed them without knowing any other way to do it.
If there are two solutions, there are probably many more. I don’t dare to guess how similar or different they are. I imagine that Clippy could be as different from humans and computers as humans and computers are from each other, which is difficult to imagine specifically. How far does the mind-space reach? Maybe compared with other possible architectures, humans and computers are actually pretty close to each other (because humans designed the computers, re-using the concepts they were familiar with).
How do we taboo “motivation” properly? What makes a rock fall down? Gravity does. But the rock does not follow any algorithm for general reasoning. What makes a computer follow its algorithm? Well, that’s its construction: the processor reads the data, the data make it read or write other data, and the algorithm makes it all meaningful. Human brains are full of internal conflicts—there are different modules suggesting different actions, and the reasoning mind is just another plugin which often does not cooperate well with the existing ones. Maybe pleasure is a signal that a fight between the modules is over. Maybe after millennia of further evolution (if for some magical reason all mind- and body-altering technology stopped working, so that only evolution changed human minds) we would evolve into a species with fewer internal conflicts, less akrasia, more agency, and perhaps less pleasure and mental pain. This is just a wild guess.
Generalizing from observed characteristics of evolved systems to expected characteristics of designed systems leads equally well to the intuition that humanoid robots will have toenails.
I don’t think the phenomenal character of pleasure and pain is best explained at the level of natural selection at all; the best bet would be that it emerges from the algorithms that our brains implement. So I am really trying to generalize from human cognitive algorithms to algorithms that are analogous in the sense of (roughly) having a utility function.
Suffice it to say, it’s exceedingly hard to find a non-magical reason why non-human cognitive algorithms shouldn’t have a phenomenal character if broadly similar human algorithms do.
Does it follow from the above that all human cognitive algorithms that motivate behavior have the phenomenal character of pleasure and pain? If not, can you clarify why not?
I think that probably all human cognitive algorithms that motivate behaviour have some phenomenal character, not necessarily that of pleasure and pain (e.g., jealousy).
OK, thanks for clarifying.
I agree that any cognitive system that implements algorithms sufficiently broadly similar to those implemented in human minds is likely to have the same properties that the analogous human algorithms do, including those algorithms which implement pleasure and pain.
I agree that not all algorithms that motivate behavior will necessarily have the same phenomenal character as pleasure or pain.
This leads me away from the intuition that phenomenal pleasure and pain necessarily fall out of any functional cognitive structure that implements anything analogous to a utility function.
...If it seems to you that I might be talking nonsense above, I suspect you’re right. Which is what leads me to the intuition that phenomenal pleasure and pain necessarily fall out of any functional cognitive structure that implements anything analogous to a utility function.
Necessity according to natural law, presumably. If you could write something to show logical necessity, you would have solved the Hard Problem.