My first recommendation is to get to the bottom of what question you are actually asking. What are you actually trying to do? Do the right thing? Learn how to manipulate people? Learn how to torture? Become a pleasure delivery professional?
See disguised queries.
(1) What are the necessary and sufficient properties for a thought to be pleasurable?
It feels good? It would take some pretty heavy neuroscience to say anything beyond that. Again, what are you going to do with the answer to this question? Ask that question instead.
Also note that “necessary and sufficient” is an obsolete model of concepts. See A Human’s Guide to Words.
(2) What are the characteristic mathematics of a painful thought?
What does this mean? How do I calculate exactly how much pain someone will experience if I punch them? Again, ask the real question.
(3) If we wanted to create an artificial neural network-based mind (i.e., using neurons, but not slavishly patterned after a mammalian brain) that could experience bliss, what would the important design parameters be?
Um. Why would you want to do that? Is this simply a hypothetical to see if we understand the concept?
It really depends on what aspect you are interested in; you could create “pleasure” and “pain” by hacking up some kind of simple reinforcement learner, and I suppose you could shoehorn that into a neural network if you really wanted to. But why?
Note that a simple reinforcement learner “experiences” “pain” and “pleasure” in some sense, but not in the morally relevant sense. You will find that the moral aspect is much more anthropomorphic and much more complex, I think.
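For concreteness, here is a minimal sketch of the kind of “simple reinforcement learner” meant here (the two actions and their payoff probabilities are invented purely for illustration):

/* Minimal illustrative sketch: a two-armed bandit learner whose entire
 * notion of "pleasure"/"pain" is a scalar reward. All numbers are made up. */
#include <stdio.h>
#include <stdlib.h>

#define ACTIONS 2
#define STEPS   1000

/* Hypothetical environment: action 1 "feels good" more often than action 0. */
static double reward_for(int action) {
    double p = (action == 1) ? 0.8 : 0.3;          /* invented payoff probabilities */
    return ((double)rand() / RAND_MAX < p) ? 1.0    /* "pleasure" */
                                           : -1.0;  /* "pain" */
}

int main(void) {
    double value[ACTIONS] = {0.0, 0.0};  /* learned action values */
    double alpha = 0.1;                  /* learning rate */
    double epsilon = 0.1;                /* exploration rate */

    for (int t = 0; t < STEPS; t++) {
        int a;
        /* epsilon-greedy: mostly exploit the better-looking action, sometimes explore */
        if ((double)rand() / RAND_MAX < epsilon)
            a = rand() % ACTIONS;
        else
            a = (value[1] > value[0]) ? 1 : 0;

        double r = reward_for(a);
        value[a] += alpha * (r - value[a]);  /* nudge the estimate toward the reward */
    }

    printf("learned action values: %.2f %.2f\n", value[0], value[1]);
    return 0;
}

Nothing in that loop resembles the morally relevant sense of pleasure; the reward is just a number nudging two estimates around, which is the point.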
(4) If we wanted to create an AGI whose nominal reward signal coincided with visceral happiness—how would we do that?
I guess you could have a little “visceral happiness” meter that gets filled up in the right conditions, but this would be a profound waste of AGI capability, and it probably doesn’t do what you actually wanted. What is it you actually want?
(5) If we wanted to ensure an uploaded mind could feel visceral pleasure of the same kind a non-uploaded mind can, how could we check that?
Ask them? The same way we think we know for non-uploaded minds.
(6) If we wanted to fill the universe with computronium and maximize hedons, what algorithm would we run on it?
If I wanted to turn the universe into paperclips and meaningless crap, how would I do it? Why is your question interesting? Is this simply an exercise in learning how to fill the universe with X? You could pick a less confusing X.
I feel like you might be importing a few mistaken assumptions into this whole line of questioning. I recommend that you lurk more and read some of the stuff I linked.
And if you think certain questions aren’t good, could you offer some you think are?
Good question:
How would a potentially powerful optimizing process have to be constructed to be provably capable of steering towards some coherent objective(s) over the long run and through self-modifications?
I think you’re right that the OP doesn’t quite hit the mark, but you got carried away and started almost wilfully misinterpreting. Especially your answers to 4, 5 and 6.
We seem to be talking past each other, to some degree. To clarify, my six questions were chosen to illustrate how much we don’t know about the mathematics and science behind psychological valence. I tried to have all of them point at this concept, each from a slightly different angle. Perhaps you interpreted them as ‘disguised queries’ because you thought my intent was other than to seek clarity about how to speak about this general topic of valence, particularly outside the narrow context of the human brain?
I am not trying to “Learn how to manipulate people? Learn how to torture? Become a pleasure delivery professional?”—my focus is entirely on speaking about psychological valence in clear terms, illustrating that there’s much we don’t know, and making the case that there are empirical questions about the topic that don’t seem to have empirical answers. Also, in very tentative terms, to express the personal belief that a clear theory on exactly what states of affairs are necessary and sufficient for creating pain and pleasure may have some applicability to FAI/AGI topics (e.g., under what conditions can simulated people feel pain?).
I did not find ‘necessary and sufficient’, or any permutation thereof, in A Human’s Guide to Words. Perhaps you’d care to explicate why you didn’t care for my usage?
Re: (3) and (4), I’m certain we’re not speaking of the same things. I recall Eliezer writing about how creating pleasure isn’t as simple as defining a ‘pleasure variable’ and incrementing it:
int pleasure = 5;
pleasure++;
I can do that on my MacBook Pro; it does not create pleasure.
There exist AGIs in design space that have the capacity to (viscerally) feel pleasure, much like humans do. There exist AGIs in design space with a well-defined reward channel. I’m asking: what principles can we use to construct an AGI which feels visceral pleasure when (and only when) its reward channel is activated? If you believe this is trivial, we are not communicating successfully.
I’m afraid we may not share common understandings (or vocabulary) on many important concepts, and I’m picking up a rather aggressive and patronizing vibe, but a genuine thanks for taking the time to type out your comment, and especially the intent in linking that which you linked. I will try not to violate too many community norms here.
I’m not nyan_sandwich, but here is what I believe to be his point about asking for necessary and sufficient conditions.
Part of your question (maybe not all) appears to be: how should we define “pleasure”?
Aside from precise technical definitions (“an abelian group is a set A together with a function from AxA to A, such that …”), the meaning of a word is hardly ever accurately given by any necessary-and-sufficient conditions that can be stated explicitly in a reasonable amount of space, because that just isn’t the way human minds work.
We learn the meaning of a word by observing how it’s used. We see, and hear, a word like “pleasure” or “pain” applied to various things, and not to others. What our brains do with this is approximately to consider something an instance of “pleasure” in so far as it resembles other things that are called “pleasure”. There’s no reason why any manageable set of necessary and sufficient conditions should be equivalent to that.
Further, different people are exposed to different sets of uses of the word, and evaluate resemblance in different ways. So your idea of “pleasure” may not be the same as mine, and there’s no reason why there need be any definite answer to the question of whose is better.
Typically, lots of different things will contribute to our considering something sufficiently like other instances of “pleasure” to deserve that name itself. In some particular contexts, some will be more important than others. So if you’re trying to pin down a precise definition for “pleasure”, the features you should concentrate on will depend on what that definition is going to be used for.
Does any of that help?
It does, and thank you for the reply.
How should we define “pleasure”? A difficult question. As you mention, it is a cloud of concepts, not a single one. It’s even more difficult because there appears to be precious little driving the standardization of the word: e.g., if I use the word ‘chair’ differently than others, it’s obvious, people will correct me, and our usages will converge. If I use the word ‘pleasure’ differently than others, that won’t be as obvious, because it’s a subjective experience, and there’ll be much less convergence toward a common usage.
But I’d say that in practice, these problems tend to work themselves out, at least enough for my purposes. E.g., if I say “think of pure, unadulterated agony” to a room of 10000 people, I think the vast majority would arrive at fairly similar thoughts. Likewise, if I asked 10000 people to think of “pure, unadulterated bliss… the happiest moment in your life”, I think most would arrive at thoughts which share certain attributes, and none (<.01%) would invert answers to these two questions.
I find this “we know it when we see it” definitional approach completely philosophically unsatisfying, but it seems to work well enough for my purposes, which is to find mathematical commonalities across brain-states people identify as ‘pleasurable’, and different mathematical commonalities across brain-states people identify as ‘painful’.
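As a toy illustration of what hunting for such “mathematical commonalities” could look like (the features and numbers below are entirely invented, and real brain-state measures would be far richer): compare how strongly each candidate feature shows up in states labelled “pleasurable” versus “painful”, and flag the features that are disproportionately present in one class.

/* Toy sketch with made-up data: per-feature class means for brain states
 * self-labelled "pleasurable" vs "painful", and the gap between them. */
#include <stdio.h>

#define FEATURES 3
#define N        4

int main(void) {
    /* hypothetical measurements: rows are observed states, columns are features */
    double pleasurable[N][FEATURES] = {
        {0.9, 0.2, 0.5}, {0.8, 0.3, 0.4}, {0.7, 0.1, 0.6}, {0.9, 0.2, 0.5}
    };
    double painful[N][FEATURES] = {
        {0.2, 0.8, 0.5}, {0.1, 0.9, 0.4}, {0.3, 0.7, 0.6}, {0.2, 0.8, 0.5}
    };

    for (int f = 0; f < FEATURES; f++) {
        double mean_pleasure = 0.0, mean_pain = 0.0;
        for (int i = 0; i < N; i++) {
            mean_pleasure += pleasurable[i][f] / N;
            mean_pain     += painful[i][f] / N;
        }
        /* a large gap means the feature is disproportionately present in one class */
        printf("feature %d: pleasurable %.2f, painful %.2f, gap %+.2f\n",
               f, mean_pleasure, mean_pain, mean_pleasure - mean_pain);
    }
    return 0;
}

The per-feature gap is only a stand-in for whatever the real characteristic mathematics turns out to be.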
I see what you mean by “the meaning of a word is hardly ever accurately given by any necessary-and-sufficient conditions that can be stated explicitly in a reasonable amount of space, because that just isn’t the way human minds work.” On the other hand, all words are imperfect and we need to talk about this somehow. How about this:
(1) what are the characteristic mathematics of (i.e., found disproportionately in) self-identified pleasurable brain states?
“what are the characteristic mathematics of (i.e., found disproportionately in) self-identified pleasurable brain states?”
Certain areas of the brain get more active and certain hormones get into the bloodstream. How does this help you out?
Even if it turns out that there is no rigorously definable one-dimensional measure of valence, we still need to search for physical correlates of pleasure and pain and find approximate measures to use when resolving moral dilemmas.
Regarding the response to (6), why don’t you want to maximise hedons? Having a rigorous definition of what you are trying to maximise needn’t mean that what you are trying to maximise is arbitrary to you, and the fact that pleasure is complex (or perhaps simple but not yet understood) does not imply that we don’t want it.