In the agonizing process of reading all the Yudkowsky Less Wrong articles, this is the first one I have had any disagreement with whatsoever.
This is coming from a person who was actually convinced by the biased and obsolete 1997 singularity essay by Yudkowsky.
Only, it’s not so much a disagreement as it is a value differential. I don’t care about the processes by which one achieves happiness. The end results are what matter, and I’ll be damned if I accept having one less hedon or one less utilon out there because of a perceived value in working toward them rather than automatically gaining them. It sounds to me like expecting victims of depression to work through it and experience the joy of overcoming depression, instead of, say, our hypothetical pill that just cures their depression. It is a sadness that nothing like that exists.
At the risk of (further) lowering my own status, I’ll also say that I really really really do wish the “do anything” Star Trek Holodecks were here. Now, it might matter to me that simulated oral sex is not from a real person who made that decision on her evolution-based human terms, but that is another matter of utilons.
Edited to add: perhaps worth noting is that I would have accepted the deal given by the Superhappies in Three Worlds Collide, though I might have tried to argue that the “having humans eat babies as well” thing is not necessary, even knowing I probably would not succeed.
Since you’re differentiating utilons from hedons, doesn’t that kind of follow the thrust of the article? That is, the point that the OP is arguing against is that utilons are ultimately the same thing as hedons; that all people really want is to be happy and that everything else is an instrumental value towards that end.
Your example of the perfect anti-depressant is, I think, somewhat misleading; the worry when it comes to wire-heading is that you’ll maximize hedons to the exclusion of all other types of utilon. Curing depression is awesome not only because it increases net hedons, but also because depression makes it hard to accomplish anything at all, even stuff that’s about whole other types of utilons.
The basic point of the article seems to be “Not all utilons are (reducible to) hedons”, which confuses me from the start. If happiness is not a generic term for “perception of a utilon-positive outcome”, what is it? I don’t think all utilons can be reduced to hedons, but that’s only because I see no difference between the two. I honestly don’t comprehend the difference between “State A makes me happier than state B” and “I value state A more than state B”. If hedons aren’t exactly equivalent to utilons, what are they?
An example might help: I was arguing with a classmate of mine recently. My claim was that every choice he made boiled down to the option which made him happiest. Looking back on it, I meant to say it was the option whose anticipation gave him the most happiness, since making choices based on the result of those choices breaks causality. Anyway, he argued that his choices were not based on happiness. He put forth the example that, while he didn’t enjoy his job, he still went because he needed to support his son. My response was that while his reaction to his job as an isolated experience was negative, his happiness from {job + son eating} was more than his happiness from {no job + son starving}.
I thought at the time that we were disagreeing about basic motivations, but this article and its responses have caused me to wonder if, perhaps, I don’t use the word ‘happiness’ in the standard sense.
Giving a hyperbolic thought exercise: If I could choose between all existing minds (except mine, to make the point about relative values) experiencing intense agony for a year and my own death, I think I’d be likely to choose my death. This is not because I expect to experience happiness after death, but because considering the state of the universe in the second scenario brings me more happiness than considering the state of the universe in the first. As far as I can tell, this is exactly what it means to place a higher value on the relative pleasure and continuing functionality of all-but-one mind than on my own continued existence.
To anyone who argues that utilons aren’t exactly equivalent to hedons (either that utilons aren’t hedons or that utilons are reducible to hedons), please explain to me what you (and my sudden realisation that you exist allows me to realise you seem amazingly common) think happiness is.
Consider the following two world states:
1) A person important to you dies.
2) They don’t die, but you are given a brain modification that makes it seem to you as though they had.
The hedonic scores for 1 and 2 are identical, but 2 has more utilons if you value your friend’s life.
The hedonic scores are identical and, as far as I can tell, the outcomes are identical. The only difference is if I know about the difference—if, for instance, I’m given a choice between the two. At that point, my consideration of 2 has more hedons than my consideration of 1. Is that different from saying 2 has more utilons than 1?
Is the distinction perhaps that hedons are about now while utilons are overall?
Talking about “utilons” and “hedons” implies that there exists some X such that, by my standards, the world is better with more X in it, whether I am aware of X or not.
Given that assumption, it follows that if you add X to the world in such a way that I don’t interact with it at all, it makes the world better by my standards, but it doesn’t make me happier. One way of expressing that is that X produces utilons but not hedons.
I would not have considered utilons to have meaning without my ability to compare them in my utility function.
You’re saying utilons can be generated without your knowledge, but hedons cannot? Does that mean utilons are a measure of reality’s conformance to your utility function, while hedons are your reaction to your perception of reality’s conformance to your utility function?
I’m saying that something can make the world better without affecting me, but nothing can make me happier without affecting me. That suggests to me that the set of things that can make the world better is different from the set of things that can make me happy, even if they overlap significantly.
That makes sense. I had only looked at the difference within “things that affect my choices”, which is not a full representation of things. Could I reasonably say, then, that hedons are the intersection of “utilons” and “things of which I’m aware”, or is there more to it?
Another way of phrasing what I think you’re saying: “Utilons are where the utility function intersects with the territory, hedons are where the utility function intersects with the map.”
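For what it's worth, here is a minimal sketch of that map/territory phrasing in Python, applied to the friend-dies and brain-modification world states from earlier in the thread. Everything in it is a made-up illustration: the `World` record, the toy `utility` function, and the 0/1 scores are assumptions for the sake of the example, not anyone's actual utility function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class World:
    """A toy world-state: one fact in the territory, one belief in the map."""
    friend_alive: bool            # what is actually the case
    i_believe_friend_alive: bool  # what I perceive to be the case


def utility(friend_alive: bool) -> int:
    """Hypothetical utility function: I value my friend being alive."""
    return 1 if friend_alive else 0


def utilons(world: World) -> int:
    # Utility function applied to the territory.
    return utility(world.friend_alive)


def hedons(world: World) -> int:
    # Utility function applied to the map (my perception of the world).
    return utility(world.i_believe_friend_alive)


# State 1: the friend really dies, and I know it.
friend_dies = World(friend_alive=False, i_believe_friend_alive=False)
# State 2: the friend lives, but a brain modification makes me believe otherwise.
brain_mod = World(friend_alive=True, i_believe_friend_alive=False)

for name, world in [("friend dies", friend_dies), ("brain mod", brain_mod)]:
    print(f"{name}: utilons={utilons(world)}, hedons={hedons(world)}")
```

Under these assumptions the two states get the same hedon score but different utilon scores, which is the distinction being drawn. Whether real happiness is anything like `utility` applied to the map is exactly what is still in question.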
I’m not sure how “hedons” interact with “utilons”.
I’m not saying anything at all about how they interact.
I’m merely saying that they aren’t the same thing.
Oh! I didn’t catch that at all. I apologize.
You’ve made an excellent case for them not being the same. I agree.
Cool. I thought it was confusing you earlier, but perhaps I misunderstood.
It was confusing me, yes. I considered hedons exactly equivalent to utilons.
Then you made your excellent case, and now it no longer confuses me. I revised my definition of happiness from “reality matching the utility function” to “my perception of reality matching the utility function”—which it should have been from the beginning, in retrospect.
I’d still like to know if people see happiness as something other than my new definition, but you have helped me from confusion to non-confusion, at least regarding the presence of a distinction, if not the exact nature thereof.
(nods) Cool.
As for your proposed definition of happiness… hm.
I have to admit, I’m never exactly sure what people are talking about when they talk about their utility functions. Certainly, if I have a utility function, I don’t know what it is. But I understand it to mean, roughly, that when comparing hypothetical states of the world Wa and Wb, I perform some computation F(W) on each state such that if F(Wa) > F(Wb), then I consider Wa more valuable than Wb.
Is that close enough to what you mean here?
And you are asserting, definitionally, that if that’s true I should also expect that, if I’m fully aware of all the details of Wa and Wb, I will be happier in Wa.
Another way of saying this is that if O(W) is the reality that I would perceive in a world W, then my happiness in Wa is F(O(Wa)). It simply cannot be the case, on this view, that I consider a proposed state-change in the world to be an improvement, without also being such that I would be made happier by becoming aware of that state-change actually occurring.
Am I understanding you correctly so far?
Further, if I sincerely assert about some state change that I believe it makes the world better, but it makes me less happy, it follows that I’m simply mistaken about my own internal state… either I don’t actually believe it makes the world better, or it doesn’t actually make me less happy, or both.
Did I get that right? Or are you making the stronger claim that I cannot in point of fact ever sincerely assert something like that?
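“I understand it to mean, roughly, that when comparing hypothetical states of the world Wa and Wb, I perform some computation F(W) on each state such that if F(Wa) > F(Wb), then I consider Wa more valuable than Wb.”
That’s precisely what I mean.
“Another way of saying this is that if O(W) is the reality that I would perceive in a world W, then my happiness in Wa is F(O(Wa)). It simply cannot be the case, on this view, that I consider a proposed state-change in the world to be an improvement, without also being such that I would be made happier by becoming aware of that state-change actually occurring.”
Yes.
“Further, if I sincerely assert about some state change that I believe it makes the world better, but it makes me less happy, it follows that I’m simply mistaken about my own internal state… either I don’t actually believe it makes the world better, or it doesn’t actually make me less happy, or both. Did I get that right? Or are you making the stronger claim that I cannot in point of fact ever sincerely assert something like that?”
Hm. I’m not sure what you mean by “sincerely”, if those are different. I would say if you claimed “X would make the universe better” and also “Being aware of X would make me less happy”, one of those statements must be wrong. I think it requires some inconsistency to claim F(Wa+X) > F(Wa) but F(O(Wa+X)) < F(O(Wa)) […] F2(O(Wa)), which is relatively common (Pascal’s Wager comes to mind).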
What I mean by “sincerely” is just that I’m not lying when I assert it.
And, yes, this presumes that X isn’t changing F.
I wasn’t trying to be sneaky; my intention was simply to confirm that you believe F(Wa+X) > F(Wa) implies F(O(Wa+X)) > F(O(Wa)), and that I hadn’t misunderstood something.
And, further, to confirm that you believe that if F(W) gives the utility of a world-state for some evaluator, then F(O(W)) gives the degree to which that world-state makes that evaluator happy. Or, said more concisely: that H(O(W)) == F(O(W)) for a given observer.
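Laid out in one place, a sketch of the notation as it is being used in this exchange (the layout is my own shorthand; the symbols F, O, and H are the ones already in play):

```latex
\begin{align*}
F(W) &: \text{the utility of world-state } W \text{ for some evaluator (utilons)} \\
O(W) &: \text{the reality that evaluator would perceive in } W \\
H(\cdot) &: \text{how happy the evaluator is with what they perceive (hedons)} \\[4pt]
\text{Identity in question:}\quad H(O(W)) &= F(O(W)) \quad \text{for a given observer}
\end{align*}
```

On this reading, the question is only whether the last line holds, not whether F(W) and F(O(W)) can come apart.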
Hm.
So, I agree broadly that F(Wa+X) > F(Wa) implies F(O(Wa+X)) > F(O(Wa)). (Although a caveat: it’s certainly possible to come up with combinations of F() and O() for which it isn’t true, so this is more of an evidentiary implication than a logical one. But I think that’s beside our purpose here.)
H(O(W)) = F(O(W)), though, seems entirely unjustified to me. I mean, it might be true, sure, just as it might be true that F(O(W)) is necessarily equal to various other things. But I see no reason to believe it; it feels to me like an assertion pulled out of thin air.
Of course, I can’t really have any counterevidence, the way the claim is structured.
I mean, I’ve certainly had the experience of changing my mind about whether X makes the world better, even though observing X continues to make me equally happy (that is, the experience of having F(Wa+X) - F(Wa) change while H(O(Wa+X)) - H(O(Wa)) stays the same), which suggests to me that F() and H() are different functions… but you would presumably just say that I’m mistaken about one or both of those things. Which is certainly possible; I am far from incorrigible about what makes me happy, and I don’t entirely understand what I believe makes the world better.
I think I have to leave it there. You are asserting an identity that seems unjustified to me, and I have no compelling reason to believe that it’s true, but also no definitive grounds for declaring it false.
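I believe you to be sincere when you say
“I’ve certainly had the experience of changing my mind about whether X makes the world better, even though observing X continues to make me equally happy (that is, the experience of having F(Wa+X) - F(Wa) change while H(O(Wa+X)) - H(O(Wa)) stays the same)”
but I can’t imagine experiencing that. If the utility of a function goes down, it seems my happiness from seeing that function must necessarily go down as well. This discrepancy causes me to believe there is a low-level difference between what you consider happiness and what I consider happiness, but I can’t explain mine any further than I already have.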
I don’t know how else to say it, but I don’t feel I’m actually making that assertion. I’m just saying: “By my understanding of hedony=H(x), awareness=O(x), and utility=F(x), I don’t see any possible situation where H(W) =/= F(O(W)). If they’re indistinguishable, wouldn’t it make sense to say they’re the same thing?”
I agree that if two things are indistinguishable in principle, it makes sense to use the same label for both.
It is not nearly as clear to me that “what makes me happy” and “what makes the world better” are indistinguishable sets as it seems to be to you, so I am not as comfortable using the same label for both sets as you seem to be.
You may be right that we don’t use “happiness” to refer to the same things. I’m not really sure how to explore that further; what I use “happiness” to refer to is an experiential state I don’t know how to convey more precisely without in effect simply listing synonyms. (And we’re getting perilously close to “what if what I call ‘red’ is what you call ‘green’?” territory, here.)
Without a much more precise way of describing patterns of neuron firing, I don’t think either of us can describe happiness more than we have so far. Having discussed the reactions in depth, though, I think we can reasonably conclude that, whatever they are, they’re not the same, which answers at least part of my initial question.
Thanks!
The subject in detail is too complicated to bother with in this comment thread because it is discussed in much greater detail elsewhere, so I’ll just bring up two things.
1) In the last month I’ve been thinking pretty darned carefully and am now really really unsure whether I’d accept the Superhappies’ deal and am frankly glad I’ll never have to make that choice.
2) Some of my own desires are bad, and if I were to take a pill that completely eliminated those desires, I would. The idea that what humanity wants right now is what it really wants is definitely not certain; it is about as uncertain as uncertainties get. So the real question is, why does our utility function act the way it does? There was no purpose for it, and if we can agree on a way to change it, we should change it, even if that means we go extinct.
Strongly agreed! But that’s why the gloss for CEV talks about stuff like what we would ideally want if we were smarter and knew more.
I don’t have any objection to you wireheading yourself. I do object to someone forcibly wireheading me.