It seems that in this post, by “selfish” you mean something like “not updateless” or “not caring about counterfactuals”. A meaning closer to the usual sense of the word would be “caring about the welfare of a particular individual” (including counterfactual instances of that individual, etc.), which seems perfectly amenable to being packaged as a reflectively consistent agent (that is not the individual in question) with a world-determined utility function.
(A reference to usage in Stuart’s paper maybe? I didn’t follow it.)
By “selfish” I mean how each human (apparently) cares about himself more than others, which needs an explanation because there can’t be a description of himself embedded in his brain at birth. “Not updateless” is meant to be a proposed explanation, not a definition of “selfish”.
No, that’s not the meaning I had in mind.
This post isn’t related to his paper, except that it made me think about selfishness and how it relates to AIXI and UDT.
Pointing at the self is possible, and that looks like a reasonable description of the self, one referring to all the details of a particular person. That is, the interpretation of an individual’s goal representation depends on the fact that the valued individual is collocated with the individual-as-agent.
Just as a file offset stored in the memory of my computer won’t refer to the same data if used on (moved to) your computer, which has different files: its usefulness depends on the fact that it’s kept on the same computer, and it will continue to refer to the same data if we move the whole computer around.
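To make the analogy concrete, here is a minimal Python sketch (the file names, file contents, and helper function are hypothetical, not taken from the comment): the same stored offset picks out different data depending on which machine interprets it, just as a self-pointer picks out whichever agent happens to be carrying it.

```python
# A toy version of the analogy: the stored reference only means something
# relative to the machine that holds it. (File names, contents, and the helper
# are invented for illustration.)

def read_at(filesystem: dict, path: str, offset: int, length: int) -> str:
    """Interpret (path, offset) relative to whatever filesystem we are handed."""
    return filesystem[path][offset:offset + length]

my_computer = {"/data/log.txt": "alpha beta gamma"}
your_computer = {"/data/log.txt": "one two three four"}

stored_reference = ("/data/log.txt", 6, 4)  # the "pointer" the agent carries

print(read_at(my_computer, *stored_reference))    # 'beta'  - on my computer
print(read_at(your_computer, *stored_reference))  # 'o th'  - same pointer, other machine

# Moving the whole computer around (copying it unchanged) preserves the reference:
moved_computer = dict(my_computer)
print(read_at(moved_computer, *stored_reference))  # still 'beta'
```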
Now I’m confused again, as I don’t see how these senses (the one I suggested and the one you explained in the parent comment) differ, other than on the point of caring vs. not caring about counterfactual versions of the same individual. You said, “each human (apparently) cares about himself more than others, which needs an explanation”, and that reads to me as asking how humans can have the individual-focused utility I suggested, which you then characterized as not the meaning you had in mind...
It’s hardly a mystery: you’re only plumbed into your own nervous system, so you only feel your own pleasures and pains. That creates a tendency to be concerned only about them as well. It’s more mysterious that you would be concerned about pains you can’t feel.
Yes, I gave this explanation as #1 in the list in the OP; however, as I tried to explain in the rest of the post, this explanation leads to other problems (that I don’t know how to solve).
There doesn’t seem to be a right answer to counterfactual mugging. Is it the only objection to #1?
Here’s a related objection that may be easier to see as a valid objection than counterfactual mugging: Suppose you’re about to be copied, and then one of the copies will be given a choice: “A) 1 unit of pleasure to me, or B) 2 units of pleasure to the other copy.” An egoist (with a perception-determined utility function) before being copied would prefer that their future self/copy choose B, but if that future self/copy is an egoist (with a perception-determined utility function) it would choose A instead. So before being copied, the egoist would want to self-modify to become some other kind of agent.
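Here is a rough numerical rendering of that reflective inconsistency; the unit values come from the thought experiment, but the assumption that the pre-copy egoist weights its two future copies’ experiences equally is mine, added only to make the comparison explicit.

```python
# Pleasure delivered as (to the deciding copy, to the other copy) under each option.
OUTCOMES = {"A": (1, 0),   # 1 unit to the copy making the choice
            "B": (0, 2)}   # 2 units to the other copy

def post_copy_value(option):
    """After copying, a perception-determined egoist counts only its own pleasure."""
    own, other = OUTCOMES[option]
    return own

def pre_copy_value(option):
    """Before copying, the egoist anticipates being either copy (here with equal
    weight), so pleasure going to both future selves counts."""
    own, other = OUTCOMES[option]
    return 0.5 * own + 0.5 * other

print(max(OUTCOMES, key=post_copy_value))  # 'A': the deciding copy picks A
print(max(OUTCOMES, key=pre_copy_value))   # 'B': the pre-copy agent wants B chosen
```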
I think there’s enough science on the subject—here’s the first paper I could find with a quick Google—to sketch out an approximate answer to the question of how self-care arises in an individual life. The infant first needs to form the concept of a person (what Bischof calls self-objectification), loosely speaking, a being with both a body and a mind. This concept can be applied to both self and others. Then, depending on its level of emotional contagion (the likelihood of feeling similarly to others when observing their emotions), it will learn, through sophisticated operant conditioning, self-concern and other-concern at different rates.
Since the typical human degree of emotional contagion is less than unity, we tend to be selfish to some degree. I’m using the word “selfish” just as you’ve indicated.
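A toy illustration of the mechanism described above (my own construction, not taken from the cited paper): if concern is reinforced in proportion to how strongly an emotion is actually felt, and others’ emotions are felt only at a contagion discount below one, then other-concern accumulates more slowly than self-concern.

```python
# Concern is reinforced each time an emotion is actually felt; others' emotions
# are felt only at the contagion discount, so other-concern accumulates slower.

def learned_concern(episodes, learning_rate=0.1, emotional_contagion=0.6):
    self_concern, other_concern = 0.0, 0.0
    for _ in range(episodes):
        self_concern += learning_rate * 1.0                   # own emotions at full strength
        other_concern += learning_rate * emotional_contagion  # others' at strength < 1
    return self_concern, other_concern

print(learned_concern(100))  # roughly (10.0, 6.0): selfish to the degree contagion < 1
```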
Why not, or what do you mean by this? Common sense suggests that we do know ourselves from others at a very low, instinctive level.
I expect Wei’s intuition is that knowing the self means having an axiomatic definition of (something sufficiently similar to) the self, so that it can be reasoned about for decision-theoretic purposes. But if we look at an axiomatic definition as merely some structure that stands in a known relation to the structure it defines, then your brain state in the past is just as good, and the latter can be observed in many ways, including through memory, accounts of your own behavior, etc., and theoretically to any level of detail.
(Knowing self “at a low, instinctive level” doesn’t in itself meet the requirement of having access to a detailed description, but is sufficient to point to one.)
Just as altruism can be related to trust, selfishness can be related to distrust.
An agent which has a high prior belief in the existence of deceptive adversaries would exhibit “selfish” behaviors.
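A quick toy model of that claim (entirely my own construction; the payoffs and probabilities are invented): an agent that genuinely values another’s reported need still withholds help once it assigns a high enough prior to the report being a deception, so its behavior looks selfish even though its values are not.

```python
# Whether to spend a resource on another agent's reported need, given a prior
# probability that the report comes from a deceptive adversary.

def expected_value_of_helping(p_deceptive, benefit_if_honest=3.0, cost_of_helping=1.0):
    # If the report is honest, helping produces welfare the agent genuinely values;
    # if it is a deception, the help is wasted and only the cost remains.
    return (1 - p_deceptive) * benefit_if_honest - cost_of_helping

for p in (0.1, 0.5, 0.9):
    choice = "help" if expected_value_of_helping(p) > 0 else "keep the resource"
    print(f"P(deceptive adversary) = {p:.1f} -> {choice}")
# With a high enough prior on deception the agent keeps the resource,
# which from the outside looks like selfishness.
```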
What is your meaning then? What would you call “caring about the welfare of a particular individual (that happens to be myself)”?
Ok, I do mean:
but I don’t mean:
(i.e., without the part in parenthesis) Does that clear it up?
Ah, there was a slight confusion on my part. So if I’m reading this correctly, you formally define “selfish” to mean… selfish. :-)
Do you mean that the agent itself must be the person it cares about? What if the agent is carried in a backpack (of the person in question), or works over the Internet?
What if the selfish agent that cares about itself writes an AI that cares about the agent, giving this AI more optimization power, since they share the same goal?
The usage in Stuart’s posts on here just meant a certain way of calculating expected utilities. Selfish agents only used their own future utility when calculating expected utility; unselfish agents mixed in other people’s utilities. To make this a bit more robust to redefinition of what’s in your utility function, we could say that a purely selfish agent’s expected utility doesn’t change if actions stay the same but other people’s utilities change.
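A minimal sketch of that robustness criterion (the agents, actions, and utility numbers are made up): hold the actions fixed, perturb other people’s utilities, and check whether the agent’s expected utility moves.

```python
# Hold the actions fixed, change other people's utilities, and see whether the
# agent's expected utility moves.

def selfish_eu(action, own_utility, others_utilities):
    return own_utility[action]                     # only the agent's own payoff enters

def mixed_eu(action, own_utility, others_utilities, weight=0.5):
    return own_utility[action] + weight * sum(u[action] for u in others_utilities)

own = {"cooperate": 2.0, "defect": 3.0}
others_before = [{"cooperate": 4.0, "defect": 0.0}]
others_after = [{"cooperate": 0.0, "defect": 4.0}]   # same actions, changed utilities

for eu in (selfish_eu, mixed_eu):
    invariant = all(eu(a, own, others_before) == eu(a, own, others_after) for a in own)
    print(eu.__name__, "unchanged when others' utilities change:", invariant)
# selfish_eu passes the purity test; mixed_eu does not.
```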
But this is all basically within option (2).
No one can mix another person’s actual utility function into their own. You can mix in your estimate of it. You can mix in your estimate of what you think it should be. But the actual utility function of another person is in that other person, and not in you.
Good point, if not totally right.
In general, you can have anything in your utility function you please. I could care about the number of ducks in the pond near where I grew up, even though I can’t see it. And when I say caring about the number of ducks in the pond, I don’t just mean my perception of it—I don’t want to maximize how many ducks I think are in the pond, or I would just drug myself. However, you’re right that when calculating an “expected utility,” that is, your best guess at the time, you don’t usually have perfect information about other people’s utility functions, just like I wouldn’t have perfect information about the number of ducks in the pond, and so would have to use an estimate.
The reason it worked without this distinction in Stuart’s articles on the Sleeping Beauty problem was that the “other people” were actually copies of Sleeping Beauty, so you knew that their utility functions were the same.
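A small sketch of the ducks-in-the-pond point above (the counts and probabilities are invented): the utility function is defined over the actual state of the pond, while the expected utility used for decisions comes from the agent’s current, possibly mistaken, estimate of it.

```python
# The utility function is over the actual pond; expected utility is computed
# from the agent's current (possibly mistaken) estimate of it.

def utility(world):
    return world["ducks_in_pond"]          # defined on the world, not on beliefs

def expected_utility(belief):
    return sum(p * utility(w) for w, p in belief)

actual_world = {"ducks_in_pond": 7}
honest_belief = [({"ducks_in_pond": 5}, 0.5), ({"ducks_in_pond": 9}, 0.5)]
deluded_belief = [({"ducks_in_pond": 100}, 1.0)]   # "drugging myself" changes beliefs only

print(utility(actual_world))              # 7: what is actually valued
print(expected_utility(honest_belief))    # 7.0: the estimate used for decisions
print(expected_utility(deluded_belief))   # 100.0: a happy belief, but the pond still has 7 ducks
```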
You can mix a pointer to it into your own. To see that this is different from mixing in your estimate, consider what you would do if you found out your estimate was mistaken.
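A last sketch to separate the pointer from the estimate (the Person class, Alice, and the outcome are hypothetical): a cached estimate keeps scoring outcomes by the old, mistaken model, while a pointer defers to whatever the other person’s utility function actually is.

```python
# A frozen estimate keeps scoring outcomes by the old model; a pointer defers
# to whatever the other person's utility function actually is.

class Person:
    def __init__(self, utility_fn):
        self.utility_fn = utility_fn       # the actual utility function lives here

alice = Person(lambda outcome: outcome["apples"])        # what Alice really values

def estimate_of_alice(outcome):
    return outcome["oranges"]              # my mistaken model of what Alice values

def pointer_to_alice(outcome):
    return alice.utility_fn(outcome)       # always defers to Alice's actual function

outcome = {"apples": 3, "oranges": 10}
print(estimate_of_alice(outcome))   # 10: still scored by the mistaken model
print(pointer_to_alice(outcome))    # 3: scored by Alice's actual values

# On finding out the estimate was mistaken, the estimate has to be rewritten;
# the pointer was already tracking the real thing.
```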