I must say that I’m not completely sure what the correct answer to this problem is. It is a question of the ultimate source of morality: should you do something because your past self would want you to? You judge for yourself; who is your past self to literally dictate your actions? If he didn’t precommit, you are free to do as you will. The past self’s decision is no more natural than your decision to give up the $100, since it may follow from similar considerations, rooted in an even earlier game: a perspective on your preference order drawn from the assumption of still more counterfactual paths that were never taken.
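For concreteness, here is a minimal sketch (in Python) of the ex-ante comparison behind “giving up the $100”. The $100 cost is the one discussed above; the fair coin and the counterfactual reward are illustrative assumptions about the Omega game, not details taken from this exchange.

```python
# Toy model of a counterfactual-mugging-style game (illustrative numbers only).
# Omega flips a fair coin. On one outcome it asks you for $100; on the other it
# pays a reward, but only if it predicts you would have paid when asked.
COST = 100          # the $100 discussed above
REWARD = 10_000     # assumed counterfactual reward, not specified in the discussion
P_REWARD_BRANCH = 0.5

def expected_value(pays_when_asked: bool) -> float:
    """Expected value of a fixed policy, evaluated before the coin is flipped."""
    reward_branch = P_REWARD_BRANCH * (REWARD if pays_when_asked else 0)
    paying_branch = (1 - P_REWARD_BRANCH) * (-COST if pays_when_asked else 0)
    return reward_branch + paying_branch

print(expected_value(pays_when_asked=True))   # 4950.0: the paying policy wins ex ante
print(expected_value(pays_when_asked=False))  # 0.0
```

The tension in the paragraph above is that this comparison is compelling only from the ex-ante standpoint; once the asking branch has actually been reached, the present self sees nothing but the $100 loss.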
At the same time, you are not just determining your action, you are determining what sort of person you are, how you make your decisions, and that goes deeper than the specific action-disagreement.
What if you are a consequence of the decisions and morality of some ancestor removed from you by 50,000 generations: should you start acting according to his preference order, which follows from his different psychology? I don’t think so. You may judge your current morality to be a mistake, one that you want to correct using your knowledge of your past self, but if you decide not to, who else is to judge?
Thanks for commenting on my article.
I made a few implicit assumptions when I was writing it that I was not then aware of; these assumptions go directly to what you speak of, so I’ll sketch them out.
The ‘me’ I speak of in the article is not actually me. It is an idealised ‘me’ that happens to also be perfectly rational. Suppose that my motive is to determine what such an idealised version will do, and then do it. To your question “should you do something because your past self would want you to do so?” the “me” of the article can only reply “if that is a value represented in my then-current utility function.” Now I, myself, have to actually introspect and determine if that’s a value I hold, but my idealised self just knows. If it expects to be faced with Omega situations, and it doesn’t represent such a value, then my article proves that ideal-Nathan will modify itself such that it does represent such a value at the time it decides whether or not to pay Omega. Therefore I should try at all costs to hold such a value when I decide as well, right?
That’s the difficulty. Do I really want to hold a super-general principle that covers all Newcomblike situations, and to keep that principle whatever may happen to me in future? Such a principle would mean that my future self would actually feel better, in the end, about killing 15 people because Omega “would have” given me the FAI recipe in other circumstances, than he would feel had he not killed them. Do I want to feel better when that happens? I don’t think I do. But if I’m correct about that, then I must have made an error in my previous reasoning. After all, by assumption, ideal-Nathan has the same values as me; it just thinks better.
Where I think I went wrong is in assuming that the modification (i.e. the action a_p in my article) has no cost. I think that what I was really trying to say in my article is that taking on a Newcomb-busting value system can itself carry a very high cost to one’s current self, and it is worth considering very carefully whether one is willing to pay that cost.
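To make that explicit, here is a toy sketch with assumed numbers: once the modification a_p is given a direct cost to one’s current self, rather than being treated as free, its expected value can easily come out negative.

```python
# Toy comparison for the point above: the self-modification a_p is no longer
# treated as free, but as carrying a direct cost under one's current values.
# All numbers are illustrative assumptions, not taken from the article.
P_NEWCOMBLIKE = 0.01        # chance of ever facing an Omega-style situation
GAIN_IF_MODIFIED = 1_000    # value gained in that situation by holding the modified values
COST_OF_MODIFYING = 500     # disutility the current self assigns to becoming that person

def value_of_performing_a_p() -> float:
    """Expected value of the modification once it carries its own cost term."""
    return P_NEWCOMBLIKE * GAIN_IF_MODIFIED - COST_OF_MODIFYING

print(value_of_performing_a_p())  # -490.0: with these numbers, a_p is not worth it
```

Nothing hinges on these particular numbers; the point is only that once a_p carries a cost term at all, performing it becomes an ordinary trade-off rather than a foregone conclusion.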
The answer is perhaps that even though you’d want to meta-precommit to preserving the will of your current self in the decisions of your future self, you (as a human) actually can’t do that. You can model what your decision should have been, had you been precommitted by your past self, but this ideal decision is not what you actually want. It is just an abstract belief planted in your mind by the process that constructed it, from the past to the future, but it is not your true preference anymore, any more than “survival of the fittest” is the utmost concern of our kind. You may in fact believe in this decision, but you are now wrong from your new perspective, and you should probably change your mind on reflection.
> and you should probably change your mind on reflection.
Ideal-Nathan would not want to do so. It may seem completely irrational, but if paperclippers can not-want not to paperclip, then ideal-Nathan can not-want not to kill 15 people for no particularly consequential reason. Your reply
> but this ideal decision is not what you actually want.
is true—it really is true—but it is true because I cannot with current technology radically alter myself to make it false. Ideal-Nathan can and does—unless it puts a strong disutility on such an action, which means that I myself put a strong disutility on such an action. Which I do.
> [...] I cannot with current technology radically alter myself [...] Ideal-Nathan can and does—unless it puts a strong disutility on such an action, which means that I myself put a strong disutility on such an action.
That’s a mistake: you are not him. You make your own decisions. If you value following the ideal-self-modifying-you, that’s fine, but I don’t believe that’s in human nature; it’s only a declarative construction that doesn’t actually relate to your values. You may want to become the ideal-you, but that doesn’t mean that you want to follow the counterfactual actions of the ideal-you if you haven’t actually become one.
The ideal, potentially-self-modifying me. No such being exists. I know, for a fact, that I am not perfectly rational in the sense in which I construe “rational”. That doesn’t mean that Omega couldn’t write a utility function that, if maximised, would perfectly describe my actions. Now in fact I am going to end up maximising that utility function: that’s just mathematics/physics. But I am structured so as to value “me”, even if “me” is just a concept I hold of myself. When I talk of ideal-Nathan, I mean a being that has the utility function I think I have, which is not the same as the utility function I actually have. I then work out what ideal-Nathan does. If I find that it does something I know for a fact I do not want to do, then I’m simply mistaken about ideal-Nathan—I’m mistaken about my own utility function. That means that by considering the behaviour of ideal-Nathan (not looking so ideal now, is he?) I can occasionally discover something about myself. In this case I’ve discovered:
- I don’t care about my past selves nearly as much as I thought I did.
- I place a stronger premium on not modifying myself in such a way as to find killing pleasurable than I do on human life itself.