Stuart_Armstrong comments on “Solving” selfishness for UDT

Stuart_Armstrong 28 Oct 2014 11:37 UTC
2 points

Also, why should you only value people who closely resemble you if you don’t exist?

There’s no “should”—this is a value set. This is the extension of the classical selfish utility idea. Suppose that future you joins some silly religion and does some stupid stuff and so on (insert some preferences of which you disapprove here). Most humans would still consider that person “them” and would (possibly grudgingly) do things to make them happy. But now imagine that you were duplicated, and the other duplicate went on and did things you approved of more. Many people would conclude that the second duplicate was their “true” self, and redirect all their efforts towards them.

This is very close to Nozick’s “closest continuer” approach http://www.iep.utm.edu/nozick/#H4 .

However, the bigger issue that you haven’t covered is this: if there are multiple entities in the same world to which you do (or potentially could) assign the label “me”, how do you assign utility to that world?

It seems the simplest extension of classical selfishness is that the utility function assigns preferences to the physical being that it happens to reside in. This allows it to assign preferences immediately, without first having to figure out their location. But see my answer to the next question (the real issue is that our normal intuitions break down in these situations, making any choice somewhat arbitrary).

Nor do I think that the “adding” approach is equivalent to your notion of “copy-altruism”, because under the “adding” approach you would stop caring about your copies once you figured out which one you were

UDT (or CDT with precommitments) forces selfish agents who don’t know who they are into behaving the same as copy-altruists. Copy altruism and adding/averaging come apart under naive CDT. (Note that for averaging versus adding, the difference can only be detected by comparing with other universes with different numbers of people.)

The halfer is only being strange because they seem to be using naive CDT. You could construct a similar paradox for a thirder if you assume the ticket pays out only for the other copy, not themselves.
- lackofcheese 28 Oct 2014 23:17 UTC
  1 point
  Parent
  
  There’s no “should”—this is a value set.
  
  The “should” comes in giving an argument for why a human rather than just a hypothetically constructed agent might actually reason in that way. The “closest continuer” approach makes at least some intuitive sense, though, so I guess that’s a fair justification.
  
  The halfer is only being strange because they seem to be using naive CDT. You could construct a similar paradox for a thirder if you assume the ticket pays out only for the other copy, not themselves.
  
  I think there’s more to it than that. Yes, UDT-like reasoning gives a general answer, but under UDT the halfer is still definitely acting strange in a way that the thirder would not be.
  
  If the ticket pays out for the other copy, then UDT-like reasoning would lead you to buy the ticket regardless of whether you know which one you are or not, simply on the basis of having a linked decision. Here’s Jack’s reasoning:
  
  “Now that I know I’m Jack, I’m still only going to pay at most $0.50, because that’s what I precommitted to do when I didn’t know who I was. However, I can’t help but think that I was somehow stupid when I made that precommitment, because now it really seems I ought to be willing to pay ²⁄₃. Under UDT sometimes this kind of thing makes sense, because sometimes I have to give up utility so that my counterfactual self can make greater gains, but it seems to me that that isn’t the case here. In a counterfactual scenario where I turned out to be Roger and not Jack, I would still desire the same linked decision (x=2/3). Why, then, am I stuck refusing tickets at 55 cents?”
  
  It appears to me that something has clearly gone wrong with the self-averaging approach here, and I think it is indicative of a deeper problem with SSA-like reasoning. I’m not saying you can’t reasonably come to the halfer conclusion for different reasons (e.g. the “closest continuer” argument), but some or many of the possible reasons can still be wrong. That being said, I think I tend to disagree with pretty much all of the reasons one could be a halfer, including average utilitarianism, the “closest continuer”, and selfish averaging.
  - Stuart_Armstrong 29 Oct 2014 8:10 UTC
    2 points
    Parent
    
    simply on the basis of having a linked decision.
    
    Linked decisions are also what make the halfer paradox go away.
    
    To get a paradox that hits at the “thirder” position specifically, in the same way as yours did, I think you need only replace the ticket with something mutually beneficial—like putting on an enjoyable movie that both can watch. Then the thirder would double count the benefit of this, before finding out who they were.
    - lackofcheese 29 Oct 2014 14:04 UTC
      1 point
      Parent
      
      Linked decisions are also what make the halfer paradox go away.
      
      I don’t think linked decisions make the halfer paradox I brought up go away. Any counterintuitive decisions you make under UDT are simply ones that lead to you making a gain in counterfactual possible worlds at the cost of a loss in actual possible worlds. However, in the instance above you’re losing both in the real scenario in which you’re Jack, and in the counterfactual one in which you turned out to be Roger.
      
      Granted, the “halfer” paradox I raised is an argument against having a specific kind of indexical utility function (selfish utility w/ averaging over subjectively indistinguishable agents) rather than an argument against being a halfer in general. SSA, for example, would tell you to stick to your guns because you would still assign probability ¹⁄₂ even after you know whether you’re “Jack” or “Roger”, and thus doesn’t suffer from the same paradox. That said, due to the reference class problem, if you are told whether you’re Jack or Roger before being told everything else, SSA would give the wrong answer, so it’s not like it’s any better...
      
      To get a paradox that hits at the “thirder” position specifically, in the same way as yours did, I think you need only replace the ticket with something mutually beneficial—like putting on an enjoyable movie that both can watch. Then the thirder would double count the benefit of this, before finding out who they were.
      
      Are you sure? It doesn’t seem to me that this would be paradoxical; since the decisions are linked you could argue that “If I hadn’t put on an enjoyable movie for Jack/Roger, Jack/Roger wouldn’t have put on an enjoyable movie for me, and thus I would be worse off”. If, on the other hand, only one agent gets to make that decision, then the agent-parts would have ceased to be subjectively indistinguishable as soon as one of them was offered the decision.
      - Stuart_Armstrong 29 Oct 2014 17:10 UTC
        2 points
        Parent
        Did I make a mistake? It’s possible—I’m exhausted currently. Let’s go through this carefully. Can you spell out exactly why you think that halfers are such that:
        
        1) They are only willing to pay ¹⁄₂ for a ticket.
        2) They know that they must either be Jack or Roger.
        3) They know that upon finding out which one they are, regardless of whether it’s Jack or Roger, they would be willing to pay ²⁄₃.
        
        I can see 1) and 2), but, thinking about it, I fail to see 3).
        lackofcheese 29 Oct 2014 19:46 UTC
        1 point
        Parent
        As I mentioned earlier, it’s not an argument against halfers in general; it’s against halfers with a specific kind of utility function, which sounds like this: “In any possible world I value only my own current and future subjective happiness, averaged over all of the subjectively indistinguishable people who could equally be “me” right now.”
        
        In the above scenario, there is a ¹⁄₂ chance that both Jack and Roger will be created, a ¹⁄₄ chance of only Jack, and a ¹⁄₄ chance of only Roger.
        
        Before finding out who you are, averaging would lead to a 1:1 odds ratio, and so (as you’ve agreed) this would lead to a cutoff of ¹⁄₂.
        
        After finding out whether you are, in fact, Jack or Roger, you have only one possible self in the TAILS world, and one possible self in the relevant HEADS+Jack/HEADS+Roger world, which leads to a 2:1 odds ratio and a cutoff of ²⁄₃.
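To make the two cutoffs concrete, here is a minimal sketch of the arithmetic. It assumes the ticket pays 1 in the TAILS world and nothing otherwise, and it encodes the utility function as "average the payout over the people who could be me in each world, and drop worlds where none of them exist". The names `WORLDS`, `PAYOUT` and `cutoff_price` are made up for illustration.

```python
from fractions import Fraction

# Hypothetical encoding of the scenario: world -> (probability, people created).
WORLDS = {
    "TAILS": (Fraction(1, 2), {"Jack", "Roger"}),
    "HEADS+Jack": (Fraction(1, 4), {"Jack"}),
    "HEADS+Roger": (Fraction(1, 4), {"Roger"}),
}

# Assumption: the ticket pays 1 in the TAILS world and 0 otherwise.
PAYOUT = {
    "TAILS": Fraction(1),
    "HEADS+Jack": Fraction(0),
    "HEADS+Roger": Fraction(0),
}


def cutoff_price(possible_selves):
    """Break-even ticket price for an agent who could be any of `possible_selves`.

    Buying at price x has value sum(prob * (avg_payout - x)) over the worlds
    that contain at least one possible self, so the break-even price is
    expected_payout / total_weight.
    """
    expected_payout = Fraction(0)
    total_weight = Fraction(0)
    for world, (prob, people) in WORLDS.items():
        selves_here = people & possible_selves
        if not selves_here:
            continue  # a world where "I" don't exist is removed from consideration
        # Averaging over the possible selves: the world counts once,
        # however many of them it contains.
        avg_payout = sum(PAYOUT[world] for _ in selves_here) / len(selves_here)
        expected_payout += prob * avg_payout
        total_weight += prob
    return expected_payout / total_weight


print(cutoff_price({"Jack", "Roger"}))  # before finding out who you are: 1/2
print(cutoff_price({"Jack"}))           # after finding out you are Jack: 2/3
```

The cutoff moves from ¹⁄₂ to ²⁄₃ purely because the HEADS+Roger world drops out once you know you are Jack, and by symmetry the same happens for Roger.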
        
        Ultimately, I guess the essence here is that this kind of utility function is equivalent to a failure to properly conditionalise, and thus even though you’re not using probabilities you’re still “Dutch-bookable” with respect to your own utility function.
        
        I guess it could be argued that this result is somewhat trivial, but the utility function mentioned above is at least intuitively reasonable, so I don’t think it’s meaningless to show that having that kind of utility function is going to put you in trouble.
        Stuart_Armstrong 30 Oct 2014 10:01 UTC
        1 point
        Parent
        
        “In any possible world I value only my own current and future subjective happiness, averaged over all of the subjectively indistinguishable people who could equally be “me” right now.”
        
        Oh. I see. The problem is that that utility takes a “halfer” position on combining utility (averaging) and “thirder” position on counterfactual worlds where the agent doesn’t exist (removing them from consideration). I’m not even sure it’s a valid utility function—it seems to mix utility and probability.
        
        For example, in the heads world, it values “50% Roger vs 50% Jack” at the full utility amount, yet values only one of “Roger” and “Jack” at full utility. The correct way of doing this would be to value “50% Roger vs 50% Jack” at 50%, and then you just have a rescaled version of the thirder utility.
        
        I think I see the idea you’re getting at, but I suspect that the real lesson of your example is that that mixed halfer/thirder idea cannot be made coherent in terms of utilities over worlds.
        lackofcheese 31 Oct 2014 14:25 UTC
        1 point
        Parent
        I don’t think that’s entirely correct; SSA, for example, is a halfer position and it does exclude worlds where you don’t exist, as do many other anthropic approaches.
        
        Personally I’m generally skeptical of averaging over agents in any utility function.
        Stuart_Armstrong 4 Nov 2014 10:23 UTC
        1 point
        Parent
        
        SSA, for example, is
        
        Which is why I don’t use anthropic probability, because it leads to these kinds of absurdities. The halfer position is defined in the top post (as is the thirder), and your setup uses aspects of both approaches. If it’s incoherent, then SSA is incoherent, which I have no problem with. SSA != halfer.
        Stuart_Armstrong 3 Nov 2014 17:06 UTC
        1 point
        Parent
        Averaging makes a lot of sense if the number of agents is going to be increased and decreased in non-relevant ways.
        
        E.g.: you are an upload. Soon, you are going to experience eating a chocolate bar, then stubbing your toe, then playing a tough but intriguing game. During this time, you will be simulated on n computers, all running exactly the same program of you experiencing this, without any deviations. But n may vary from moment to moment. Should you be willing to pay to make n higher during pleasant experiences or lower during unpleasant ones, given that you will never detect this change?
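A rough sketch of how adding and averaging differ in this example. The happiness values in `EXPERIENCES` are invented purely for illustration, and `adding_utility` / `averaging_utility` are hypothetical names, not anything defined in the post.

```python
from fractions import Fraction

# Hypothetical happiness values for the three experiences, in order.
EXPERIENCES = [
    ("chocolate bar", Fraction(2)),
    ("stubbed toe", Fraction(-3)),
    ("intriguing game", Fraction(1)),
]


def adding_utility(copies):
    """Sum the happiness of every running copy at every moment."""
    return sum(n * u for n, (_, u) in zip(copies, EXPERIENCES))


def averaging_utility(copies):
    """Average over the identical copies at each moment, then sum over moments.

    Assumes at least one copy (n >= 1) is running at every moment.
    """
    return sum((n * u) / n for n, (_, u) in zip(copies, EXPERIENCES))


baseline = [1, 1, 1]          # one copy throughout
more_chocolate = [10, 1, 1]   # ten copies during the chocolate bar only

print(adding_utility(baseline), adding_utility(more_chocolate))        # 0 18
print(averaging_utility(baseline), averaging_utility(more_chocolate))  # 0 0
```

Under adding, raising n during the pleasant moment is worth paying for; under averaging, n never matters as long as the copies stay identical.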
        lackofcheese 4 Nov 2014 0:06 UTC
        1 point
        Parent
        I think there are some rather significant assumptions underlying the idea that they are “non-relevant”. At the very least, if the agents were distinguishable, I think you should indeed be willing to pay to make n higher. On the other hand, if they’re indistinguishable then it’s a more difficult question, but the anthropic averaging I suggested in my previous comments leads to absurd results.
        
        What’s your proposal here?
        Stuart_Armstrong 4 Nov 2014 10:21 UTC
        1 point
        Parent
        
        the anthropic averaging I suggested in my previous comments leads to absurd results.
        
        The anthropic averaging leads to absurd results only because it wasn’t a utility function over states of the world. Under heads, it ranked 50%Roger+50%Jack differently from the average utility of those two worlds.