I think it’s worth sorting the issue out (if you agree), so let’s go slowly. Both SSA and SIA depend on priors, so you can’t argue for them based on maximal entropy grounds. If the coin is biased, they will have different probabilities (so SSA+biased coin can have the same probabilities as SIA+unbiased coin and vice versa). That’s probably obvious to you, but I’m mentioning it in case there’s a disagreement.
Your model works, with a few tweaks. SSA starts with a probability distribution over worlds, throws away the ones where “you” don’t exist (why? shush, don’t ask questions!), and then locates the agent within those worlds by subdividing a somewhat arbitrary reference class. SIA starts with the same, uses the original probabilities to weight every possible copy of the agent, treats these as separate events, and then renormalises (which is sometimes impossible, see http://lesswrong.com/lw/fg7/sia_fears_expected_infinity/).
I have to disagree with your conversation, however. Both SIA and SSA consider all statements of type “I exist in universe X and am the person in location Y” to be mutually exclusive and exhaustive. It’s just that SIA stratifies by location only (and then deduces the probability of a universe by combining different locations in the same universe), while SSA first stratifies by universe and then by location.
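To make the two stratifications concrete, here is a minimal sketch in Python; the toy incubator setup (heads: one observer in Room 1; tails: observers in Rooms 1 and 2) is added purely for illustration and is not part of the original comment.

```python
# Toy incubator variant (illustrative assumption, not from the comment):
# heads -> one observer in Room 1; tails -> observers in Rooms 1 and 2.
# The "atoms" are the mutually exclusive statements
# "I exist in world W and am the person in Room R".

prior = {"heads": 0.5, "tails": 0.5}
rooms = {"heads": ["Room 1"], "tails": ["Room 1", "Room 2"]}

# SIA: stratify by location only -- weight each (world, room) atom by the
# world's prior probability, then renormalise over all atoms.
sia = {(w, r): prior[w] for w in prior for r in rooms[w]}
z = sum(sia.values())
sia = {k: v / z for k, v in sia.items()}        # each atom gets 1/3

# SSA: stratify by world first, then split that world's probability uniformly
# over the locations within it (the reference class).
ssa = {(w, r): prior[w] / len(rooms[w]) for w in prior for r in rooms[w]}
# heads-Room 1: 1/2; tails-Room 1: 1/4; tails-Room 2: 1/4

p_tails_sia = sum(v for (w, _), v in sia.items() if w == "tails")   # 2/3
p_tails_ssa = sum(v for (w, _), v in ssa.items() if w == "tails")   # 1/2
print(p_tails_sia, p_tails_ssa)
```

With these numbers SIA gives P(tails) = 2/3 and SSA gives P(tails) = 1/2, the usual thirder/halfer split.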
But I still think this leads us astray. My point is different. Normally, given someone’s utility, it’s possible to disentangle whether someone is using a particular decision theory or a particular probability approach by observing their decisions. However, in anthropic (and Psy-Kosh-like) situations, this becomes impossible. In the notation that I used in the paper I referred to, SIA+”divided responsibility” will always give the same decision as SSA+”total responsibility” (to a somewhat more arguable extent, for any fixed responsibility criteria, EDT+SSA gives the same decisions as CDT+SIA).
Since the decision is the same, this means that all the powerful arguments for using probability (which boil down to “if you don’t act as if you have consistent probabilities, you’ll lose utility pointlessly”) don’t apply in distinguishing between SIA and SSA. Thus we are not forced to have a theory of anthropic probability—it’s a matter of taste whether to do so or not. Nothing hinges on whether the probability of heads is “really” 1⁄3 or 1⁄2. The full decision theory is what counts, not just the anthropic probability component.
Both SSA and SIA depend on priors, so you can’t argue for them based on maximal entropy grounds. If the coin is biased, they will have different probabilities (so SSA+biased coin can have the same probabilities as SIA+unbiased coin and vice versa).
I definitely agree that SSA + belief in a biased coin can have the same probabilities as SIA + belief in an unbiased coin. (I’m just calling them beliefs to reinforce that the thing that affects the probability directly is the belief, not the coin itself). But I think you’re making an implied argument here—check if I’m right.
The implied argument would go like “because the biasedness of the coin is a prior, you can’t say what the probabilities will be just from the information, because you can always change the prior.”
The short answer is that the probabilities I calculated are simply for agents who “assume SSA” and “assume SIA” and have no other information.
The long answer is to explain how this interacts with priors. By the way, have you re-read the first three chapters of Jaynes recently? I have done so several times, and found it helpful.
Prior probabilities still reflect a state of information. Specifically, they reflect one’s aptly named prior information. Then you learn something new, and you update, and now your probabilities are posterior probabilities and reflect your posterior information. Agents with different priors have different states of prior information.
Perhaps there was an implied argument that there’s some problem with the fact that two states with different information (SSA+unbiased and SIA+biased) are giving the same probabilities for events relevant to the problem? Well, there’s no problem. If we conserve information there must be differences somewhere, but they don’t have to be in the probabilities used in decision-making.
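As a concrete check of the biased-coin point, here is a small sketch in the classic Sleeping Beauty setting; the specific bias of 1/3 is an assumption chosen to make the two assignments line up, not a figure from the discussion.

```python
# Classic Sleeping Beauty: three awakening atoms, heads-Monday, tails-Monday,
# tails-Tuesday. Check that SSA + a coin believed biased to P(heads) = 1/3
# reproduces SIA + a fair coin (assumed bias for illustration).

def sia(p_heads):
    # Each awakening weighted by its world's prior, then renormalised.
    atoms = {"heads-Mon": p_heads, "tails-Mon": 1 - p_heads, "tails-Tue": 1 - p_heads}
    z = sum(atoms.values())
    return {k: v / z for k, v in atoms.items()}

def ssa(p_heads):
    # World prior first, then split uniformly over that world's awakenings.
    return {"heads-Mon": p_heads,
            "tails-Mon": (1 - p_heads) / 2,
            "tails-Tue": (1 - p_heads) / 2}

print(sia(0.5))    # fair coin under SIA: 1/3 each
print(ssa(1/3))    # coin believed biased to P(heads)=1/3 under SSA: also 1/3 each
```

Both calls print 1/3 for each awakening, so the two agents agree on every probability relevant to the problem while starting from different priors.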
a few tweaks.
Predictably, I’d prefer descriptions in terms of probability theory to mechanistic descriptions of how to get the results.
I have to disagree with your conversation, however. Both SIA and SSA consider all statements of type “I exist in universe X and am the person in location Y” to be mutually exclusive and exhaustive. It’s just that SIA stratifies by location only (and then deduces the probability of a universe by combining different locations in the same universe), while SSA first stratifies by universe and then by location.
Whoops. Good point, I got SSA quite wrong. Hm. That’s troubling. I think I made this mistake way back in the ambitious yet confused post I mentioned, and have been lugging it around ever since.
Consider an analogous game where a coin is flipped. If heads I get a white marble. If tails, somehow (so that this ‘somehow’ has a label, let’s call it ‘luck’) I get either a white marble or a black marble. This is SSA with different labels. How does one get the probabilities from a specification like the one I gave for SIA in the sleeping beauty problem?
I think it’s a causal condition, possibly because of something equivalent to “the coin flip does not affect what day it is.” And I’m bad at doing this translation.
But I need to think a lot, so I’ll get back to you later.
Since the decision is the same, this means that all the powerful arguments for using probability
And I’m still not seeing what either assumption gives you, if your decision is already determined (by UDT, for instance) in a way that makes the assumption irrelevant.
Just not a fan of Cox’s theorem, eh?
Very much a fan. Anything that’s probability-like needs to be an actual probability. I’m disputing whether anthropic probabilities are meaningful at all.
And I’m still not seeing what either assumption gives you, if your decision is already determined
I’ll delay talking about the point of all of this until later.
whether anthropic probabilities are meaningful at all.
Probabilities are a function that represents what we know about events (where “events” is a technical term meaning things we don’t control, in the context of Cox’s theorem—for different formulations of probability this can take on somewhat different meanings). This is “what they mean.”
As I said to lackofcheese:
Probabilities have a foundation independent of decision theory, as encoding beliefs about events. They’re what you really do expect to see when you look outside.
This is an important note about the absent-minded driver problem et al, that can get lost if one gets comfortable in the effectiveness of UDT. The agent’s probabilities are still accurate, and still correspond to the frequency with which they see things (truly!) - but they’re no longer related to decision-making in quite the same way.
“The use” is then to predict, as accurately as ever, what you’ll see when you look outside yourself.
If you accept that the events you’re trying to predict are meaningful (e.g. “whether it’s Monday or Tuesday when you look outside”), and you know Cox’s theorem, then P(Monday) is meaningful, because it encodes your information about a meaningful event.
In the Sleeping Beauty problem, the answer still happens to be straightforward in terms of logical probabilities, but step one is definitely agreeing that this is not a meaningless statement.
(side note: If all your information is meaningless, that’s no problem—then it’s just like not knowing anything and it gets P=0.5)
Probabilities are a function that represents what we know about events
As I said to lackofcheese:
If we create 10 identical copies of me and expose 9 of them to one stimulus and 1 to another, what is my subjective anticipation of seeing one stimulus over the other? 10% is one obvious answer, but I might take a view of personal identity that fails to distinguish between identical copies of me, in which case 50% is correct. What if identical copies will be recombined later? Eliezer had a thought experiment where agents were two-dimensional, and could get glued or separated from each other, and wondered whether this made any difference. I do too. And I’m also very confused about quantum measure, for similar reasons.
In general, the question “how many copies are there” may not be answerable in certain weird situations (or can be answered only arbitrarily).
EDIT: with copying and merging and similar, you get odd scenarios like “the probability of seeing something is x, the probability of remembering seeing it is y, the probability of remembering remembering it is z, and x, y, and z are all different.” Objectively it’s clear what’s going on, but in terms of “subjective anticipation”, it’s not clear at all.
Or put more simply: there are two identical copies of you. They will be merged soon. Do you currently have a 50% chance of dying soon?
In general, the question “how many copies are there” may not be answerable in certain weird situations (or can be answered only arbitrarily).
I agree with this. In probability terms, this is saying that P(there are 9 copies of me) is not necessarily meaningful because the event is not necessarily well defined.
My first response is / was that the event “the internet says it’s Monday” seems a lot better-defined than “there are 9 of me,” and should therefore still have a meaningful probability, even in anthropic situations. But an example may be necessary here.
I think you’d agree that a good example of “certain weird situations” is the divisible brain. Suppose we ran a mind on transistors and wires of macroscopic size. That is, we could make them half as big and they’d still run the same program. Then one can imagine splitting this mind down the middle into two half-sized copies. If this single amount of material counts as two people when split, does it also count as two people when it’s together?
Whether it does or doesn’t is, to some extent, mere semantics. If we set up a Sleeping Beauty problem except that there’s the same amount of total width on both sides, it then becomes semantics whether there is equal anthropic probability on both sides, or unequal. So the “anthropic probabilities are meaningless” argument is looking pretty good. And if it’s okay to define amount of personhood based on thickness, why not define it however you like and make probability pointless?
But I don’t think it’s quite as bad as all that, because of the restriction that your definition of personhood is part of how you view the world, not a free parameter. You don’t try to change your mind about the gravitational constant so that you can jump higher. So agents can have this highly arbitrary factor in what they expect to see, but still behave somewhat reasonably. (Of course, any time an agent has some arbitrary-seeming information, I’d like to ask “how do you know what you think you know?” Exploring the possibilities better in this case would be a bit of a rabbit hole, though.)
Then, if I’m pretending to be Stuart Armstrong, I note that there’s an equivalence in the aforementioned equal-total-width sleeping beauty problem between e.g. agents who think that anthropic probability is proportional to total width but have the same payoffs in both worlds (“width-selfish agents”), and agents who ignore anthropic probability, but weight the payoffs to agents by their total widths, per total width (“width-average-utilitarian outside perspective [UDT] predictors”).
Sure, these two different agents have different information/probabilities and different internal experience, but to the extent that we only care about the actions in this game, they’re the same.
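A minimal sketch of that equivalence, under assumptions added here for concreteness (a fair coin; a single width-2 copy under heads; two width-1 copies, deciding identically, under tails; and a bet paying u_heads or u_tails to each existing copy that accepts it):

```python
# Equal-total-width Sleeping Beauty sketch (all numbers are illustrative
# assumptions): total width is 2 on both sides of the coin.

def eu_width_selfish(u_heads, u_tails):
    # Anthropic probability proportional to total width; widths are equal, so
    # it comes out 1/2-1/2, and the agent cares only about its own payoff.
    p_heads = (0.5 * 2) / (0.5 * 2 + 0.5 * 2)
    return p_heads * u_heads + (1 - p_heads) * u_tails

def eu_width_average_outside(u_heads, u_tails):
    # No anthropic update: keep the coin's 1/2-1/2, and score each world by
    # the width-weighted average of the payoffs to the agents in it.
    avg_heads = (2 * u_heads) / 2
    avg_tails = (1 * u_tails + 1 * u_tails) / (1 + 1)
    return 0.5 * avg_heads + 0.5 * avg_tails

for u in [(-1.0, 3.0), (2.0, -5.0)]:
    print(eu_width_selfish(*u), eu_width_average_outside(*u))   # identical pairs
```

The two expected utilities agree for any bet, so the two agents act identically in this game, as claimed.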
Even if an agent starts in multiple identical copies that then diverge into non-identical versions, a selfish agent will want to self-modify to be an average utilitarian between non-identical versions. But this is a bit different from the typical usage of “average utilitarianism” in population ethics. A population-ethics average utilitarian would feed one of their copies to hungry alligators if it paid off for the other copies. But a reflectively-selfish average utilitarian would expect some chance of being the one fed to the alligators, and wouldn’t like that plan at all.
Actually, I think the cause of this departure from average utilitarianism over copies is the starting state. When you start already defined as one of multiple copies, like in the divisible brain case, the UDT agent that naive selfish agents want to self-modify to be no longer looks just like an average utilitarian.
So that’s one caveat about this equivalence—that it might not apply to all problems, and to get these other problems right, the proper thing to do is to go back and derive the best strategy in terms of selfish preferences.
Which is sort of the general closing thought I have: your arguments make a lot more sense to me than they did before, but as long as you have some preferences that are indexically selfish, there will be cases where you need to do anthropic reasoning just to go from the selfish preferences to the “outside perspective” payoffs that generate the same behavior. And it doesn’t particularly matter if you have some contrived state of information that tells you you’re one person on Mondays and ten people on Tuesdays.
Man, I haven’t had a journey like this since DWFTTW. I was so sure that thing couldn’t be going downwind faster than the wind.
P.S. So I have this written down somewhere, the causal buzzword important for an abstract description of the game with the marbles is “factorizable probability distribution.” I may check out a causality textbook and try and figure the application of this out with less handwaving, then write a post on it.
Hi, “factorization” is just taking a thing and expressing it as a product of simpler things. For example, a composite integer is a product of powers of primes.
In probability theory, we get a simple factorization via the chain rule of probability. If we have independence, some things drop out, but factorization is basically intellectually content-free. Of course, I also think Bayes rule is an intellectually content-free consequence of the chain rule of probability. And of course this may be hindsight bias operating...
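A tiny numerical illustration of that point, with an assumed joint distribution (not anything from the discussion) in which c is independent of a given b:

```python
# The chain rule P(a,b,c) = P(a) P(b|a) P(c|a,b) holds for any joint; an
# independence assumption (here, c independent of a given b) just lets the
# last factor drop to P(c|b). All numbers below are assumed for illustration.
from itertools import product

p_a = {0: 0.4, 1: 0.6}
p_b_given_a = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}
p_c_given_b = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.1, 1: 0.9}}   # depends on b only

joint = {(a, b, c): p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]
         for a, b, c in product((0, 1), repeat=3)}

def prob(cond):
    """Sum the joint over all (a, b, c) satisfying cond."""
    return sum(v for k, v in joint.items() if cond(*k))

# P(c=1 | a, b=1) is the same whether a=0 or a=1, i.e. it equals P(c=1 | b=1).
for a_val in (0, 1):
    print(prob(lambda a, b, c: a == a_val and b == 1 and c == 1)
          / prob(lambda a, b, c: a == a_val and b == 1))        # 0.9 both times
```

The chain rule itself carries no assumptions; independence only determines which conditioning variables can be dropped from each factor.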
You are welcome to message or email me if you want to talk about it more.
That would be interesting.
You definitely don’t have a 50% chance of dying in the sense of “experiencing dying”. In the sense of “ceasing to exist” I guess you could argue for it, but I think that it’s much more reasonable to say that both past selves continue to exist as a single future self.
Regardless, this stuff may be confusing, but it’s entirely conceivable that with the correct theory of personal identity we would have a single correct answer to each of these questions.
Conceivable. But it doesn’t seem to me that such a theory is necessary, as its role seems to be merely to state probabilities that don’t influence actions.
I think that argument is highly suspect, primarily because I see no reason why a notion of “responsibility” should have any bearing on your decision theory. Decision theory is about achieving your goals, not avoiding blame for failing.
However, even if we assume that we do include some notion of responsibility, I think that your argument is still incorrect. Consider this version of the incubator Sleeping Beauty problem, where two coins are flipped.
HH ⇒ Sleeping Beauties created in Rooms 1, 2, and 3
HT ⇒ Sleeping Beauty created in Room 1
TH ⇒ Sleeping Beauty created in Room 2
TT ⇒ Sleeping Beauty created in Room 3
Moreover, in each room there is a sign. In Room 1 it is equally likely to say either “This is not Room 2” or “This is not Room 3”, and so on for each of the three rooms.
Now, each Sleeping Beauty is offered a choice between two coupons; each coupon gives the specified amount to their preferred charity (by assumption, utility is proportional to $ given to charity), but only if each of them chose the same coupon. The payoff looks like this:
A ⇒ $12 if HH, $0 otherwise.
B ⇒ $6 if HH, $2.40 otherwise.
I’m sure you see where this is going, but I’ll do the math anyway.
With SIA+divided responsibility, we have
p(HH) = p(not HH) = 1⁄2
The responsibility is divided among 3 people in HH-world, and among 1 person otherwise, therefore
EU(A) = (1/2)(1/3)$12 = $2.00
EU(B) = (1/2)(1/3)$6 + (1/2)$2.40 = $2.20
With SSA+total responsibility, we have
p(HH) = 1⁄3
p(not HH) = 2⁄3
EU(A) = (1/3)$12 = $4.00
EU(B) = (1/3)$6 + (2/3)$2.40 = $3.60
So SIA+divided responsibility suggests choosing B, but SSA+total responsibility suggests choosing A.
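A short sketch reproducing the arithmetic above exactly as stated in this comment (in particular the SSA figure of 1/3 for HH, which is disputed in the reply below):

```python
# Two-coin incubator problem, with the numbers as given in this comment.
priors = {"HH": 0.25, "HT": 0.25, "TH": 0.25, "TT": 0.25}
n_agents = {"HH": 3, "HT": 1, "TH": 1, "TT": 1}
payoff = {"A": {"HH": 12.0, "other": 0.0},
          "B": {"HH": 6.0,  "other": 2.40}}

# SIA: weight each world by its number of observers, then renormalise.
z = sum(priors[w] * n_agents[w] for w in priors)
p_hh_sia = priors["HH"] * n_agents["HH"] / z        # 1/2

# SIA + divided responsibility: divide each world's payoff by the number of
# agents sharing the linked decision there.
for c in ("A", "B"):
    eu = (p_hh_sia * payoff[c]["HH"] / n_agents["HH"]
          + (1 - p_hh_sia) * payoff[c]["other"] / 1)
    print("SIA+divided", c, round(eu, 2))           # A: 2.0, B: 2.2

# SSA + total responsibility, with p(HH) = 1/3 as used above.
p_hh_ssa = 1 / 3
for c in ("A", "B"):
    eu = p_hh_ssa * payoff[c]["HH"] + (1 - p_hh_ssa) * payoff[c]["other"]
    print("SSA+total  ", c, round(eu, 2))           # A: 4.0, B: 3.6
```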
The SSA probability of HH is 1⁄4, not 1⁄3.
Proof: before opening their eyes, the SSA agents divide probability as: 1⁄12 HH1 (HH and they are in room 1), 1⁄12 HH2, 1⁄12 HH3, 1⁄4 HT, 1⁄4 TH, 1⁄4 TT.
Upon seeing a sign saying “this is not room X”, they remove one possible agent from the HH world, and one possible world from the remaining three. So this gives odds of HH:¬HH of (1/12+1/12):(1/4+1/4) = 1/6:1/2, or 1:3, which is a probability of 1⁄4.
This means that SSA+total responsibility says EU(A) is $3.00 and EU(B) is $3.30, exactly the same ratio as in the first setup, with B as the best choice.
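A sketch of that calculation by direct enumeration, conditioning on the sign “This is not Room 2” (by symmetry, any of the three signs gives the same answer):

```python
# SSA posterior for the two-coin incubator after seeing one sign,
# enumerated explicitly; conditioning sign chosen arbitrarily (symmetric).
rooms_in_world = {"HH": [1, 2, 3], "HT": [1], "TH": [2], "TT": [3]}
prior_world = {w: 0.25 for w in rooms_in_world}
seen = "This is not Room 2"

unnorm = {}
for world, rooms in rooms_in_world.items():
    for room in rooms:
        p_here = prior_world[world] / len(rooms)     # SSA: uniform within the world
        for other in (r for r in (1, 2, 3) if r != room):
            if f"This is not Room {other}" == seen:  # each sign appears with prob 1/2
                unnorm[world] = unnorm.get(world, 0.0) + p_here * 0.5

z = sum(unnorm.values())
p_hh = unnorm["HH"] / z
print(p_hh)                                          # 0.25
print(p_hh * 12)                                     # EU(A) under total responsibility: 3.0
print(p_hh * 6 + (1 - p_hh) * 2.40)                  # EU(B): 3.3
```

The enumeration gives P(HH) = 1/4, and with total responsibility EU(A) = $3.00 and EU(B) = $3.30, matching the figures above.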
That’s not true. The SSA agents are only told about the conditions of the experiment after they’re created and have already opened their eyes.
Consequently, isn’t it equally valid for me to begin the SSA probability calculation with those two agents already excluded from my reference class?
Doesn’t this mean that SSA probabilities are not uniquely defined given the same information, because they depend upon the order in which that information is incorporated?
Doesn’t this mean that SSA probabilities are not uniquely defined given the same information, because they depend upon the order in which that information is incorporated?
Yep. The old reference class problem. Which is why, back when I thought anthropic probabilities were meaningful, I was an SIAer.
Anyway, if your reference class consists of people who have seen “this is not room X”, then “divided responsibility” is no longer 1⁄3, and you probably have to go whole UDT.
But SIA also has some issues with order of information, though it’s connected with decisions ( http://lesswrong.com/lw/4fl/dead_men_tell_tales_falling_out_of_love_with_sia/ ).
Can you illustrate how the order of information matters there? As far as I can tell it doesn’t, and hence it’s just an issue with failing to consider counterfactual utility, which SIA ignores by default. It’s definitely a relevant criticism of using anthropic probabilities in your decisions, because failing to consider counterfactual utility results in dynamic inconsistency, but I don’t think it’s as strong as the associated criticism of SSA.
Anyway, if your reference class consists of people who have seen “this is not room X”, then “divided responsibility” is no longer 1⁄3, and you probably have to go whole UDT.
If divided responsibility is not 1⁄3, what do those words even mean? How can you claim that only two agents are responsible for the decision when it’s quite clear that the decision is a linked decision shared by three agents?
If you’re taking “divided responsibility” to mean “divide by the number of agents used as an input to the SIA-probability of the relevant world”, then your argument that SSA+total = SIA+divided boils down to this:
“If, in making decisions, you (an SIA agent) arbitrarily choose to divide your utility for a world by the number of subjectively indistinguishable agents in that world in the given state of information, then you end up with the same decisions as an SSA agent!”
That argument is, of course, trivially true because the number of agents you’re dividing by will be the ratio between the SIA odds and the SSA odds of that world. If you allow me to choose arbitrary constants to scale the utility of each possible world, then of course your decisions will not be fully specified by the probabilities, no matter what decision theory you happen to use. Besides, you haven’t even given me any reason why it makes any sense at all to measure my decisions in terms of “responsibility” rather than simply using my utility function in the first place.
On the other hand, if, for example, you could justify why it would make sense to include a notion of “divided responsibility” in my decision theory, then that argument would tell me that SSA+total responsibility must clearly be conceptually the wrong way to do things because it uses total responsibility instead.
All in all, I do think anthropic probabilities are suspect for use in a decision theory, because:
1. They result in reflective inconsistency by failing to consider counterfactuals.
2. It doesn’t make sense to use them for decisions when the probabilities could depend upon the decisions (as in the Absent-Minded Driver).
That said, even if you can’t use those probabilities in your decision theory there is still a remaining question of “to what degree should I anticipate X, given my state of information”. I don’t think your argument on “divided responsibility” holds up, but even if it did the question on subjective anticipation remains unanswered.
“If, in making decisions, you (an SIA agent) arbitrarily choose to divide your utility for a world by the number of subjectively indistinguishable agents in that world in the given state of information, then you end up with the same decisions as an SSA agent!”
Yes, that’s essentially it. However, the idea of divided responsibility has been proposed before (though not in those terms) - it’s not just a hack I made up. Basic idea is, if ten people need to vote unanimously “yes” for a policy that benefits them all, do they each consider that their vote made the difference between the policy and no policy, or that it contributed a tenth of that difference? Divided responsibility actually makes more intuitive sense in many ways, because we could replace the unanimity requirement with “you cause 1⁄10 of the policy to happen” and it’s hard to see what the difference is (assuming that everyone votes identically).
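A toy rendering of that voting intuition, with numbers assumed purely for illustration:

```python
# Ten voters must all vote "yes" for a policy; everyone votes identically.
# "Total responsibility" credits each voter with the whole difference the
# policy makes; "divided responsibility" credits each with a tenth of it.
n_voters = 10
policy_value = 10.0   # assumed value of the policy passing, to a given voter

eu_total   = {"yes": policy_value,            "no": 0.0}
eu_divided = {"yes": policy_value / n_voters, "no": 0.0}

print(max(eu_total, key=eu_total.get), max(eu_divided, key=eu_divided.get))  # yes yes
```

Both notions rank “yes” above “no”; dividing responsibility only rescales the expected utilities, which is why the choice of convention does not by itself change decisions when everyone votes identically.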
But all these approaches (SIA and SSA and whatever concept of responsibility) fall apart when you consider that UDT allows you to reason about agents that will make the same decision as you, even if they’re not subjectively indistinguishable from you. Anthropic probability can’t deal with these—worse, it can’t even consider counterfactual universes where “you” don’t exist, and doesn’t distinguish well between identical copies of you that have access to distinct, non-decision relevant information.
the question on subjective anticipation remains unanswered.
Ah, subjective anticipation… That’s an interesting question. I often wonder whether it’s meaningful. If we create 10 identical copies of me and expose 9 of them to one stimulus and 1 to another, what is my subjective anticipation of seeing one stimulus over the other? 10% is one obvious answer, but I might take a view of personal identity that fails to distinguish between identical copies of me, in which case 50% is correct. What if identical copies will be recombined later? Eliezer had a thought experiment where agents were two-dimensional, and could get glued or separated from each other, and wondered whether this made any difference. I do too. And I’m also very confused about quantum measure, for similar reasons.
OK, the “you cause 1⁄10 of the policy to happen” argument is intuitively reasonable, but under that kind of argument divided responsibility has nothing to do with how many agents are subjectively indistinguishable and instead has to do with the agents who actually participate in the linked decision.
On those grounds, “divided responsibility” would give the right answer in Psy-Kosh’s non-anthropic problem. However, this also means your argument that SIA+divided = SSA+total clearly fails, because of the example I just gave before, and because SSA+total gives the wrong answer in Psy-Kosh’s non-anthropic problem but SIA+divided does not.
Ah, subjective anticipation… That’s an interesting question. I often wonder whether it’s meaningful.
As do I. But, as Manfred has said, I don’t think that being confused about it is sufficient reason to believe it’s meaningless.
The divergence between reference class (of identical people) and reference class (of agents with the same decision) is why I advocate for ADT (which is essentially UDT in an anthropic setting).