NihilCredo comments on Another attempt to explain UDT

NihilCredo 14 Nov 2010 17:18 UTC
2 points

Note that updating on the knowledge that you are in tails-universe (because Omega showed up) doesn’t affect anything, because the theory is “updateless”.

Can you expand a little on this?
- cousin_it 14 Nov 2010 17:20 UTC
  4 points
  Parent
  Under UDT you don’t even notice the fact that you “are” in tails-universe. You only care that there are two universes, with weights that have been “unchanging” since the beginning of time, and that your decision has certain logical implications in both of them. Then you inspect the sum of utility*weight and see that it’s optimal to pay up.
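
A minimal sketch of the "sum of utility*weight" calculation described above. The thread gives no numbers, so the stakes (pay 100 in tails-universe, receive 10000 in heads-universe, fair coin) and every name below are illustrative assumptions, not a definitive formalization of UDT.

```python
# Illustrative stakes (assumptions, not from the thread).
WEIGHTS = {"heads": 0.5, "tails": 0.5}  # prior weights, never updated
PAYMENT = 100     # what Omega asks for in tails-universe
REWARD = 10_000   # what Omega gives in heads-universe if you would have paid


def utility(universe: str, pays: bool) -> int:
    """Utility in one universe as a logical consequence of the policy `pays`."""
    if universe == "heads":
        # Omega's prediction of your tails-universe decision sets the reward.
        return REWARD if pays else 0
    return -PAYMENT if pays else 0


def policy_value(pays: bool) -> float:
    """Sum of utility*weight over both universes, using the prior weights."""
    return sum(w * utility(u, pays) for u, w in WEIGHTS.items())


def udt_choice() -> bool:
    """Pick the policy with the highest prior-weighted sum."""
    return max([True, False], key=policy_value)


def updated_choice() -> bool:
    """For contrast: an agent that conditions on being in tails-universe."""
    return max([True, False], key=lambda pays: utility("tails", pays))
```

Under these assumed stakes `udt_choice()` selects paying, while `updated_choice()`, which looks only at the tails term, refuses.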
  - NihilCredo 14 Nov 2010 17:25 UTC
    2 points
    Parent
    Wait, you said:
    
    Until you learn which it is, you think it’s both. You’re all your copies at once.
    
    But in the CM example, you did learn which it is. I am confused.
    - cousin_it 14 Nov 2010 17:35 UTC
      2 points
      Parent
      The CM example contains two “logical consequences” of your current state—two places that logically depend on your current decision, and so are “glued together” decision-theoretically—but the other “consequence” is not the you in heads-universe, which is occupying a different information state. It’s whatever determines Omega’s decision whether to give you money in heads-universe. It may be a simulation of you in tails-universe, or any other computation that provably returns the same answer; UDT doesn’t care.
  - Vaniver 14 Nov 2010 17:25 UTC
    2 points
    Parent
    This seems like it’s tailored to solve reputation problems but causes problems in others (i.e. almost everything). Propagating ignorance forward seems unwise, even if we can set up hypotheticals where it is wise. It looks like the sunk costs fallacy becomes a gaping hole, and if you’re in a casino, you want to notice whether you’re in the heads universe or tails universe.
    - JGWeissman 14 Nov 2010 18:00 UTC
      4 points
      Parent
      
      but causes problems in others (i.e. almost everything)
      
      If you are going to make this sort of claim, which the people you are trying to convince clearly disagree with, you should automatically include at least one example.
      
      Propagating ignorance forward seems unwise
      
      UDT does not propagate ignorance. Instead of using evidence to build knowledge of a single universe, it uses that evidence to identify what effects a decision has, possibly in multiple universes.
      - Vaniver 14 Nov 2010 18:24 UTC
        6 points
        Parent
        
        If you are going to make this sort of claim, which the people you are trying to convince clearly disagree with, you should automatically include at least one example.
        
        Ah, I thought the mention of the sunk costs fallacy or the casino were sufficient as examples.
        
        If I’m at a casino in front of a blackjack table, I first make the decision whether or not to sit down, then if I do how much to bet, then I see my cards, then I choose my response. I don’t see how UDT adds value when it comes to making any of those decisions, and it seems detrimental when making the last one (I don’t need to be thinking about what I drew in other universes).
        
        For the problems where it does add value- dealing with paradoxes where you need to not betray people because that’s higher payoff than betraying them- it seems like an overly complex solution to a simple problem (care about your reputation). Essentially, it sounds to me a lot like “Odin made physics”- it sounds like a rationalization that adds complexity without adding value.
        
        UDT does not propagate ignorance. Instead of using evidence to build knowledge of a single universe, it uses that evidence to identify what effects a decision has, possibly in multiple universes.
        
        What’s the difference between this and “thinking ahead”? The only difference I see is it also suggests that you think behind, which puts you at risk for the sunk costs fallacy. In a few edge cases, that’s beneficial- the Omega paradoxes are designed to reward sunk cost thinking. But in real life, that sort of thinking is fallacious. If someone offers to sell you a lottery ticket, and you know that ticket is not a winner, you should not buy it in the hope that they would have offered you the same choice if the ticket were a winner.
        JGWeissman 14 Nov 2010 18:54 UTC
        2 points
        Parent
        An example in this case would be actually describing a situation where an agent has to make a decision based on specified available information, with an analysis of what decisions UDT and whatever decision theory you would like to compare it with would make, and what happens to agents that make those decisions.
        
        Essentially, it sounds to me a lot like “Odin made physics”- it sounds like a rationalization that adds complexity without adding value.
        
        It is more like: relativity accurately describes things that go fast, and agrees with Newtonian physics about things that go slow like we are used to.
        
        sunk costs fallacy
        
        The sunk cost fallacy is caring more about making a previous investment pay off than getting the best payoff on your current decision. Where is the previous investment in counterfactual mugging?
        Vaniver 14 Nov 2010 21:56 UTC
        1 point
        Parent
        I don’t have a proper response for you, but this came from thinking about your comments and you may be interested in it.
        
        At the moment, I can’t wrap my head around what it actually means to do math with UDT. If it’s truly updateless, then it’s worthless because a decision theory that ignores evidence is terrible. If it updates in a bizarre fashion, I’m not sure how that’s different from updating normally. It seems like UDT is designed specifically to do well on these sorts of problems, but I think that’s a horrible criterion (as explained in the linked post), and I don’t see it behaving differently from simple second-order game theory. It’s different from first-order game theory, but that’s not its competitor.
        cousin_it 14 Nov 2010 22:29 UTC
        0 points
        Parent
        
        and it seems detrimental when making the last one (I don’t need to be thinking about what I drew in other universes).
        
        UDT doesn’t ask you to think about what you drew in the other universes, because presumably the decisions you’d have made with different cards aren’t a logical consequence of the decision you make with your current cards. So you still end up maximizing the sum of utility*weight over all universes using the original non-updated weights, but the terms corresponding to the other universes happen to be constant, so you only look at the term corresponding to the current universe. UDT doesn’t add value here, but nor does it harm; it actually agrees with CDT in most non-weird situations, like your casino example. UDT is a generalization of CDT to the extreme cases where causal intuition fails—it doesn’t throw away the good parts.
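
The "other terms happen to be constant" step can be sketched as a toy model. It assumes, as the comment does, that the action taken with one hand has no logical effect on what happens with other hands. The hands, weights and payoffs are made-up illustrative numbers, and all names are hypothetical.

```python
# Hypothetical toy model: each "universe" is a hand you might have been dealt.
WEIGHTS = {"hard 12": 0.3, "hard 16": 0.5, "hard 20": 0.2}  # prior weights
ACTIONS = ("hit", "stand")

# Made-up expected payoff of each action, given the hand actually held.
PAYOFF = {
    ("hard 12", "hit"): -0.2, ("hard 12", "stand"): -0.3,
    ("hard 16", "hit"): -0.5, ("hard 16", "stand"): -0.4,
    ("hard 20", "hit"): -0.9, ("hard 20", "stand"): 0.6,
}


def total_value(policy: dict) -> float:
    """Sum of utility*weight over all universes, with non-updated weights."""
    return sum(w * PAYOFF[(hand, policy[hand])] for hand, w in WEIGHTS.items())


def udt_action(current_hand: str, other_actions: dict) -> str:
    """Choose the current hand's action by maximizing the full sum.

    `other_actions` fixes what is done with every other hand; those terms
    do not depend on the current decision, so they are constants here.
    """
    def value(action: str) -> float:
        return total_value({**other_actions, current_hand: action})
    return max(ACTIONS, key=value)


def local_action(current_hand: str) -> str:
    """Choose by looking only at the term for the current universe."""
    return max(ACTIONS, key=lambda a: PAYOFF[(current_hand, a)])
```

In this model `udt_action` and `local_action` pick the same action whatever is assumed about the other hands, because the remaining terms only add a constant and the current term is scaled by a positive weight.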
        
        Overall, it seems that my attempt at communication in the original post has failed. Oh well.
        Vaniver 14 Nov 2010 22:39 UTC
        −2 points
        Parent
        
        but nor does it harm
        
        Only if it’s costless to check that your decisions in this universe don’t actually impact the other universes. UDT seems useful as a visualization technique in a few problems, but I don’t think that’s sufficient to give it a separate name (intended as speculation, not pronouncement).
        
        Overall, it seems that my attempt at communication in the original post has failed. Oh well.
        
        Well, it was worth a shot. I think the main confusion on my end, which I think I’ve worked through, is that UDT is designed for problems I don’t believe can exist- and so the well is pretty solidly poisoned there.
        cousin_it 14 Nov 2010 22:58 UTC
        2 points
        Parent
        UDT is supposed to be about fundamental math, not efficient algorithms. It’s supposed to define what value we ought to optimize, in a way that hopefully accords with some of our intuitions. Before trying to build approximate computations, we ought to understand the ideal we’re trying to approximate in the first place. Real numbers as infinite binary expansions are pretty impractical for computation too, but it pays to get the definition right.
        
        Whether UDT is useful in reality is another question entirely. I’ve had a draft post for quite a while now titled “Taking UDT Seriously”, featuring such shining examples as: it pays to retaliate against bullies even at the cost of great harm to yourself, because anticipation of such retaliation makes bullies refrain from attacking counterfactual versions of you. Of course the actual mechanism by which bullies pick victims is different and entirely causal—maybe some sort of pheromones indicating willingness to retaliate—but it’s still instructive how an intuition from the platonic math of UDT unexpectedly transfers to the real world. There may be a lesson here.
        Vaniver 14 Nov 2010 23:57 UTC
        0 points
        Parent
        That draft would be interesting to see completed, and it may help me see what UDT brings to the table. I find the idea of helping future me and other people in my world far more compelling than the idea of helping mes that don’t exist in my world- and so if I can come to the conclusion “stand up to bullies at high personal cost because doing so benefits you and others in the medium and long term,” I don’t see a need for nonexistent mes, and if I don’t think it’s worth it on the previously stated grounds, I don’t see the consideration of nonexistent mes changing my mind.
        
        Again, that can be a potent visualization technique, by imagining a host of situations to move away from casuistry towards principles or to increase your weighting of your future circumstances or others’ circumstances. I’m not clear on how a good visualization technique makes for an ideal, though.
      - Vladimir_Nesov 14 Nov 2010 23:44 UTC
        2 points
        Parent
        
        UDT does not propagate ignorance. Instead of using evidence to build knowledge of a single universe, it uses that evidence to identify what effects a decision has, possibly in multiple universes.
        
        Not quite (or maybe you mean the same thing). Observations construct new agents that are able to affect different places in the environment than the original agent. Observations don’t constitute new knowledge, they constitute a change in the decision problem, replacing the agent part with a modified one.
        JGWeissman 15 Nov 2010 0:10 UTC
        0 points
        Parent
        
        (or maybe you mean the same thing)
        
        Yes, or at least I agree with your explanation.