The Prisoner’s dilemma doesn’t even belong on the list. Even if my co-conspirator knows that I won’t defect, he still has been given no reason not to defect himself. Reputation (or “virtue”) comes into play only in the iterated PD. And the reputation you want there is not unilateral cooperation, it is something more like Tit-for-Tat.
As for natural selection, it doesn’t belong for several reasons—the simplest of which is that the ‘maximand’ of NS is offspring, not “life and leisure”. But there is one aspect of NS that does have a Newcomb-like flavor—the theory of “inclusive fitness” (aka Hamilton’s rule, aka kin selection).
Other than those two, and “Absent Minded Driver” which I simply don’t understand, it strikes me as a good list.
ETA: I guess “Akrasia/Addiction” doesn’t belong, either. That is simply ordinary “No pain, No gain” with nothing mysterious about how the pain leads causally to gain. The defining feature of Newcomb-like problems is that there is no obvious causal link between pain and gain.
The Prisoner’s dilemma doesn’t even belong on the list. Even if my co-conspirator knows that I won’t defect, he still has been given no reason not to defect himself. Reputation (or “virtue”) comes into play only in the iterated PD. And the reputation you want there is not unilateral cooperation, it is something more like Tit-for-Tat.
Imagine that you were playing a one-shot PD, and you knew that your partner was an excellent judge of character, and that they had an inviolable commitment to fairness: that they would cooperate if and only if they predicted you’d cooperate. Note that this is now Newcomb’s Problem.
Furthermore, if it could be reliably signaled to others, wouldn’t you find it useful to be such a person yourself? That would get selfish two-boxers to cooperate with you, when they otherwise wouldn’t. In a certain sense, this decision process is the equivalent of Tit-for-Tat in the case where you have only one shot but you have mutual knowledge of each other’s decision algorithm.
(You might want to patch up this decision process so that you could defect against the silly people who cooperate with everyone, in a way that keeps the two-boxers still cooperative. Guess what: you’re now on the road to being a TDT agent.)
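For concreteness, here is a minimal sketch of that situation, under a perfect-mutual-prediction assumption and with the standard made-up PD payoffs (nothing below is from the original comment): a "cooperate iff I predict you cooperate" policy played against an unconditional defector and an unconditional cooperator.

```python
# Minimal sketch (illustrative assumptions, not from the thread) of a one-shot PD
# in which each player can perfectly predict the other's policy.
# Standard payoffs: temptation 5, mutual cooperation 3, mutual defection 1, sucker 0.

PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def defect_bot(opponent):
    return 'D'                      # the selfish two-boxer analogue: always defect

def cooperate_bot(opponent):
    return 'C'                      # cooperates with everyone, hence exploitable

def fair_bot(opponent):
    # Cooperate iff the (perfectly predicted) opponent would cooperate with me.
    # FairBot-vs-FairBot is resolved by taking mutual cooperation as the
    # consistent fixed point -- an assumption of this sketch.
    if opponent == 'fair_bot':
        return 'C'
    return 'C' if BOTS[opponent]('fair_bot') == 'C' else 'D'

BOTS = {'defect_bot': defect_bot, 'cooperate_bot': cooperate_bot, 'fair_bot': fair_bot}

for a in BOTS:
    for b in BOTS:
        moves = (BOTS[a](b), BOTS[b](a))
        print(f'{a:13} vs {b:13}: {moves} -> payoffs {PAYOFF[moves]}')
```

Note that this plain fair_bot still cooperates with the unconditional cooperator; the "patch" described above would be a further refinement on top of this sketch.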
Imagine that you were playing a one-shot PD, and you knew that your partner was an excellent judge of character, and that they had an inviolable commitment to fairness: that they would cooperate if and only if they predicted you’d cooperate. Note that this is now Newcomb’s Problem.
Yes it is. And Newcomb’s problem belongs on the list. But the Prisoner’s Dilemma does not.
I guess “Akrasia/Addiction” doesn’t belong, either. That is simply ordinary “No pain, No gain” with nothing mysterious about how the pain leads causally to gain. The defining feature of Newcomb-like problems is that there is no obvious causal link between pain and gain.
Do you think hazing belongs on the list? Because akrasia is (as I’ve phrased it, anyway) just a special case of hazing, where the correlation between the decision theories instantiated by each generation is even stronger—later-you is the victim of earlier-you’s hazing.
What separates akrasia from standard gain-for-pain situations is the dynamic inconsistency (and, I claim, the possibility that you might never have to suffer the pain in the first place, simply from your decision theory’s subjunctive output). In cases where a gain is better than the pain is bad and you choose the pain, that is not akrasia, because you are consistent across times: you prefer the pain+gain route to the status quo.
It is only when your utility function tells you, each period, that you have the preference ranking:
1) not have addiction
2) have addiction, feed it
3) have addiction, don’t feed it
and choosing 2) excludes 1) (EDIT: previous post had 2 here) from the choice set in a future period, that it is called addiction. The central problem is that to escape from the 2/3 states, you have to choose 3 over 2 despite 2 being ranked higher, a problem not present in standard pain-for-gain situations.
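As a concrete illustration of that structure (the number of periods and the two policies are invented for the example), a chooser who always takes the best-ranked option on the current menu never escapes, while escaping requires taking the lower-ranked 3 over 2 exactly once:

```python
# Illustrative sketch of the choice structure described above (the setup is an
# assumption for the example, not anyone's actual model). Ranking each period:
# 1 > 2 > 3. Feeding the addiction (2) keeps "not addicted" (1) off next
# period's menu; enduring the unfed craving (3) restores it.

def run(policy, periods=5):
    menu = {2, 3}                      # start already addicted
    history = []
    for _ in range(periods):
        choice = policy(menu)
        history.append(choice)
        menu = {2, 3} if choice == 2 else {1, 2, 3}
    return history

def greedy(menu):
    return min(menu)                   # always take the best-ranked available option

def one_shot_of_willpower(menu):
    return 3 if menu == {2, 3} else 1  # choose 3 over 2 once, despite its lower rank

print(run(greedy))                 # [2, 2, 2, 2, 2]  -- never escapes
print(run(one_shot_of_willpower))  # [3, 1, 1, 1, 1]  -- takes the one-period hit, then escapes
```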
Hazing, cheating, and shoplifting all fit together and probably belong on your list. In each case, your accepting the ‘pain’ of acting morally results directly in someone else’s gain. But then through the magical Kantian reference class fallacy, what goes around comes around and you end up gaining (‘start off gaining’?) after all.
Akrasia doesn’t seem to match that pattern at all. Your pain leads to your gain, directly. Your viewpoint that iterated decisions made by a single person about his own welfare are somehow analogous to iterated decisions made by a variety of people bearing on the welfare of other people—well, I’m sorry, but that just does not seem to fly.
So, if I understand you, akrasia doesn’t belong on the list because the (subjunctive) outcome specification is more realistic, while Newcomb’s Problem, where the output is posited to be accurate, does belong on the list?
You probably do not understand me, because I have no idea what is meant by “the (subjunctive) outcome specification is more realistic” nor by “the output is posited to be accurate”.
What I am saying is that akrasia is perfectly well modeled by hyperbolic discounting, and that the fix for akrasia is simply CDT with exponential discounting. And that the other, truly Newcomb-like problems require a belief in this mysterious ‘acausal influence’ if you are going to ‘solve’ them as they are presented—as one-time decision problems.
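To make the contrast concrete, here is a small sketch with made-up reward sizes and discount parameters (none of the numbers come from the thread): hyperbolic discounting reverses its ranking of a smaller-sooner versus a larger-later reward as the nearer one becomes imminent, while exponential discounting ranks them the same way from either vantage point.

```python
# Illustrative numbers only (not from the thread): a smaller-sooner reward vs a
# larger-later reward, evaluated five periods in advance and again at the moment
# the smaller reward becomes available.

def hyperbolic(amount, delay, k=1.0):
    return amount / (1 + k * delay)

def exponential(amount, delay, delta=0.85):
    return amount * delta ** delay

SMALL_SOON = (10, 0)   # 10 utils, available at the moment of choice
LARGE_LATE = (30, 5)   # 30 utils, five periods later

for name, discount in (('hyperbolic', hyperbolic), ('exponential', exponential)):
    for periods_ahead in (5, 0):
        v_small = discount(SMALL_SOON[0], SMALL_SOON[1] + periods_ahead)
        v_large = discount(LARGE_LATE[0], LARGE_LATE[1] + periods_ahead)
        pick = 'larger-later' if v_large > v_small else 'smaller-sooner'
        print(f'{name:11} viewed {periods_ahead} periods ahead: '
              f'small={v_small:5.2f} large={v_large:5.2f} -> prefers {pick}')
# The hyperbolic ranking flips as the choice draws near (the akrasia pattern);
# the exponential ranking does not.
```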
What I am saying is that akrasia is perfectly well modeled by hyperbolic discounting, and that the fix for akrasia is simply CDT with exponential discounting.
http://en.wikipedia.org/wiki/Hyperbolic_discounting#Explanations

...seems to be saying that hyperbolic discounting is the rational result of modelling some kinds of uncertainty about future payoffs. Is it really something that needs to be fixed? Should it not be viewed as a useful heuristic?
Yes, it needs to be fixed, because it is not a rational analysis.
You are assuming, to start, that the probability of something happening is going to increase with time. So the probability of it happening tomorrow is small, but the probability of it happening in two days is larger.
So then a day passes without the thing happening. That it hasn’t happened yet is the only new information. But, following that bizarre analysis, I am supposed to reduce my probability assignments that it will happen tomorrow, simply because what used to be two days out is now tomorrow. That is not rational at all!
Hmm. The article cited purports to derive hyperbolic discounting from a rational analysis. Maybe it is sometimes used inappropriately, but I figure creatures probably don’t use hyperbolic discounting because of a bias, but because it is a more appropriate heuristic than exponential discounting, under common circumstances.
The article cited (pdf) purports to derive hyperbolic discounting from a rational analysis.
But it does not do that. Sozou obviously doesn’t understand what (irrational) ‘time-preference reversal’ means. He writes:
I may appear to be temporally inconsistent if, for example, I prefer the promise of a bottle of wine in three months over the promise of a cake in two months, but I prefer a cake immediately over a promise of a bottle of wine in one month.
That is incorrect. What he should have said is: “I am temporally inconsistent if, for example, I prefer the promise of a bottle of wine in three months over the promise of a cake in two months, but two months from now I prefer a cake immediately over a promise of a bottle of wine in one month.”
A person whose time preferences predictably change in this way can be money pumped. If he started with a promise of a cake in two months, he would pay to exchange it for a promise of wine in three months. But then two months later, he would pay again to exchange the promise of wine in another month for an immediate cake.
Edit: Corrected above sentence.
There is nothing irrational in having the probabilities ‘s’ in his Table 1 at a particular point in time (1, 1/2, 1/3, 1/4). What is irrational and constitutes hyperbolic discounting is to still have the same ‘s’ numbers two months later. If the original estimates were rational, then two months later the current ‘s’ schedule for a Bayesian would begin (1, 3/4, 3/5, …). And the Bayesian would still prefer the promise of wine.
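Here is a small check of that arithmetic. The schedule s(t) = 1/(1+t) reproduces the (1, 1/2, 1/3, 1/4) column; the cake and wine utilities are illustrative assumptions (wine only needs to be worth somewhat more than cake for Sozou’s pattern to appear).

```python
from fractions import Fraction

# s(t): probability that a promise at delay t (months) actually pays off.
# s(t) = 1/(1+t) reproduces the (1, 1/2, 1/3, 1/4) schedule discussed above.
def s(t):
    return Fraction(1, 1 + t)

CAKE, WINE = Fraction(2), Fraction(3)    # illustrative utilities: wine worth 1.5x cake

# At time zero (Sozou's pattern):
print(CAKE * s(0), 'vs', WINE * s(1))    # cake now: 2  vs  wine in 1 month: 3/2  -> cake
print(CAKE * s(2), 'vs', WINE * s(3))    # cake in 2 months: 2/3  vs  wine in 3 months: 3/4  -> wine

# Two months later, a Bayesian conditions on the promise having survived to
# month 2, so the current schedule becomes s(2+t)/s(2) = (1, 3/4, 3/5, ...):
def s_updated(t):
    return s(2 + t) / s(2)

print([s_updated(t) for t in range(3)])                 # [1, 3/4, 3/5]
print(CAKE * s_updated(0), 'vs', WINE * s_updated(1))   # cake now: 2  vs  wine in 1 month: 9/4  -> still wine
```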
The article cited (pdf) purports to derive hyperbolic discounting from a rational analysis.
But it does not do that. Sozou obviously doesn’t understand what (irrational) ‘time-preference reversal’ means. He writes:
I may appear to be temporally inconsistent if, for example, I prefer the promise of a bottle of wine in three months over the promise of a cake in two months, but I prefer a cake immediately over a promise of a bottle of wine in one month.
That is incorrect. What he should have said is: “I am temporally inconsistent if, for example, I prefer the promise of a bottle of wine in three months over the promise of a cake in two months, but two months from now I prefer a cake immediately over a promise of a bottle of wine in one month.”
Uh, no. Sozou is just assuming that all else is equal—i.e. it isn’t your birthday, and you have no special preference for cake or wine on any particular date. Your objection is a quibble—not a real criticism. Perhaps try harder for a sympathetic reading. The author did not use the same items with the same temporal spacing just for fun.
People prefer rewards now partly because they know from experience that rewards in the future are more uncertain. Promises by the experimenter that they really really will get paid are treated with scepticism. Subjects are factoring such uncertainty in—and that results in hyperbolic discounting.
It can be seen from the table that a cake immediately is worth more than a promise of wine after a month, while a promise of wine after three months is worth more than a promise of cake after two months. So my preferences are indeed consistent with maximizing my expected reward.
“the (subjunctive) outcome specification is more realistic” = It is more realistic to say that you will suffer a consequence from hazing your future self than from hazing the next generation.
“the output is posited to be accurate” = In Newcomb’s Problem, Omega’s accuracy is posited by the problem, while Omega’s counterparts in other instances are taken to have whatever accuracy they do in real life.
What I am saying is that akrasia is perfectly well modeled by hyperbolic discounting, and that the fix for akrasia is simply CDT with exponential discounting.
That would be wrong though—the same symmetry can persist through time with exponential discounting. Exponential discounting is equivalent to a period-invariant discount factor. Yet you can still find yourself wishing your previous (symmetric) self had done what your current self does not wish to do.
And that the other, truly Newcomb-like problems require a belief in this mysterious ‘acausal influence’ if you are going to ‘solve’ them as they are presented—as one-time decision problems.
I thought we had this discussion on the Parfitian filter article. You can have Newcomb’s problem without acausal influences: just take yourself to be the Omega, with a computer program playing against you. There’s no acausal information flow, yet the winning programs act isomorphically to those that “believe in” an acausal influence.
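A minimal sketch of that setup (the dollar amounts are the usual Newcomb figures; the rest is an illustrative assumption): the “prediction” is just ordinary code running the deterministic player program before filling the boxes, so every step is causal, yet the one-boxing program walks away richer.

```python
# Minimal sketch (illustrative) of Newcomb's problem with an ordinary program in
# the Omega role. Prediction is done by simply running the deterministic,
# input-free player program ahead of time -- one way a programmer-Omega could
# predict, with no acausal influence anywhere.

def one_boxer():
    return 'one-box'

def two_boxer():
    return 'two-box'

def play_newcomb(player):
    prediction = player()                         # Omega's "prediction"
    box_b = 1_000_000 if prediction == 'one-box' else 0
    choice = player()                             # the actual play
    return box_b if choice == 'one-box' else box_b + 1_000

for program in (one_boxer, two_boxer):
    print(program.__name__, play_newcomb(program))
# one_boxer 1000000
# two_boxer 1000
```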