OK let me be a little more careful. The expected disutility the AI associates to a threat is
EU(threat) = P(threat will be carried out) * U(threat will be carried out) + P(threat will not be carried out) * U(threat will not be carried out)
I think that the existence of other muggers with bigger weapons, or just of other dangers and opportunities generally, is accounted for in the second summand.
Now does the formulation look OK to you?
That formulation seems to fail to distinguish (ransom paid)&(threat not carried out) from (ransom not paid)&(threat not carried out).
There are two courses of action being considered: pay ransom or don’t pay ransom.
EU(pay ransom) = P(no later real threat) * U(sun safe) + P(later real threat) * U(sun explodes)
EU(don’t pay ransom) = P(threat fake) * ( P(no later real threat) + P(later real threat) * P(later real threat correctly identified as real | later real threat) ) * U(sun safe) + ( P(threat real) + P(threat fake) * P(later real threat) * P(later real threat incorrectly identified as fake | later real threat) ) * U(sun explodes)
That’s completely unreadable. I need symbolic abbreviations.
R=EU(pay ransom); r=EU(don’t pay ransom)
S=U(sun safe); s=U(sun explodes)
T=P(threat real); t=P(threat fake)
L=P(later real threat); M=P(no later real threat)
i=P(later real threat correctly identified as real | later real threat)
j=P(later real threat incorrectly identified as fake | later real threat)
Then:
R = M*S + L*s
r = t*(M + L*i)*S + (T + t*L*j)*s
(p.s.: We really need a preview feature.)
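(A quick numerical sanity check of those two expressions in Python, using the abbreviations above; the probabilities and utilities are made-up placeholder values for illustration, and, like the formulas themselves, this ignores the cost of the ransom:)

# Sanity check of R = EU(pay ransom) and r = EU(don't pay ransom).
# All values below are made-up assumptions for illustration.
S = 0.0      # U(sun safe), taking the status quo as the zero point
s = -1e30    # U(sun explodes)
T = 1e-40    # P(threat real)
t = 1 - T    # P(threat fake)
L = 1e-6     # P(later real threat)
M = 1 - L    # P(no later real threat)
i = 0.99     # P(later threat correctly identified as real | later real threat)
j = 1 - i    # P(later threat incorrectly identified as fake | later real threat)

R = M*S + L*s                        # EU(pay ransom)
r = t*(M + L*i)*S + (T + t*L*j)*s    # EU(don't pay ransom)

print("EU(pay ransom):      ", R)
print("EU(don't pay ransom):", r)
print("better:", "pay" if R > r else "don't pay")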
Why so much focus on future threats to the sun? Are you going to argue, by analogy with the prisoner’s dilemma, that the iterated Pascal’s mugging is easier to solve than the one-shot Pascal’s mugging?
I thought that, either by definition or as a simplifying assumption, EU(ransom paid & threat not carried out) = current utility - size of ransom, and that EU(ransom not paid & threat not carried out) = current utility.
My primary thesis is that the iterated Pascal’s mugging is much more likely to approximate any given real-world situation than the one-shot Pascal’s mugging, and that focusing on the latter is likely to lead, via the availability heuristic, to people making bad decisions on important issues.
My primary thesis is that if you have programmed a purportedly god-like and friendly AI that you know will do poorly in the one-shot Pascal’s mugging, then you should not turn it on, even if you know it will do well in other variations on Pascal’s mugging.
My secondary thesis comes from Polya: “If there’s a problem that you can’t solve, then there’s a simpler problem that you can solve. Find it!” Solutions to, failed solutions to, and ideas about the one-shot Pascal’s mugging will illuminate features of the iterated Pascal’s mugging, and of many real-world situations besides.
(“One-shot”, “iterated”... if these are even good names!)
I’m not persuaded that paying the ransom is doing poorly on the one-shot. And if it predictably does the wrong thing, in what sense is it Friendly?
Forget it. I’m just weirded out that you would respond to “here’s a tentative formalization of a simple version of Pascal’s mugging” with “even thinking about it is dangerous.” I don’t agree and I don’t understand the mindset.
I don’t mean to say that thinking about the one-shot is dangerous, only that grossly overemphasizing it relative to the iterated might be.
I hear about the one-shot all the time, and the iterated not at all, and I think the iterated is more likely to come up than the one-shot, and I think the iterated is easier to solve than the one-shot, so in all I think it’s completely reasonable for me to want to emphasize the iterated.
Granted! And tell me more.
The iterated has an intuitively easy-to-accept solution: don’t just accept blackmail from anyone who offers it, but rather investigate first to see whether they constitute a credible threat.
The one-shot Pascal’s Mugging, like most one-shot games discussed in game theory, has a harder-to-stomach dominant strategy: pay the ransom, because the mere claim, considered as Bayesian evidence, promotes the threat to a probability much greater than the reciprocal of its utility-magnitude.
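(A back-of-the-envelope sketch of that last claim in Python; the harm magnitude, ransom size, and posterior probability below are made-up assumptions purely for illustration, not numbers from the discussion:)

# Back-of-the-envelope check of the one-shot dominance claim.
# All numbers are illustrative assumptions.
harm = 1e30    # magnitude of U(threat carried out), e.g. the sun exploding
ransom = 5.0   # utility cost of paying the ransom

def eu_pay():
    return -ransom            # pay: lose the ransom, threat averted

def eu_refuse(p_real):
    return -p_real * harm     # refuse: risk the threatened harm

# Refusing is worse whenever P(threat real) exceeds ransom/harm,
# i.e. roughly the reciprocal of the harm's utility-magnitude:
print("break-even P(threat real):", ransom / harm)   # 5e-30

# If the mugger's mere claim, taken as Bayesian evidence, pushes the
# posterior anywhere above that (say, to 1e-20), paying dominates:
p_real = 1e-20
print(eu_pay(), eu_refuse(p_real))                    # -5.0 vs -1e+10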