Not particularly. You can estimate the likely loss and likely gain from that utility change, as with anything. As long as you’re reasonably certain that the bottom parts of the utility function are more likely to be accessed through extortion than through other means, this is a rational thing to do. Absent a proper theory of extortion and attendant decision theory, of course.
THIS is the key (along with some explanation of why you think extortion is different than some other interaction with different-valued entities). It’s massively counter to my intuitions—I think bottom parts of utility functions are extremely common in natural circumstances without blaming a cause that can be reasoned or traded with.
Think of a total-utilitarianism-style approach, where you can take any small disutility and multiply it again and again.
OK. Why would this imply extortion rather than simple poverty?
Because you’re the one creating the multiple instances of disutility, using a fraction of the resources of the cosmos.
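For concreteness, here is a rough numeric sketch of that kind of scenario; every quantity below is invented for illustration and is not taken from the discussion.

```python
# Rough sketch of the "multiply a small disutility again and again" scenario.
# Every number here is made up purely for illustration.

extorter_fraction = 0.01          # share of cosmic resources the extorter controls
cost_per_instance = 1e-30         # resources needed to run one instance of the small disutility
disutility_per_instance = -0.001  # tiny harm per instance, in the agent's utility units

# A total-utilitarian aggregate simply sums over instances, so the threatened
# disutility scales linearly with the extorter's resources:
n_instances = extorter_fraction / cost_per_instance
threatened_disutility = n_instances * disutility_per_instance

natural_catastrophe = -1e9        # an invented "naturally occurring" disaster, for scale

print(f"instances the extorter can run:  {n_instances:.3g}")
print(f"threatened aggregate disutility: {threatened_disutility:.3g}")
print(f"natural catastrophe, for scale:  {natural_catastrophe:.3g}")
# The threat is finite (it is bounded by the resource fraction), but under
# total utilitarianism it can still sit far below any natural outcome.
```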
Maybe more description of the scenario would help. Presumably there’s no infinity here—there’s a bound to the disutility (for you; presumably it’s utility for me) I can get with my fraction of the cosmos. What do you think the proper reaction of an FAI (or a human, for that matter) is, and why is it different for repeated small events than for one large event?
You can try. Your estimate is likely to be very diffuse and uncertain—the issue is that you are trying to get a handle on the distribution tail, and that is quite hard to do (see Taleb’s black swans, etc.).
Not at all—you’re forgetting about the magnitude of consequences.
Let’s say you have a blackmailer who wants a pony, and she has the capability to meddle with your AI’s sensors. Lo and behold, she walks up to the AI and says, “I want a pony! Look, there is a large incoming asteroid on a collision course with Earth. Gimme a pony and I’ll tell you if it’s real.”

Ah, says you, the designer. I estimate that the blackmailer is bluffing in 99% of cases. That “bottom part of the utility function” (aka The Sweet Meteor Of Death) is much more likely to be accessed through extortion, a hundred times more likely, in fact.

Therefore I will instruct the AI to disregard any data that tells it there is an incoming asteroid on a collision course. And voila—the blackmailer doesn’t get a pony.
What could possibly go wrong?
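For concreteness, the expected-value arithmetic behind the “magnitude of consequences” point can be sketched as follows; the probabilities and utilities are invented for this illustration.

```python
# Expected-value arithmetic for the pony/asteroid story.
# Probabilities and utilities are invented for illustration only.

p_bluff = 0.99              # designer's estimate that the blackmailer is bluffing
p_real = 1.0 - p_bluff      # ...so the asteroid is taken to be real 1% of the time
u_pony = -1.0               # mild cost of handing over a pony
u_extinction = -1e9         # catastrophic cost of ignoring a real asteroid

# Policy A: pay up; the blackmailer gets her pony and any real asteroid gets reported.
ev_pay = u_pony

# Policy B: hard-code "disregard asteroid reports" so the blackmail can never work.
ev_ignore = p_real * u_extinction

print(f"EV(pay the pony):        {ev_pay}")
print(f"EV(ignore all warnings): {ev_ignore}")
# Even at 99% bluff odds, the 1% tail dominates: the magnitude of the bad
# outcome, not just its probability, decides the comparison.
```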
The sweet meteor of death is well above the z point. Complete human extinction is above the z point.
This hack is not intended to deal with normal extortion; it’s intended to cut off really bad outcomes.
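One way to read “cut off really bad outcomes” is as a floor at the z point, below which the utility function is flattened; this is only a guess at the construction being discussed, and the threshold and outcome values in the sketch are invented.

```python
# A guess at the "z point" construction: flatten the utility function below z
# so that threats living in the extreme tail lose their leverage.
# The threshold and all outcome utilities are invented for illustration.

Z_POINT = -1e15

def clamped_utility(u: float, z: float = Z_POINT) -> float:
    """Treat any outcome below z as no worse than z itself."""
    return max(u, z)

outcomes = {
    "lose a pony": -1.0,
    "sweet meteor of death (extinction)": -1e9,                   # well above z: untouched
    "extorter fills the cosmos with small disutilities": -1e25,   # below z: cut off
}

for name, u in outcomes.items():
    print(f"{name:50s} raw={u:10.3g} clamped={clamped_utility(u):10.3g}")
# Extinction keeps its full negative weight, so asteroid warnings are not ignored;
# only the far-below-z outcomes that extortion would rely on are flattened.
```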
What would these be? Can you give a couple of examples?
Are you basically trying to escape Pascal’s Mugging?
The extortion version of that, yes.