Too late, I already precommitted not to care. In fact, I precommitted to use one more level of precommitment than you do.
I suggest that framing the refusal as requiring levels of recursive precommitment gives too much credit to the blackmailer and somewhat misrepresents how your decision algorithm (hopefully) works. One single level of precommitment (or TDT policy) against complying with blackmail is all that is involved. The description of ‘multiple levels of precommitment’ made by the blackmailer fits squarely into the category ‘blackmail’. It’s just blackmail that includes some rather irrelevant bluster.
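A toy sketch of that point in Python (every name here is made up purely for illustration): one blanket policy whose output never depends on how many layers of bluster the blackmailer announces.

```python
# One blanket anti-blackmail policy (toy code; every field name is invented).
# The announced escalation level is carried around but never consulted.

def respond(demand):
    if demand["category"] == "blackmail":
        return "refuse"                      # same answer at every level of bluster
    return "evaluate as an ordinary trade"

for level in ("tentative", "serious", "FOR REALS", "precommitted, irrevocably"):
    assert respond({"category": "blackmail", "escalation": level}) == "refuse"
```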
There’s no need to precommit to each of:
I don’t care about tentative blackmail.
I don’t care about serious blackmail.
I don’t care about blackmail when they say “I mean it FOR REALS! I’m gonna do it.”
I don’t care about blackmail when they say “I’m gonna do it even if you don’t care. Look how large my penis is and be cowed in terror”.
The blackmailer:
I don’t care about precommitments that are just for show.
I don’t care about serious precommitments.
I don’t care about precommitments when they say “I precommitted, so go ahead, it won’t get you anything.”
I don’t care about precommitments when they say “I precommitted even though it won’t do me any good. It would be irrational to save myself. I’m precommitting because it’s rational, not because it’s the option that lets me win.”
The description of ‘precommitting not to comply with blackmail, including blackmailers that ignore my attempt to manipulate them’ made by the precommitter fits squarely into the category ‘precommitting to ignore blackmail’. It’s just a precommitment that includes some rather irrelevant bluster.
You seem not to have read (or understood) the grandparent. The list you are attempting to satirize was presented as an example of what not to do. The actual point of the parent is that bothering to provide such a list is almost as much of a confusion as the very kind of escalation you are attempting.
It’s just a precommitment that includes some rather irrelevant bluster.
I entirely agree. The remaining bluster is dead weight that serves to give the blackmail advocate more credit than is due. The notion of “precommitment” is also unnecessary. It has only remained in this conversation for the purpose of bridging an inferential gap with people still burdened with decades-old decision theory.
You seem not to have read (or understood) the grandparent.
I did. It seems you misunderstood my comment—I’ll edit it if I can see a way to easily improve the clarity.
My point was that the same logic could be applied, by someone who accepts the hypothetical blackmailer’s argument, to your description of “one single level of precommitment (or TDT policy) against complying with blackmail … the description of ‘multiple levels of precommitment’ made by the blackmailer fits squarely into the category ‘blackmail’”.
As such, your comment is not exactly strong evidence to someone who doesn’t already agree with you.
As such, your comment is not exactly strong evidence to someone who doesn’t already agree with you.
Muga, please look at the context again. I was arguing against (a small detail mentioned by) Eliezer. Eliezer does mostly agree with me on such matters. Once you reread it bearing that in mind, you will hopefully understand why, in assuming that you had merely misunderstood the comment in context, I was being charitable.
My point was that the same logic could be applied, by someone who accepts the hypothetical blackmailer’s argument, to your description of “one single level of precommitment (or TDT policy) against complying with blackmail … the description of ‘multiple levels of precommitment’ made by the blackmailer fits squarely into the category ‘blackmail’”.
I have no particular disagreement, that point is very similar to what I was attempting to convey. Again, I was not attempting to persuade optimistic blackmailer advocates of anything. I was speaking to someone resistant to blackmail about an implementation detail of the blackmail resistance.
The ‘evidence’ I need to provide to blackmailers is Argumentum ad thermitium. It’s more than sufficient.
Well, I’m glad to hear you mostly agree with me.
The ‘evidence’ I need to provide to blackmailers is Argumentum ad thermitium. It’s more than sufficient.
Indeed. Sorry, since the conversation you posted in the middle of was one between those resistant to blackmail, like yourself, and those as yet unconvinced or unclear on the logic involved … I thought you were contributing to the conversation.
After all, thermite seems a little harsh for blackmail victims.
This makes no sense as a reply to anything written on this entire page.
… seriously? Well, OK.
I was jokingly restating my justification; while I agree that “argumentum ad thermitium” (as you put it) is an excellent response to blackmailers, it’s worth having a strategy for dealing with blackmailer reasoning beyond that—for dealing with all the situations in which you will actually encounter such reasoning, those involving humans.
I guess it wasn’t very funny even before I killed it so thoroughly.
Anyway, this subthread has now become entirely devoted to discussing our misreadings of each other. Tapping out.
Then I hope that if we ever do end up with a boxed blackmail-happy UFAI, you’re the gatekeeper. My point is that there’s no reason to consider yourself safe from blackmail (and the consequences of ignoring it) just because you’ve adopted a certain precommitment. Other entities have explicit incentives to deny you that safety.
My point is that there’s no reason to consider yourself safe from blackmail (and the consequences of ignoring it) just because you’ve adopted a certain precommitment. Other entities have explicit incentives to deny you that safety.
In a multiverse with infinite resources there will be other entities that outweigh such incentives. And yes, this may not be symmetric, but you have absolutely no way to figure out how the asymmetry is inclined. So you ignore this (Pascal’s wager).
In more realistic scenarios, where e.g. a bunch of TV evangelists ask you to give them all your money, or otherwise, in 200 years from now, they will hurt you once their organisation creates the Matrix, you obviously do not give them money, since giving them money would make it more likely for them to actually build the Matrix and hurt you. What you do is label them as terrorists and destroy them.
I don’t care, remember? Enjoy being tortured rather than “irrationally” giving in.
EDIT: re-added the steelman tag because the version without it is being downvoted.
Should I calculate in expectation that you will do such a thing, I shall of course burn yet more of my remaining utilons to wreak as much damage upon your goals as I can, even if you precommit not to be influenced by that.
… bloody hell. That was going to be my next move.
Naturally, as blackmailer, I precommitted to increase the resources allotted to torturing should I find that you make such precommitments under simulation, so you presumably calculated that would be counterproductive.
Ask me if I was even bothering to simulate you doing that.
OK, I’ll bite. Are you deliberately ignoring parts of hypothesis-space in order to avoid changing your actions? I had assumed you were intelligent enough for my reaction to be obvious, although you may have precommitted to ignore that fact.
Off the record, your point is that agents can simply opt out of or ignore acausal trades, forcing them to be mutually beneficial, right?
Yup.
Isn’t that … irrational? Shouldn’t a perfect Bayesian always welcome new information? Litany of Tarski; if my action is counterproductive, I desire to believe that it is counterproductive.
Worse still, isn’t the category “blackmail” arbitrary, intended to justify inaction rather than carve reality at its joints? What separates a precommitted!blackmailer from an honest bargainer in a standard acausal prisoner’s dilemma, offering to increase your utility by rescuing thousands of potential torture victims from the deathtrap created by another agent?
Has there been some cultural development since I was last at these boards such that spamming “steelman” tags is considered useful? None of the things I have thus far seen inside the tags have been steel men of any kind or of anything (some have been straw men). The inflationary use of terms is rather grating and would prompt downvotes even independently of the content.
Those are to indicate that the stuff between them is the response I would give were I on the opposing side of this debate, rather than my actual belief. The practice of creating the strongest possible version of the other side’s argument is known as a steelman.
They are not intended to indicate that the argument therein is also steelmanning the other side. You’re quite right, that would be awful. Can you imagine noting every rationality technique you used in the course of writing something?
Just say “You might say that” or something. The tags are confusingly non-standard.
Huh. I thought they were fairly clear; illusion of transparency I suppose. Thanks!
Caving to a precommitted blackmailer produces a result desirable to the agent that made the original commitment to torture; disarming a trap constructed by a third party presumably doesn’t.
OK, this whole conversation is being downvoted (by the same people?)
Fair enough, this is rather dragging on. I’ll try and wrap things up by addressing my own argument there.
What separates a precommitted!blackmailer from an honest bargainer in a standard acausal prisoner’s dilemma, offering to increase your utility by rescuing thousands of potential torture victims from the deathtrap created by another agent?
We want to avoid supporting agents that create problems for us. So nothing, if the honest agent shares a similar utility function to the torturer (and thus rewarding them is an incentive for the torturer to arrange such a situation).
Thus, creating such an honest agent (such as—importantly—by self-modifying in order to “precommit”) is subject to the same incentives as just blackmailing us normally.
I’ll try and wrap things up by addressing my own argument there.
I’ll join you by mostly agreeing and expressing a small difference in the way TDT-like reasoners may see the situation.
What separates a precommitted!blackmailer from an honest bargainer in a standard acausal prisoner’s dilemma, offering to increase your utility by rescuing thousands of potential torture victims from the deathtrap created by another agent?
We want to avoid supporting agents that create problems for us. So nothing, if the honest agent shares a similar utility function to the torturer (and thus rewarding them is an incentive for the torturer to arrange such a situation).
This is a good heuristic. It certainly handles most plausible situations. However, in principle a TDT agent will make a distinction between the blackmailer and the agent offering to rescue the torture victims for a payment. It will even pay an agent who just happens to value torturing folk not to torture folk. This applies even if these honest agents happen to have similar values to the UFAI/torturer.
The line I draw (and it is a tricky concept that is hard to express so I cannot hope to speak for other TDT-like thinkers) is not whether the values of the honest agent are similar to the UFAI’s. It is instead based on how that honest agent came to be.
If the honest torturer just happened to evolve that way (competitive social instincts plus a few mutations for psychopathy, etc) and had not been influenced by a UFAI then I’ll bribe him to not torture people. If an identical honest torturer was created (or modified) by the UFAI for the purpose of influence then it doesn’t get cooperation.
The above may seem arbitrary but the ‘elegant’ generalisation is something along the lines of always, for every decision, tracing a complete causal graph of the decision algorithms being interacted with directly or indirectly. That’s too complicated to calculate all the time and we can usually ignore it and just remember to treat intentionally created agents and self-modifications approximately the same as if the original agent was making their decision.
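A rough illustrative sketch of that heuristic (the dictionary bookkeeping is a crude stand-in for real causal-graph tracing, and every name is hypothetical): trace who created or modified the counterparty for the purpose of influence, and treat that originator as the agent you are really dealing with.

```python
# Illustrative toy code only. An agent created or modified by the extorting
# UFAI in order to influence you is treated as the UFAI itself; an agent that
# just happened to end up with those values is a legitimate trading partner.

def effective_counterparty(agent):
    # Walk back through whoever created/modified this agent for influence.
    while agent.get("created_for_influence_by") is not None:
        agent = agent["created_for_influence_by"]
    return agent

def will_pay_to_stop_torture(agent):
    root = effective_counterparty(agent)
    # Bribe an agent that merely happens to value torture (e.g. evolved that
    # way), but give nothing if the offer traces back to an extorting UFAI.
    return not root.get("is_extorting", False)

ufai = {"is_extorting": True}
evolved_torturer = {"created_for_influence_by": None}
manufactured_torturer = {"created_for_influence_by": ufai}

assert will_pay_to_stop_torture(evolved_torturer) is True
assert will_pay_to_stop_torture(manufactured_torturer) is False
```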
Thus, creating such an honest agent (such as—importantly—by self-modifying in order to “precommit”) is subject to the same incentives as just blackmailing us normally.
Precisely. (I have the same conclusion, just slightly different working out.)
As I understand it, technically, the distinction is whether torturers will realise they can get free utility from your trades and start torturing extra so the honest agents will trade more and receive rewards that also benefit the torturers, right?
Easily-made honest bargainers would just be the most likely of those situations; lots of wandering agents with the same utility function co-operating (acausally?) would be another. So the rule we would both apply is even the same; it just rests on slightly different assumptions about the hypothetical scenario.
No. It produces better outcomes. That’s the point.
Shouldn’t a perfect Bayesian always welcome new information?
The information is welcome. It just doesn’t make it sane to be blackmailed. Wei Dai’s formulation frames it as being ‘updateless’ but there is no requirement to refuse information. The reasoning is something you almost grasped when you used the description:
your point is that agents can simply opt out of or ignore acausal trades
Acausal trades are similar to normal trades. You only accept the good ones.
Litany of Tarski; if my action is counterproductive, I desire to believe that it is counterproductive.
Eliezer doesn’t get blackmailed in such situations. You do. Start your chant.
Worse still, isn’t the category “blackmail” arbitrary, intended to justify inaction rather than carve reality at its joints? What separates a precommitted!blackmailer from an honest bargainer in a standard acausal prisoner’s dilemma, offering to increase your utility by rescuing thousands of potential torture victims from the deathtrap created by another agent?
This has been covered elsewhere in this thread as well as plenty of other times on the forum since you joined. The difference isn’t whether torture or destruction is happening. The distinction that matters is whether the blackmailer is doing something worse than their own Best Alternative To Negotiated Agreement for the purpose of attempting to influence you.
If the UFAI gains benefit from torturing people independently of influencing you but offers to stop in exchange for something then that isn’t blackmail. It is a trade that you consider like any other.
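A toy illustration of that distinction (the field names are invented; this is a sketch of the BATNA test, not a full decision theory): it only counts as blackmail if the proposer would make things worse than their own no-deal baseline specifically in order to influence you.

```python
# Toy sketch with made-up fields. "proposer_batna" is the proposer's payoff from
# its best alternative had it never tried to influence you; "outcome_if_you_refuse"
# is the proposer's payoff from what it says it will do if you refuse.

def classify(proposal):
    worsens_own_batna = proposal["outcome_if_you_refuse"] < proposal["proposer_batna"]
    if worsens_own_batna and proposal["purpose_is_influence"]:
        return "blackmail"   # refuse, regardless of the stakes
    return "trade"           # evaluate like any other offer

# A UFAI that benefits from torturing anyway and offers to stop for a price:
offer = {"outcome_if_you_refuse": 5, "proposer_batna": 5, "purpose_is_influence": True}
# A UFAI that would burn its own resources torturing solely to move you:
threat = {"outcome_if_you_refuse": -3, "proposer_batna": 5, "purpose_is_influence": True}

assert classify(offer) == "trade"
assert classify(threat) == "blackmail"
```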
Acausal trades are similar to normal trades. You only accept the good ones.
[...]
Eliezer doesn’t get blackmailed in such situations.
The difference isn’t whether torture or destruction is happening. The distinction that matters is whether the blackmailer is doing something worse than their own Best Alternative To Negotiated Agreement for the purpose of attempting to influence you.
Wedrifid, please don’t assume the conclusion. I know it’s a rather obvious conclusion, but dammit, we’re going to demonstrate it anyway.
The entire point of this discussion is addressing the idea that blackmailers can, perhaps, modify the Best Alternative To Negotiated Agreement (although it wasn’t phrased like that). This is somewhat relevant when they can, presumably, self-modify, create new agents which will then trade with you, or maybe just act as if they had, using TDT reasoning.
If you’re not interested in answering this criticism … well, fair enough. But I’d appreciate it if you didn’t answer things out of context; it rather confuses things.
If you’re not interested in answering this criticism … well, fair enough. But I’d appreciate it if you didn’t answer things out of context; it rather confuses things.
In the grandparent I directly answered both the immediate context (that was quoted) and the broader context. In particular I focussed on explaining the difference between an offer and a threat. That distinction is rather critical and also something you directly asked about.
It so happens that you don’t want there to be an answer to the rhetorical question you asked. Fortunately (for decision theorists) there is one in this case. There is a joint in reality here. It applies even to situations that don’t add in any confounding “acausal” considerations. Note that this is different to the challenging problem of distributing gains from trade. In those situations ‘negotiation’ and ‘extortion’ really are equivalent.