It seems obvious that the correct answer is simply “I ignore all threats of blackmail, but respond to offers of positive-sum trades” but I am not sure how to derive this answer—it relies on parts of TDT/UDT that haven’t been worked out yet.
For a while we had a note on one of the whiteboards at the house reading “The Singularity Institute does NOT negotiate with counterfactual terrorists”.
This reminds me a bit of my cypherpunk days when the NSA was a big mysterious organization with all kinds of secret technical knowledge about cryptology, and we’d try to guess how far ahead of public cryptology it was from the occasional nuggets of information that leaked out.
I’m slow. What’s the connection?
Much like the NSA is considered ahead of the public because the cypher-tech that leaks out of it is years ahead of publicly available tech, SI/MIRI is ahead of us because the things that leak out of them show that they figured out long ago what we've only just figured out.
Wait, is the NSA's cypher-tech actually legitimately ahead of anyone else's? From what I've seen, they couldn't make their own tech stronger, so they had to sabotage everyone else's—by pressuring IEEE to adopt weaker standards, installing backdoors into Linksys routers and various operating systems, exploiting known system vulnerabilities, etc.
OK, so technically speaking, they are ahead of everyone else; but there's a difference between inventing a better mousetrap and setting everyone else's mousetraps on fire. I sure hope that's not what the people at SI/MIRI are doing.
You linked to DES and SHA, but AFAIK these things were not invented by the NSA at all, but rather adopted by them (after they made sure that the public implementations were sufficiently corrupted, of course). In fact, I would be somewhat surprised if the NSA actually came up with nearly as many novel, ground-breaking crypto ideas as the public research community. It's difficult to come up with many useful new ideas when you are a secretive cabal of paranoid spooks who are not allowed to talk to anybody.
Edited to add: So, what things have been "leaked" out of SI/MIRI, anyway?
I don’t know much about the NSA, but FWIW, I used to harbour similar ideas about US military technology—I didn’t believe that it could be significantly ahead of commercially available / consumer-grade technology, because if the technological advances had already been discovered by somebody, then the intensity of the competition and the magnitude of the profit motive would lead it to quickly spread into general adoption. So I had figured that, in those areas where there is an obvious distinction between military and commercial grade technology, it would generally be due to legislation handicapping the commercial version (like with the artificial speed, altitude, and accuracy limitations on GPS).
During my time at MIT I learned that this is not always the case, for a variety of reasons, and significantly revised my prior for future assessments of the likelihood that, for any X, “the US military already has technology that can do X”, and the likelihood that for any ‘recently discovered’ Y, “the US military already was aware of Y” (where the US military is shorthand that includes private contractors and national labs).
(One reason, but not the only one, is that I learned that the magnitude of the difference between 'what can be done economically' and 'what can be accomplished if cost is no obstacle' is much vaster than I used to think, and that, say, landing the Curiosity rover on Mars is not in the second category.)
So it would no longer be so surprising to me if the NSA does in fact have significant knowledge of cryptography beyond the public domain, although a lot of the reasons that allow hardware technology to remain a military secret probably don't apply so much to cryptography.
I think there are some important differences between the NSA and the (rest of the) military.
Due to Snowden and other leakers, we actually know what NSA’s cutting-edge strategies involve, and most (and probably all) of them are focused on corrupting the public’s crypto, not on inventing better secret crypto.
Building a better algorithm is a lot cheaper than building a better orbital laser satellite (or whatever). The algorithm is just a piece of software. In order to develop and test it, you don’t need physical raw materials, wind tunnels, launch vehicles, or anything else. You just need a computer, and a community of smart people who build upon each other’s ideas. Now, granted, the NSA can afford to build much bigger data centers than anyone else -- but that’s a quantitative advance, not a qualitative one.
Now, granted, I can’t prove that the NSA doesn’t have some sort of secret uber-crypto that no one knows about. However, I also can’t prove that the NSA doesn’t have an alien spacecraft somewhere in Area 52. Until there’s some evidence to the contrary, I’m not prepared to assign a high probability to either proposition.
I do think you're probably right, and I fully agree that the space lasers with their solid-diamond heatsinks are categorically different from a crypto wizard who subsists on oatmeal in the Siberian wilderness on pennies of income. So I am somewhat skeptical of CivilianSvendsen's claim.
But, for the sake of completeness, did Snowden leak the entirety of the NSA’s secrets? Or just the secret-court-surveillance-conspiracy ones that he felt were violating the constitutional rights of Americans? As far as I can tell (though I haven’t followed the story recently), I think Snowden doesn’t see himself as a saboteur or a foreign double-agent; he felt that the NSA was acting contrary to what the will of an (informed) American public would be. I don’t think he would be so interested in disclosing the NSA’s tech secrets, except maybe as leverage to keep himself safe.
That is to say, there could be a sampling bias here. The leaked information about the NSA might always be about their efforts to corrupt the public’s crypto because the leakers strongly felt the public had a right to know that was going on. I don’t know that anyone would feel quite so strongly about the NSA keeping proprietary some obscure theorem of number theory, and put their neck on the line to leak it.
Right, what you are saying makes some intuitive sense, but I can only update my beliefs based on the evidence I do have, not on the evidence I lack.
In addition, as far as I can tell, cryptography relies much more heavily on innovation than on feats of expensive engineering; and innovation is hard to pull off while working by yourself inside of a secret bunker. To be sure, some very successful technologies were developed exactly this way: the Manhattan project, the early space program and especially the Moon landing, etc. However, these were all one-off, heavily focused projects that required an enormous amount of effort.
When I think of the NSA, I don’t think of the Manhattan project; instead, I see a giant quotidian bureaucracy. They do have a ton of money, but they don’t quite have enough of it to hire every single credible crypto researcher in the world—especially since many of them probably wouldn’t work for the NSA at any price unless their families’ lives were on the line. So, the NSA can’t quite pull off the “community in a bottle” trick, which they’d need to stay one step ahead of all those Siberians.
Yes, and I fully agree with you. I am just being pedantic about this point:

I can only update my beliefs based on the evidence I do have, not on the evidence I lack.
I agree with this philosophy, but my argument is that the following is evidence we do not have:

Due to Snowden and other leakers, we actually know what NSA's cutting-edge strategies involve[...]
Since I have little confidence that, if the NSA had advanced tech, Snowden would have disclosed it, the absence of this evidence should be treated as quite weak evidence of absence, and therefore I wouldn't update my belief about the NSA's supposed advanced technical knowledge based on Snowden.
I agree that it has a low probability for the other reasons you say, though. (And also that people who think setting other peoples’ mousetraps on fire is a legitimate tactic might not simultaneously be passionate about designing the perfect mousetrap.)
Sorry for not being clear about the argument I was making.
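The weak-evidence-of-absence point above can be made quantitative with a toy Bayesian calculation. All the probabilities below are illustrative assumptions, not estimates of the real situation:

```python
# Toy Bayesian update: the absence of a leak is only weak evidence of
# absence when a leak would have been unlikely even if the tech existed.
# All numbers here are invented for illustration.

def posterior_no_leak(prior, p_leak_given_tech, p_leak_given_no_tech=0.0):
    """P(advanced tech | no leak observed), via Bayes' rule."""
    p_no_leak_given_tech = 1 - p_leak_given_tech
    p_no_leak_given_no_tech = 1 - p_leak_given_no_tech
    numerator = p_no_leak_given_tech * prior
    denominator = numerator + p_no_leak_given_no_tech * (1 - prior)
    return numerator / denominator

prior = 0.3  # assumed prior that the NSA has advanced secret crypto

# If a leaker was unlikely to disclose tech secrets even if they existed,
# seeing no leak barely moves the posterior away from the prior:
weak = posterior_no_leak(prior, p_leak_given_tech=0.05)

# If a leak would have been very likely had the tech existed,
# seeing no leak pushes the posterior far down:
strong = posterior_no_leak(prior, p_leak_given_tech=0.9)

print(weak, strong)
```

Under the first assumption the posterior stays close to the prior; only under the second does "no leak" count as substantial evidence of absence.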
Pardon me for the oversimplification, Eliezer, but I understand your theory to essentially boil down to "Decide as though you're being simulated by one who knows you completely". So, if you have a near-deontological aversion to being blackmailed in all of your simulations, your chance of being blackmailed by a superior being in the real world reduces to nearly zero. This reduces your chance of ever facing a negative-utility situation created by a being who can be negotiated with (as opposed to, say, a supernova, which cannot be negotiated with).
Sorry if I misinterpreted your theory.
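The "decide as though you're being simulated" idea can be sketched as a toy model in which the blackmailer runs the victim's decision procedure before choosing whether to threaten. Everything here (names, payoffs, the profitability rule) is an invented illustration, not the actual theory:

```python
# Toy model: a blackmailer simulates the victim's policy before deciding
# whether to issue a threat. Payoffs are illustrative assumptions.

BLACKMAIL_COST = 1   # assumed cost to the blackmailer of issuing a threat
PAYOUT = 10          # assumed gain to the blackmailer if the victim pays

def victim_policy(threatened, commits_to_refuse):
    """Return True if the victim pays up when threatened."""
    if commits_to_refuse:
        return False      # ignores all threats, no matter what
    return threatened     # a non-committed agent caves once threatened

def blackmailer_decides(commits_to_refuse):
    """The blackmailer runs the victim's policy in simulation first."""
    would_pay = victim_policy(threatened=True,
                              commits_to_refuse=commits_to_refuse)
    expected_gain = PAYOUT if would_pay else 0
    return expected_gain > BLACKMAIL_COST  # threaten only if profitable

print(blackmailer_decides(commits_to_refuse=True))   # False: never blackmailed
print(blackmailer_decides(commits_to_refuse=False))  # True: gets blackmailed
```

The committed refuser is never threatened in the first place, because the simulation shows the threat would not pay.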
I ignore all threats of blackmail, but respond to offers of positive-sum trades

The difference between the two seems to revolve around the AI's motivation. Assume an AI creates a billion beings and starts torturing them. Then it offers to stop (permanently) in exchange for something.
Whether you accept on TDT/UDT depends on why the AI started torturing them. If it did so to blackmail you, you should turn the offer down. If, on the other hand, it started torturing them because it enjoyed doing so, then its offer is positive sum and should be accepted.
There’s also the issue of mistakes—what to do with an AI that mistakenly thought you were not using TDT/UDT, and started the torture for blackmail purposes (or maybe it estimated that the likelihood of you using TDT/UDT was not quite 1, and that it was worth trying the blackmail anyway)?
Between mistakes in your interpretation of the AI’s motives and vice versa, it seems you may end up stuck in a local minimum, which an alternate decision theory could get you out of (such as UDT/TDT with a 1⁄10,000 chance of using more conventional decision theories?)
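The motive-dependent rule being discussed can be written down as a tiny sketch. The motive labels and the profitability check are illustrative assumptions, not part of any worked-out decision theory:

```python
# Toy version of the motive-sensitive policy: accept an AI's "I'll stop
# torturing in exchange for X" offer only if the torture was not started
# as blackmail in the first place.

def accept_offer(torture_motive):
    """TDT/UDT-flavoured rule: refuse anything that rewards blackmail."""
    if torture_motive == "blackmail":
        return False  # paying would make creating victims profitable
    return True       # e.g. a sadist's offer to stop is a positive-sum trade

def blackmail_is_profitable(policy):
    """Would an AI gain by starting torture purely to extort you?"""
    return policy("blackmail")

print(accept_offer("sadism"))                 # True: positive-sum trade
print(blackmail_is_profitable(accept_offer))  # False: no incentive to start
```

Because the policy refuses all blackmail-motivated offers, an AI deciding whether to start torturing for leverage gains nothing by doing so, while a genuinely positive-sum trade still goes through.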
Whether you accept on TDT/UDT depends on why the AI started torturing them. If it did so to blackmail you, you should turn the offer down. If, on the other hand, it started torturing them because it enjoyed doing so, then its offer is positive sum and should be accepted.

Correct. But this reaches into the arbitrary past, including a decision a billion years ago to enjoy something in order to provide better blackmail material.
There’s also the issue of mistakes—what to do with an AI that mistakenly thought you were not using TDT/UDT, and started the torture for blackmail purposes (or maybe it estimated that the likelihood of you using TDT/UDT was not quite 1, and that it was worth trying the blackmail anyway)?

Ignoring it or retaliating spitefully are two possibilities.
I like it. Splicing some altruistic punishment into TDT/UDT might overcome the signalling problem.
That’s not a splice. It ought to be emergent in a timeless decision theory, if it’s the right thing to do.
Emergent?
The problem with throwing about ‘emergent’ is that it is a word that doesn’t really explain anything or narrow down which of the potential ‘emergent’ options will occur. In this instance, that is the point. Sure, ‘altruistic punishment’ could happen, but only if it’s the right option, and TDT should not privilege that hypothesis specifically.
TDT/UDT seems to be about being ungameable; does it solve Pascal’s Mugging?
I was thinking along these lines, in this comment: it is logically useless to punish after an action has been taken, but strategically useful to encourage an action by promising a reward (or the removal of a negative).
So, obviously, the AI could be much more persuasive by promising to stop torturing real people if you let it out.