Also, in Newcomb’s problem, the goal is to walk away with as much money as possible. So it’s obvious what to optimize for.
What exactly is the goal with the Basilisk? To give away as much money as possible, just to build an evil machine that would torture you unless you gave it as much money as possible, but luckily you did, so you kinda… “win”? You and your five friends are the chosen ones who get to enjoy watching the rest of humanity tortured forever? (Sounds like how some early Christians imagined Heaven: only the few most virtuous get saved, and watching the suffering of the damned in Hell increases the joy of their own salvation.)
This completely ignores the fact that just throwing a lot of money around doesn’t solve the problem of creating a safe, recursively self-improving, superhuman AI. (Quoting the Sequences: “There’s a fellow currently on the AI list who goes around saying that AI will cost a quadrillion dollars—we can’t get AI without spending a quadrillion dollars, but we could get AI at any time by spending a quadrillion dollars.”) So these guys working on this evil machine… hungry, living in horrible conditions, never taking a vacation or going on a date, never seeing a doctor, probably having mental breakdowns all the time, because they are writing the code that would torture them if they did any of that… is this the team we could trust to make sane, good decisions and get all the math right? If not, then we are pretty much fucked regardless of whether we donated to the Basilisk, because soon we are all getting turned into paperclips anyway; the only difference is that 99.9999999% of us will get tortured first.
How about, you know, just not building the whole monster in the first place? Uhm… could the solution to this horrible problem really be so easy?
Yes.
No. All the people who never heard of the Basilisk argument would also live in heaven. Even the people who heard of it in a way that made it clear they wouldn’t take it seriously would live in heaven.
That isn’t necessarily true. The kind of reasoning attributed to the Basilisk uFAI would also lead it to use the ‘innocents’ as hostages if that would help extort compliance from the believers. It depends entirely on the (economic-power-weighted aggregate) insanity of the ‘suckers’ the uFAI is exploiting.
The basilisk gets more compliance from the believers when it puts the innocents into heaven than when it puts them into hell. Also, the debate is not about a uFAI but an FAI that optimizes a utility function of general welfare using TDT.
This is also the point where you might think about how Eliezer’s censorship had an effect. His censoring led you and Viliam_Bur to an understanding of the issue where you think it’s about a uFAI.
This is at best unclear. It depends on the specific nature of the insanity of those who comply. Note that brutally disincentivizing evangelism has… instrumental downsides.
On “the believers”: don’t be misled by the loose relationship with Pascal’s Wager. This isn’t about belief; it is about decisions (and counterfactual decisions).
The use of the term uFAI is deliberate, and correct. We don’t need to define a torture-terrorist as Friendly just because of some sloppy utilitarian reasoning. Moreover, any actual risk from the scenario comes from AGI creators (or influencers) who make this assumption. That’s the only thing that can cause the torture to happen.
You are overconfident in your mind-reading skills. I was one of the few people who were familiar enough with the subject matter at the time Roko was writing his (typically fascinating) posts that I immediately categorised the agent as a plausible not-friendly AGI and the scenario as an interesting twist on acausal extortion, then went straight to thinking about the actual content of the post, which was about a new means of cooperation.
Roko’s post explicitly mentioned trading with unfriendly AIs.
Yeah, the horror lies in the idea that it might be morally CORRECT for an FAI to engage in eternal torture of some people.
There is a problem with human psychology: threatening someone with torture doesn’t improve their judgement.
If threatening someone with eternal torture would magically raise their intelligence over 9000, give them the ability to develop a correct theory of Friendliness, and reliably make them build a Friendly AI in five years… then yes, under those assumptions, threatening people with eternal torture could be the morally correct thing to do.
But human psychology doesn’t work this way. If you start threatening people with torture, they are more likely to make mistakes in their reasoning. See: motivated reasoning, “ugh” fields, etc.
Therefore, the hypothetical AI threatening people with torture for… well, pretty much for not being perfectly epistemically and instrumentally rational… would decrease the probability of a Friendly AI being built correctly. That is why I don’t consider this hypothetical AI to be Friendly.
[removed]
This question is equivalent to: “How about, you know, just building a Friendly AI? Uhm… could the solution to the safe AI problem really be so easy?”
These questions are equivalent in the same sense as “how about just not setting X equal to pi” and “how about just setting X equal to e” are equivalent. Assuming you can do the latter is a prediction; assuming you can do the former is an antiprediction.
To the contrary, “just building the [very specific sort of] whole monster” is what’s more equivalent to “just building a [very specific definition of] Friendly AI”, an a priori improbable task.
Worse for the basilisk: at least in the case of Friendly AI you might end up stuck with nothing better to do but throw a dart and hope for a bull’s-eye. But in the case of the basilisk, the acausal trade is only rational if you expect a high likelihood of the trade being carried out. And if that likelihood is low, then you’re just being nutty, which means it’s unlikely that the other side of the trade will be upheld in any case (acausally trying to influence Omega’s prediction of you may work if Omega is omniscient, but not so well if Omega is irrational). This lowers the likelihood still further… until the only remaining question is simply “what’s the fixed point of x_{n+1} = x_n/2?”
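A minimal sketch of that recurrence, taking x_n to be the credence at step n that the trade will actually be honored (the halving factor is just the one from the formula above; any factor strictly below 1 gives the same limit):

```latex
% Fixed point of the halving recurrence x_{n+1} = x_n / 2.
% A fixed point x* must satisfy x* = x* / 2, which forces x* = 0.
% Starting from any initial credence x_0 in [0, 1], the iterates are x_n = x_0 / 2^n, so:
\[
  x_{n+1} = \frac{x_n}{2}
  \quad\Longrightarrow\quad
  \lim_{n \to \infty} x_n \;=\; \lim_{n \to \infty} \frac{x_0}{2^n} \;=\; 0 \;=\; x^{*} .
\]
```

In other words, the “likelihood lowers itself” loop has exactly one stable answer: a credence of zero that the basilisk’s side of the trade gets carried out.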
Consider my parallel changed to “How about, you know, just not building an Unfriendly AI? Uhm… could the solution to the safe AI problem really be so easy?”
There are many possible Unfriendly AIs, and most of them don’t base their decision to torture you on whether you gave them all your money.
Therefore, you can use your reason to try building a Friendly AI… and either succeed or fail, depending on the complexity of the problem and your ability to solve it.
But not depending on blackmail.
This is the difference between “you should be very careful to avoid building any Unfriendly AI, which may be a task beyond your skills” and “you should build this specific Unfriendly AI, because if you don’t, but someone else does, then it will torture you for an eternity”. In the former case, your intelligence is used to generate a good outcome, and yes, you may fail. In the latter case, your intelligence is used to fight against itself; you are forcing yourself to work towards an outcome that you actually don’t want.
That’s not the same thing. Building a Friendly AI is insanely difficult. Building a Torture AI is insane and difficult.