Nope. Building FAI vs. building OAI means that in the first case every party wants to pull the one AI that actually gets built in its own direction, while in the second case every party simply wants its own copy. This means that in the second case actual safety is something all sides can collaborate on, even if only indirectly.
Oracle AI technology can end up in multiple hands at once, with any 3⁄4 of the holders being able to restrain a coalition of any 1⁄4. This may help stabilize the system and create a set of intelligences that value cooperation. In any case, it probably buys more time to get there.
The villain asks the Oracle: “How do I build a Wunderwaffe (a virus that kills humanity, a UFAI) for myself?” The Oracle returns the plans for building such a thing, since it only wishes to answer questions correctly. How does the rest of humanity avert the doom once the information is released?
Well, if the questions are somehow censored before being given to the AI, we perhaps get some additional safety. Until some villain discovers how to phrase the question so that it passes the censors undetected. Or discovers destructive potential in an answer to a question asked by somebody else.
Anyway, the original post effectively says that Oracles are safe because all people agree on what they should do: answer the questions. This hinges on the idea that robots can endanger us only via direct power, and it disregards the gravest danger of super-human intelligence: revealing dangerous information that can be used to make things whose consequences we are unable to predict.
But the Oracle will be able to predict these consequences, and we’ll probably get into the habit of checking them.
The problem is that the question “what would be the consequences?” is too general to be answered exhaustively. We would need at least an idea of the general characteristics of the risk in order to ask more specifically; the Oracle doesn’t know which consequences matter to us unless it already comprehends human values and is thus already “friendly”.
Well, after a small publicity campaign, villains will start asking Oracles whether there *is* any world left to rule after they take it over. No, really: the 20th century teaches us that MAD is something that can reliably restrain people with power.
A virus that kills 100% of humanity is not easy to create when humanity has more information-processing power to counter it than the virus designer has to build it. A virus that kills 75% may become easy enough at some stage, but that is not an existential risk. On the plus side, we may be able to use the OAIs on the good side to fight multiply-resistant bug strains in case they turn pathogenic.
One should be reluctant to generalize from a very small dataset, particularly when the stakes are this high.
I agree that we have too few well-documented cases. But there are also reasons behind MAD being effective; it doesn’t look like MAD working was a statistical fluke. It is not bulletproof evidence, but it is some evidence.
Also, it is complementary to the second part: MAD via OAI also means a good chance of partially parrying the strike.