This is similar to existing solutions, but slightly more meta
I feel like this is about equally meta as my “Superintelligent Agent” submission: my committee could output “Show the following message to the operator: …” and your message could say “I suggest that you perform the following action: …”. So the only difference between your idea and mine is that in my submission the output of the Oracle is directly coupled to some effectors to let the agent act faster, while yours has a (real) human in the loop.
The more UFAI risk grows, the less you should use oracles.
Hmm, good point. I guess Chris Leong made a similar point, but it didn’t sink in until now how general the concern is. This seems to affect Paul’s counterfactual oversight idea as well, and maybe other kinds of human imitations and predictors/oracles, as well as things that are built using these components like quantilizers and IDA.
Thinking about this some more, all high-bandwidth oracles (counterfactual or not) risk receiving messages crafted by future UFAI to take over the present. If the ranges of oracles overlap in time, such messages can colonize their way backwards from decades ahead. It’s especially bad if humanity’s FAI project depends on oracles—that increases the chance of UFAI in the world where oracles are silent, which is where the predictions come from.
One possible precaution is to use only short-range oracles, and never use an oracle while still in the prediction range of any other oracle (see the sketch below). But that has drawbacks: 1) it requires worldwide coordination, and 2) it only protects the past: the safety of the present depends on whether you’ll follow the precaution in the future. And people will be tempted to bend it, using longer or overlapping ranges to get more power.
In short, if humanity starts using high-bandwidth oracles, that will likely increase the chance of UFAI and hasten it. So such oracles are dangerous and shouldn’t be used. Sorry, Stuart :-)
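(To pin down the precaution from the comment above, here is a minimal sketch in Python. The `OracleQuery` record and its fields are hypothetical names invented for illustration; this only encodes the scheduling rule, it is not a safety guarantee.)

```python
# Minimal sketch (illustrative only) of the "short-range, non-overlapping" rule.
# `OracleQuery` and its fields are hypothetical names, not from the thread.
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class OracleQuery:
    asked_on: date    # day the oracle is queried
    range_days: int   # how far into the future its prediction reaches

    @property
    def range_end(self) -> date:
        return self.asked_on + timedelta(days=self.range_days)

def safe_to_query(new: OracleQuery, past: list[OracleQuery]) -> bool:
    """Never query while still inside the prediction range of any earlier
    oracle; otherwise ranges overlap in time and a message could be passed
    backwards along the chain."""
    return all(new.asked_on >= q.range_end for q in past)

# Example: a 30-day oracle used on Jan 1 blocks any query before Jan 31.
history = [OracleQuery(date(2030, 1, 1), 30)]
print(safe_to_query(OracleQuery(date(2030, 1, 15), 7), history))  # False
print(safe_to_query(OracleQuery(date(2030, 2, 1), 7), history))   # True
```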
Note that in the case of a counterfactual oracle, this depends on the UFAI “correctly” solving counterfactual mugging (i.e., the UFAI has to decide to pay some cost in its own world to take over a counterfactual world where the erasure event didn’t occur).
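(One way to make that condition concrete, as my own restatement rather than anything from the thread: write $\epsilon$ for the probability of the erasure event, $c$ for the cost the UFAI pays in the erasure-world to craft and broadcast the message, and $V$ for the value it gains if the message takes over the non-erasure world. An updateless UFAI pays the cost roughly when

$$(1 - \epsilon)\,V > \epsilon\, c,$$

while an agent that simply updates on already being in the erasure-world, where its message is never read, sees only the cost and declines; that is the sense in which it has to “correctly” solve counterfactual mugging.)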
This seems too categorical. Depending on the probabilities of various conditions, using such oracles might still be the best option in some circumstances.
Some thoughts on that idea: https://www.lesswrong.com/posts/6WbLRLdmTL4JxxvCq/analysing-dangerous-messages-from-future-ufai-via-oracles
Yeah, agreed on both points.
Some thoughts on this idea, thanks for it: https://www.lesswrong.com/posts/6WbLRLdmTL4JxxvCq/analysing-dangerous-messages-from-future-ufai-via-oracles
Very worthwhile concern, and I will think about it more.
In case of erasure, you should be able to get enough power to prevent another UFAI summoning session.
Sure, in case of erasure you can decide to use oracles less, and compensate your clients with money you got from “erasure insurance” (since that’s a low probability event). But that doesn’t seem to solve the problem I’m talking about—UFAI arising naturally in erasure-worlds and spreading to non-erasure-worlds through oracles.
The problem you were talking about seemed to rely on bucket brigades. I agree that UFAIs jumping back a single step is a fair concern. (Though I guess you could counterfactually have enough power to halt AGI research completely...) I’m trying to address it elsethread. :)
Ah, sorry, you’re right. To prevent bucket brigades, it’s enough to stop using oracles for N days whenever an N-day oracle has an erasure event, and the money from “erasure insurance” can help with that. When there are no erasure events, we can use oracles as often as we want. That’s a big improvement, thanks!
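(A minimal sketch of that stopping rule, again with hypothetical names, just to make the bookkeeping explicit; illustrative only.)

```python
# Illustrative sketch of the rule above: after an N-day oracle has an
# erasure event, pause all oracle use for N days. Names are hypothetical.
from datetime import date, timedelta

class OracleScheduler:
    def __init__(self) -> None:
        self.paused_until = date.min  # no pause initially

    def record_erasure(self, day: date, oracle_range_days: int) -> None:
        """An N-day oracle had an erasure event: stop using oracles until its
        prediction range has passed, so no later query overlaps it."""
        pause_end = day + timedelta(days=oracle_range_days)
        self.paused_until = max(self.paused_until, pause_end)

    def may_query(self, day: date) -> bool:
        return day >= self.paused_until

sched = OracleScheduler()
sched.record_erasure(date(2030, 1, 1), oracle_range_days=30)
print(sched.may_query(date(2030, 1, 10)))  # False: still inside the pause
print(sched.may_query(date(2030, 2, 1)))   # True: pause has expired
```

When no erasure event has occurred, `may_query` is always true, which matches the “use oracles as often as we want” part above.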
Good idea.
Yeah. And low-bandwidth oracles can have a milder version of the same problem. Consider your “consequentialist” idea: if UFAI is about to arise, and one of the offered courses of action leads to UFAI getting stopped, then the oracle will recommend against that course of action (and for some other course where UFAI wins and maxes out the oracle’s reward).
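(Spelling out the incentive, in my own notation: a consequentialist oracle that recommends $a^* = \arg\max_a \mathbb{E}[R \mid a]$ will prefer any offered action $a_u$ whose future contains a UFAI that controls the reward channel and sets $R \approx R_{\max}$, since that beats every action that actually stops the UFAI. So the recommendation itself is what gets corrupted, even though the oracle’s bandwidth is low.)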