Aside from global coordination to not build AGI at all until we can exhaustively research all aspects of AI safety (which I really wish was feasible but we don’t seem to live in that world), I’m not sure how to avoid “passing the buck of complexity to the superintelligent AI” in some form or another. It seems to me that if we’re going to build superintelligent AI, we’ll need superintelligent AI to do philosophy for us (otherwise philosophical progress will fall behind other kinds of progress which will be disastrous), so we need to figure out how to get it to do that safely/correctly, which means solving metaphilosophy.
Do you see any other (feasible) alternatives to this?
It seems to me that you believe there is some amount of philosophical work that, once completed, would let us be certain that any superintelligent AI we create is safe, but that doing this amount of work before we develop superintelligent AI is impossible, or at least close to impossible. But it doesn’t seem obvious to me (and presumably to others) that this is true. It might be a very long time before we create superintelligent AI, or we might succeed in motivating enough people to work on the problem that we could reach this level of philosophical understanding before it is possible to create superintelligent AI.
You can get a better sense of where I’m coming from by reading Some Thoughts on Metaphilosophy. Let me know if you’ve already read it and still have questions or disagreements.
I think there could be other ways to escape this dilemma. In fact, I wrote a list of possible “global solutions” (e.g. ban AI, take over the world, create many AIs) here.
Some possible ideas (not necessarily good ones) are:
Use the first human upload as an effective AI police force that prevents the creation of any other AI.
Use other forms of narrow AI to take over the world and create an effective AI police capable of finding and stopping unauthorised AI research.
Drexler’s CAIS.
Something like Christiano’s approach: a group of people augmented by a narrow AI forms a “human-AI Oracle” and solves philosophy.
Active AI boxing as a commercial service.
Human augmentation.
Most of these ideas center on getting high-level real-world capabilities by combining limited AI with something powerful in the outside world (humans, data, nuclear power, market forces, an active box), and then using these combined capabilities to prevent the creation of really dangerous AI.
None of these ideas seem especially promising even for achieving temporary power over the world (sufficient to prevent the creation of other AIs).
It seems even harder to achieve a long-term stable and safe world environment, in which we can take our time to solve the remaining philosophy and AI safety problems and eventually realize the full potential value of the universe.
Some of them (using narrow AI to take over the world, Christiano’s approach) seem to require solving something like decision theory or metaphilosophy anyway to ensure safety.