There are two problems here:
Problem #1: Align limited task AGI to do some minimal act that ensures no one else can destroy the world with AGI.
Problem #2: Solve the full problem of using AGI to help us achieve an awesome future.
Problem #1 is the one I was talking about in the OP, and I think of it as the problem we need to solve on a deadline. Problem #2 is also indispensable (and a lot more philosophically fraught), but it’s something humanity can solve at its leisure once we’ve solved #1 and therefore aren’t at immediate risk of destroying ourselves.
MIRI gave a strategic explanation of this in their 2017 fundraiser post, which I found very insightful; they called the time before Problem #1 is solved the "acute risk period".
One way to resolve the “Alice vs Bob values” problem is to delegate it to the existing societal structures.
For example, Alice is the country's president. The AI is aligned specifically to the current president's values (with some reasonable limitations, like requiring congressional approval for each AI action).
If Bob's values are different, that's Bob's problem, not an AI alignment problem.
The solution is far from perfect, but it does solve the “Alice vs Bob values” problem, and is much better than the rogue-AGI-killing-all-humans scenario.
By this or some similar mechanism, the scope of the alignment problem can be narrowed to a single human, which is easier to solve: the problem shrinks from "solve society" to "solve an unusually hard math problem".
And once you have a superhuman AGI aligned to one human, that human could ask it to generalize the solution to all of humanity.
Aligned to which human?
Depends on what we are trying to maximize.
If we seek societal acceptance of the solution, then the Secretary-General of the UN is probably the best choice.
If we seek the best possible outcome for humanity, then I would vote for Eliezer. It is unlikely that there is a more suitable person to speak with a Bayesian superintelligence on behalf of humanity.
If we want to maximize realism, then the human in question is more likely to be some dude from Google or OpenAI.