Section 2.3 seems to be the part that addresses alignment, and the proposed solution is to use reinforcement learning (train the AI on examples of what humans would do) and then to give up (either by leaving a human in the loop forever or just deciding that turning people into paperclips really is better).
The way these kinds of problems keep getting buried deep in the writing (sometimes through linked PDFs) really makes me think this is some sort of Sokal-hoax-style prank.
What’s so bad about keeping a human in the loop forever? Do we really think we can safely abdicate our moral responsibilities?
It defeats the purpose of AI, so realistically no one will do it.
It doesn’t actually solve the problem if the AI is deceptive.
I’m not convinced we can safely run AGI, with or without a human in the loop. That’s what the alignment problem is.
Then maybe the alignment problem is a stupid problem to try to solve? I don’t believe this, and have spent the past five years working on the alignment problem. But your argument certainly seems like a general purpose argument that we could and should surrender our moral duties to a fancy algorithm as a cost-saving measure, and that anyone who opposes that is a technophobe who Does Not Get the Science.