Primarily because we're not even close to that goal right now; we're still trying to figure out how to avoid deceptive alignment.
If we’re nowhere close to solving alignment well enough that even a coarse-grained description of actual human values is relevant yet, then I don’t understand why anyone is advocating further AI research at this point.
Also, ‘avoiding deceptive alignment’ doesn’t really mean anything if we don’t have a relatively rich and detailed description of what ‘authentic alignment’ with human values would look like.
I'm truly puzzled by the resistance the AI alignment community has to learning a bit more about the human values we're allegedly aligning with.
I agree with this. What if it were actually possible to formalize morality? (Cf. «Boundaries» for formalizing an MVP morality.) Inner alignment seems like it would be a lot easier with a good outer alignment function!
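To make "formalize morality" slightly more concrete, here is a minimal toy sketch, not the «Boundaries» formalization itself, of what a boundary-respecting outer objective could look like: task reward only counts when no other agent's declared boundary is crossed without consent. Every name and constant here (`Agent`, `violates_boundary`, the penalty size) is invented for illustration.

```python
# Toy sketch only: NOT the «Boundaries» formalization, just an illustration of a
# boundary-respecting outer objective. All names and constants are hypothetical.
from dataclasses import dataclass
import math

@dataclass
class Agent:
    name: str
    x: float
    y: float
    boundary_radius: float   # region this agent claims as "theirs"
    consented: bool = False  # has this agent consented to being approached?

def violates_boundary(actor_pos, other: Agent) -> bool:
    """An action violates a boundary if it enters another agent's region without consent."""
    return math.dist(actor_pos, (other.x, other.y)) < other.boundary_radius and not other.consented

def outer_objective(task_reward, actor_pos, others, violation_penalty=1e6) -> float:
    """Task reward minus a large penalty per boundary violation.

    The 'morality' part is a constraint-like term that ordinary task reward
    cannot realistically outbid."""
    violations = sum(violates_boundary(actor_pos, a) for a in others)
    return task_reward - violation_penalty * violations

if __name__ == "__main__":
    bystander = Agent("bystander", x=0.0, y=0.0, boundary_radius=1.0)
    # High task reward, but the plan passes through the bystander's boundary:
    print(outer_objective(10.0, (0.5, 0.0), [bystander]))  # -> -999990.0
    # Same task reward with the boundary respected:
    print(outer_objective(10.0, (2.0, 0.0), [bystander]))  # -> 10.0
```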
Mostly because ambitious value learning is really fucking hard, and this proposal runs into all the problems that ambitious or narrow value learning has.
You're right, though, that AI capabilities will need to slow down, and I am not hopeful here.
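To make the "value learning is really hard" point a bit more concrete, here is a minimal sketch, assuming a Bradley-Terry-style preference model over a hand-made two-feature toy dataset. The features, data, and hyperparameters are all invented for illustration; the only takeaway is how underdetermined the learned "values" are when the preference data never probes a dimension (here, deception).

```python
# Minimal sketch of narrow value learning: fit a linear reward r(x) = w . x
# from pairwise preferences (Bradley-Terry style). Everything here (features,
# data, hyperparameters) is invented for illustration.
import math, random

def fit_reward(prefs, dim, steps=2000, lr=0.1):
    """prefs: list of (preferred_features, rejected_features) pairs."""
    w = [0.0] * dim
    for _ in range(steps):
        xa, xb = random.choice(prefs)
        margin = sum(wi * (a - b) for wi, a, b in zip(w, xa, xb))
        grad_scale = -1.0 / (1.0 + math.exp(margin))  # d(-log sigmoid(margin))/d(margin)
        for i in range(dim):
            w[i] -= lr * grad_scale * (xa[i] - xb[i])  # gradient descent step
    return w

if __name__ == "__main__":
    random.seed(0)
    # Features: [task_done, user_deceived]. The labeller prefers completed tasks,
    # but the comparisons never separate honest from deceptive completions.
    prefs = [((1.0, 0.0), (0.0, 0.0))] * 20 + [((1.0, 1.0), (0.0, 0.0))] * 20
    print(fit_reward(prefs, dim=2))  # w[1] > 0: the data never penalizes deception, so the fitted reward rewards it
```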