Nicholas Kross comments on Aligned AI is dual use technology

Nicholas Kross 29 Jan 2024 0:19 UTC
3 points
1
If it helps clarify: I (and some others) break down the alignment problem into “being able to steer it at all” and “what to steer it at”. This post is about the danger of solving the former without solving the latter well (e.g. through some kind of CEV).
- Thane Ruthenis 29 Jan 2024 1:06 UTC
  18 points
  12
  Nah, I think this post is about a third component of the problem: ensuring that the solution to “what to steer at” that’s actually deployed is pro-humanity. A totalitarian government successfully figuring out how to load its regime’s values into the AGI has by no means failed at figuring out “what to steer at”. They know what they want and how to get it. It’s just that we don’t like the end result.
  “Being able to steer at all” is a technical problem of designing AIs, “what to steer at” is a technical problem of precisely translating intuitive human goals into a formal language, and “where is the AI actually steered” is a realpolitik problem that this post is about.
  - Nicholas Kross 29 Jan 2024 18:25 UTC
    4 points
    0
    Ah, yeah that’s right.