Computer scientists, however, believe that self-improvement will be recursive. In effect, to improve, an AI has to rewrite its code to become a new AI. That AI retains its single-minded goal, but to work efficiently it will also need sub-goals. If the sub-goal is finding better ways to make paperclips, that is one matter. If, on the other hand, the goal is to acquire power, that is another.
The insight from economics is that while it may be hard, or even impossible, for a human to control a super-intelligent AI, it is equally hard for a super-intelligent AI to control another AI. Our modest super-intelligent paperclip maximiser, by switching on an AI devoted to obtaining power, unleashes a beast that will have power over it. Our control problem is the AI’s control problem too. If the AI is seeking power to protect itself from humans, doing this by creating a super-intelligent AI with more power than its parent would surely seem too risky.
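To make the risk calculus in that excerpt concrete, here is a toy expected-value sketch. It is not from the article; the `q`, `gain`, and `loss` values are made-up numbers chosen purely for illustration. The point is just that the parent AI only benefits from switching on a more powerful successor when its odds of keeping control are very high.

```python
# Toy model of the quoted argument (illustrative only; all numbers are made up).
# The parent AI weighs building a more powerful successor: it gains `gain` if the
# successor stays aligned with its goal (probability q), and loses `loss` if the
# successor's power turns against it (probability 1 - q).

def successor_worth_building(q: float, gain: float, loss: float) -> bool:
    """Return True if the expected value of delegating to a successor is positive."""
    expected_value = q * gain - (1 - q) * loss
    return expected_value > 0

# Even a fairly reliable successor looks bad when the downside is losing control
# entirely -- the "our control problem is the AI's control problem too" point.
print(successor_worth_building(q=0.95, gain=10.0, loss=1000.0))   # False: too risky
print(successor_worth_building(q=0.999, gain=10.0, loss=1000.0))  # True: only near-certain control pays off
```

With a large downside and a modest upside, the break-even probability of keeping control sits close to 1, which is one way to read "would surely seem too risky."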
Claim seems much too strong here, since it seems possible this won't turn out to be that difficult for AGI systems to solve (copies seem easier than big changes imo, but not sure), but it also seems plausible it could be hard.
Thanks Buck, btw the second link was broken for me, but this link works: https://cepr.org/voxeu/columns/ai-and-paperclip-problem. The relevant section is quoted above.
Yeah, I agree copies are easier to work with; this is why I think that their situation is very analogous to new brain uploads.