Plans that rely on aligned AGIs working on alignment faster than humans would need to ensure that no AGIs work on anything else in the meantime.
This isn’t true. It could be that producing an arbitrarily scalable solution to alignment takes X cognitive resources while, in practice, building an uncontrollably powerful AI takes Y cognitive resources, with X < Y.
(Also, this plan doesn’t necessarily require aligning “human level” AIs, just being able to get work out of them with sufficiently high productivity and low danger.)
I’m being a bit simplistic. The point is that it needs to stop being a losing or close race, and all runners getting faster doesn’t obviously help with that problem. I guess there is a refactor-vs-rewrite feel to the distinction between the project of stopping humans from building AGIs right now and the project of getting the first AGIs to work on alignment and global security in a post-AGI world before other AGIs overshadow that work. The former has near-term, concrete difficulties; the latter has nebulous difficulties that don’t as readily jump to attention. The whole problem is messiness and lack of coordination, so starting from scratch with AGIs seems more promising than reforming human society. But without strong coordination on the development and deployment of the first AGIs, the activities of AGIs are going to be just as messy and uncoordinated, only unfolding much faster, and that’s not even counting the risk of getting a superintelligence right away.