Vladimir_Nesov comments on Thoughts on “AI is easy to control” by Pope & Belrose

Vladimir_Nesov 2 Dec 2023 16:40 UTC
LW: 5 AF: 3
1
Plans that rely on aligned AGIs working on alignment faster than humans would need to ensure that no AGIs work on anything else in the meantime. The reason humans have no time to develop alignment of superintelligence is that other humans develop misaligned superintelligence faster. Similarly, by default, very fast AGIs working on alignment end up having to compete with very fast AGIs working on other things that lead to misaligned superintelligence. Preventing aligned AGIs from building misaligned superintelligence is not clearly more manageable than preventing humans from building AGIs.
- ryan_greenblatt 2 Dec 2023 17:46 UTC
  LW: 5 AF: 2
  0
  
  > Plans that rely on aligned AGIs working on alignment faster than humans would need to ensure that no AGIs work on anything else in the meantime.
  
  This isn’t true. It could be that making an arbitrarily scalable solution to alignment takes X cognitive resources and in practice building an uncontrollably powerful AI takes Y cognitive resources with X < Y.
  
  (Also, this plan doesn’t necessarily require aligning “human level” AIs, just being able to get work out of them with sufficiently high productivity and low danger.)
  - Vladimir_Nesov 2 Dec 2023 18:17 UTC
    LW: 11 AF: 4
    7
    I’m being a bit simplistic. The point is that it needs to stop being a losing or a close race, and all runners getting faster doesn’t obviously help with that problem. I guess there is some refactor vs. rewrite feel to the distinction between the project of stopping humans from building AGIs right now and the project of getting the first AGIs to work on alignment and global security in a post-AGI world faster than other AGIs overshadow such work. The former has near, concrete difficulties; the latter has nebulous difficulties that don’t as readily jump to attention. The whole problem is messiness and lack of coordination, so starting from scratch with AGIs seems more promising than reforming human society. But without strong coordination on the development and deployment of the first AGIs, the activities of AGIs are going to be just as messy and uncoordinated, only unfolding much faster, and that’s not even counting the risk of getting a superintelligence right away.