Other issues with “just stop at human-level” include:
- We don’t actually know how to usefully measure or upper-bound the capability of an AGI. Relying on past trends in ‘how much performance tends to scale with compute’ seems extremely unreliable and dangerous to me when you first hit AGI. And it becomes completely unreliable once the system is potentially modeling its operators and adjusting its visible performance in attempts to influence operators’ beliefs.
- AI will never have the exact same skill profile as humans. A roughly “human-level” system might be subhuman in many skills, superhuman in many others, and roughly par-human in still others. Safety in that case will depend on the specific skills the AI does or doesn’t have.
- Usefulness/relevance will also depend on the specific skills the AI has. Some “human-level AIs” may be useless for pivotal acts, even if you know how to perfectly align them.
I endorse “don’t crank your first AGI systems up to maximum”—cranking up to maximum seems obviously suicidal to me. Limiting capabilities is absolutely essential.
But I don’t think this solves the problem on its own, and I think achieving this will be more complicated and precarious than the phrasing “human-level AI” might suggest.