HoldenKarnofsky comments on Discussion with Nate Soares on a key alignment difficulty

HoldenKarnofsky 14 Apr 2023 18:02 UTC
2 points
It seems like the same question would apply to humans trying to solve the alignment problem—does that seem right? My answer to your question is “maybe”, but it seems good to get on the same page about whether “humans trying to solve alignment” and “specialized human-ish safe AIs trying to solve alignment” are basically the same challenge.