But then, would a less intelligent being (i.e., the collective of human alignment researchers and the less powerful AI systems they use as tools in their research) be capable of validly examining a more intelligent being without being deceived by it?
It seems like the same question would apply to humans trying to solve the alignment problem—does that seem right? My answer to your question is “maybe”, but it seems good to get on the same page about whether “humans trying to solve alignment” and “specialized human-ish safe AIs trying to solve alignment” are basically the same challenge.