I feel like it needs more ML-inspired metaphors. Sure, anyone can imagine gradient descent arranging weights into an encoding of Skynet's source code. But what do people say or think about why they don't check for this before training GPT with a loss function that would totally love Skynet?