I feel like it needs more ML-inspired metaphors. Sure, anyone can imagine gradient descent arranging weights into an encoding of Skynet's source code. But what do people say or think about why they don't check for this before training GPT with a loss function that would totally love Skynet?