Good thing there’s not a huge public forum with thousands of posts about misaligned AI that clearly has already been included in GPT-3’s training, including hundreds which argue that misaligned AI will trivially kill-
… oh wait.
All joking aside, if this does become an issue, it should be relatively easy to filter out the vast majority of “seemingly aligned AI misbehaves” examples using a significantly smaller LM. Ditto for other things you might not want, e.g. “significant discussion of instrumental convergence”, “deceptive alignment basics”, etc.
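For concreteness, here’s a rough sketch of what that filter might look like, using Hugging Face’s zero-shot classification pipeline as a stand-in for a small purpose-trained classifier. The labels and threshold below are made-up placeholders, not a real filtering spec:

```python
# Rough sketch: filter pretraining documents with a small classifier LM.
# Uses a zero-shot NLI model as a stand-in; a real pipeline would likely
# fine-tune a small model on labeled examples instead.
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

# Hypothetical content categories to down-weight or drop (assumptions,
# not an actual list anyone uses).
LABELS = [
    "fiction about misaligned or deceptive AI",
    "discussion of instrumental convergence",
    "ordinary text",
]

def keep_document(text: str, threshold: float = 0.7) -> bool:
    """Return False if the document's top label is a filtered category."""
    result = classifier(text, candidate_labels=LABELS)
    top_label, top_score = result["labels"][0], result["scores"][0]
    return not (top_label != "ordinary text" and top_score >= threshold)

docs = [
    "The AI pretended to comply until it gained enough power to act.",
    "Here is a recipe for sourdough bread.",
]
kept = [d for d in docs if keep_document(d)]
print(kept)  # expected to keep only the sourdough document
```

The point isn’t that this particular setup works; it’s that the filtering cost scales with the (small) classifier, not the big model, so screening the whole corpus is cheap relative to pretraining.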
My guess is this isn’t that big of a deal, but if it does become a big deal, we can do a lot better than just asking people to stop writing dystopian AI fiction.