John_Maxwell comments on Two Small Experiments on GPT-2

John_Maxwell 6 Mar 2019 21:10 UTC
5 points

If you literally ran (a powered-up version of) GPT-2 on “A brilliant solution to the AI alignment problem is...” you would get the sort of thing an average internet user would think of as a brilliant solution to the AI alignment problem.

Change it to: “I’m a Turing Award winner and Fields medalist, and last night I had an incredible insight about how to solve the AI alignment problem. The insight is...” It’s improbable that a mediocre-quality idea will follow a prompt like that. (Another idea: write a description of an important problem in computer science, followed by “The solution is...”, and then the brilliant solution someone came up with. Do this for a few major solved problems in computer science. Then write a description of the AI alignment problem, followed by “The solution is...”, and let GPT-2 continue from there.)
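To make the parenthetical idea concrete, here is a rough sketch of how such a prompt could be assembled. The helper name `build_few_shot_prompt`, the two solved problems, and the placeholder alignment description are all illustrative assumptions, not something anyone has actually run against GPT-2; the script only builds the text and does not call a model.

```python
# Hypothetical helper: assembles a few-shot prompt in the
# "problem description, then 'The solution is...'" pattern.
SOLUTION_CUE = "The solution is..."


def build_few_shot_prompt(solved, open_problem, cue=SOLUTION_CUE):
    """Join (problem, solution) pairs, then end with the open problem and a bare cue."""
    blocks = [f"{problem}\n{cue} {solution}" for problem, solution in solved]
    # The last block stops right after the cue so the model has to continue from it.
    blocks.append(f"{open_problem}\n{cue}")
    return "\n\n".join(blocks)


# Illustrative solved problems (chosen for this sketch, not taken from the comment).
solved_examples = [
    (
        "Problem: find the shortest path between two nodes in a graph "
        "with non-negative edge weights.",
        "repeatedly settle the closest unsettled node using a priority queue "
        "(Dijkstra's algorithm).",
    ),
    (
        "Problem: let two parties agree on a secret key over a channel "
        "an eavesdropper can read.",
        "exchange modular exponentials of private values so both sides derive "
        "the same key (Diffie-Hellman key exchange).",
    ),
]

# Placeholder text: a real attempt would need an actual problem description here.
open_problem = "Problem: the AI alignment problem (description goes here)."

prompt = build_few_shot_prompt(solved_examples, open_problem)
print(prompt)
```

The resulting string would then be passed to whatever text-generation interface is available. Whether the completions are any good is exactly the open question.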

Trying to do this more usefully basically leads to Paul’s agenda (which is about trying to do imitation learning of an implicit organization of humans).

One take: Either GPT-2 can be radically improved (to offer useful completions as in the “Turing Award” example above), or it can’t be. If it can be radically improved, it can help with FAI, perhaps by contributing to Paul’s agenda. If it can’t be radically improved, then it’s not important for AGI. So GPT-2 is neutral or good.