Steven Byrnes comments on $1000 bounty for OpenAI to show whether GPT3 was “deliberately” pretending to be stupider than it is

Steven Byrnes 21 Jul 2020 19:49 UTC
4 points
Here’s a possible way to prove pretending-to-be-stupid maybe: we could try to prompt it in such a way that all its answers (to true/false questions) are wrong, much more often than chance. If it’s able to do that, then we can ask, how? One possibility is: It is implicitly figuring out the truth and then saying the opposite. If we’re careful, maybe we can set things up such that that’s the only possibility.
(I’m not convinced that such a demo would really teach us anything that wasn’t obvious, but I dunno.)