gwern comments on interpreting GPT: the logit lens

gwern 2 Sep 2020 0:02 UTC
LW: 5 AF: 2
Doing it with GPT-3 would be quite challenging, if only because of compute requirements like RAM. You’d definitely want to test this out on GPT-2-117M first. If the approach works at all, it should work well for the smallest models too.
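
To put rough numbers on the RAM point, here is a back-of-the-envelope sketch. It counts only the memory needed to hold the weights, and assumes the nominal parameter counts (117M for the smallest GPT-2, 175B for GPT-3) and 2 or 4 bytes per parameter for fp16 or fp32. The helper name `weight_memory_gib` is made up for illustration; activations and any per-layer outputs you keep for the logit lens would come on top of this.

```python
# Rough weights-only memory estimate. Assumptions: nominal parameter counts,
# and 2 bytes/param for fp16 or 4 bytes/param for fp32. Activations, caches,
# and framework overhead are NOT included.

def weight_memory_gib(n_params: int, bytes_per_param: int = 2) -> float:
    """Return the memory needed to store n_params weights, in GiB."""
    return n_params * bytes_per_param / 2**30

# Nominal sizes: "117M" is the label used for the smallest GPT-2 model.
MODELS = {
    "GPT-2-117M": 117_000_000,
    "GPT-3-175B": 175_000_000_000,
}

if __name__ == "__main__":
    for name, n_params in MODELS.items():
        fp16 = weight_memory_gib(n_params, 2)
        fp32 = weight_memory_gib(n_params, 4)
        print(f"{name}: {fp16:.2f} GiB (fp16), {fp32:.2f} GiB (fp32)")
```

The gap is about three orders of magnitude: the small model's weights fit comfortably on a laptop, while GPT-3's weights alone exceed any single GPU's memory, which is why prototyping on the small model first makes sense.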