sanxiyn comments on New OpenAI Paper—Language models can explain neurons in language models

sanxiyn 11 May 2023 6:50 UTC
1 point
GPT-6 will probably be able to analyze all the neurons in itself with >0.5 scores
This seems to assume that the task (writing explanations that score above 0.5 for every neuron) is possible at all, which is doubtful. Superposition and polysemanticity are real phenomena: a neuron that responds to several unrelated features may have no single concise explanation that scores well.
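
To make the worry concrete, here is a toy sketch, not the paper's actual pipeline. It assumes the score is a plain Pearson correlation between simulated and real activations, and the features, inputs, and function names are all hypothetical. A neuron that fires equally on three unrelated features is "explained" by naming only one of them:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# Hypothetical toy data: six inputs; the first three each carry one
# unrelated feature (A, B, C), the last three carry none.
FEATURES = ["A", "B", "C", None, None, None]

def neuron_activation(feature):
    # Assumed polysemantic neuron: fires equally on A, B, and C.
    return 1.0 if feature in ("A", "B", "C") else 0.0

def simulate(explained, feature):
    # Simulated activation from an explanation that names a set of features.
    return 1.0 if feature in explained else 0.0

actual = [neuron_activation(f) for f in FEATURES]

# Explanation naming only feature A.
single_score = pearson([simulate({"A"}, f) for f in FEATURES], actual)
# Explanation that enumerates every feature the neuron responds to.
full_score = pearson([simulate({"A", "B", "C"}, f) for f in FEATURES], actual)

print(f"single-feature explanation: {single_score:.3f}")
print(f"enumerating explanation:    {full_score:.3f}")
```

In this toy, the single-feature explanation scores about 0.447, below the 0.5 bar, while the explanation that lists all three features scores 1.0. The concern is that with many features in superposition per neuron, no short natural-language explanation can enumerate them all.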