Great list! Would you consider “The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks”
https://arxiv.org/abs/2306.17844 a candidate for “important work in mech interp [which] has properly built on [Progress Measures.]”? Are you aware of any problems with it?
I’m not aware of any problems with it. I think it’s a nice paper, but not really at my bar for important work (which is a really high bar, to be clear—fewer than half the papers in this post probably meet it)