Those are good questions! There’s some existing research that addresses some of them.
Single neurons often do represent multiple concepts: https://transformer-circuits.pub/2022/toy_model/index.html
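As a rough illustration of what that can look like, here is a minimal sketch with hand-picked weights (not trained, and not taken from the linked paper): three hypothetical features are squeezed into two neurons by spacing their directions 120 degrees apart. The names `W` and `features_per_neuron` are made up for this example.

```python
import numpy as np

# Assumption: 3 made-up features embedded in 2 neurons, with feature
# directions spaced evenly around the unit circle (0, 120, 240 degrees).
n_features, n_neurons = 3, 2
angles = 2 * np.pi * np.arange(n_features) / n_features

# W[i, j] is the weight of feature j on neuron i; shape (n_neurons, n_features).
W = np.stack([np.cos(angles), np.sin(angles)])

# Count how many features each neuron has a non-negligible weight on.
features_per_neuron = (np.abs(W) > 1e-6).sum(axis=1)

for i, count in enumerate(features_per_neuron):
    print(f"neuron {i} responds to {count} features")
```

With more features than neurons, at least one neuron has to carry weight for several features, so reading off a single neuron's activation does not identify a single concept.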
It still seems to be unclear why the dimensions are aligned with the standard basis: https://transformer-circuits.pub/2023/privileged-basis/index.html