Exciting stuff, thanks!
It’s a little surprising to me how bad the logit lens is for earlier layers.
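It was surprising to me too. One possibility is that the earlier layers’ basis vectors are not aligned with the final layer’s, in which case the unembedding would be reading early activations in the wrong basis. That’s why corroborating the results with a TunedLens is a smart next step, since the current logit lens results may be misleading.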
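
To make the basis-alignment worry concrete, here is a toy NumPy sketch. It is not the actual experiment and not the TunedLens library's API. It assumes, purely for illustration, that an early layer holds the same information as the final layer but in a rotated basis. All names (`W_U`, `fit_translator`, `tuned_lens`) are hypothetical, the final layer norm is omitted, and the translator is fit by least squares on hidden states, whereas a real tuned lens is typically trained to match the final-layer output distribution.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, vocab, n_tokens = 16, 50, 200

# Hypothetical unembedding matrix: final-layer basis -> vocabulary logits.
W_U = rng.normal(size=(d_model, vocab))

# Toy assumption: the early layer encodes the same information as the
# final layer, only expressed in a rotated (orthogonally transformed) basis.
Q, _ = np.linalg.qr(rng.normal(size=(d_model, d_model)))
h_final = rng.normal(size=(n_tokens, d_model))
h_early = h_final @ Q.T


def logit_lens(h):
    """Decode hidden states directly with the unembedding matrix."""
    return h @ W_U


def fit_translator(h_src, h_tgt):
    """Fit an affine map from h_src to h_tgt by least squares.

    Simplification: this stands in for the learned per-layer translator.
    """
    X = np.hstack([h_src, np.ones((len(h_src), 1))])  # append bias column
    A, *_ = np.linalg.lstsq(X, h_tgt, rcond=None)
    return A


def tuned_lens(h, A):
    """Translate hidden states into the final-layer basis, then unembed."""
    X = np.hstack([h, np.ones((len(h), 1))])
    return (X @ A) @ W_U


# Reference predictions: top token according to the final layer.
target = logit_lens(h_final).argmax(axis=1)

A = fit_translator(h_early, h_final)
acc_logit = (logit_lens(h_early).argmax(axis=1) == target).mean()
acc_tuned = (tuned_lens(h_early, A).argmax(axis=1) == target).mean()

print(f"logit lens top-1 agreement: {acc_logit:.2f}")
print(f"translated lens top-1 agreement: {acc_tuned:.2f}")
```

In this toy setup the information is fully present in the early layer, yet the plain logit lens mostly fails to read it, while the affine translator recovers it. Whether that is what is happening in the real model is exactly what the TunedLens comparison would help check.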