An illustration of how many residual stream basis directions are highly correlated when top and random fragments are considered together, but show low to zero correlation within a single fragment.
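The pattern described above (high correlation when groups are pooled, little or none within each group) can be reproduced with purely synthetic numbers. The sketch below is only an illustration under stated assumptions: it does not use real model activations, the names `sample_group` and `pooled_vs_within` are hypothetical, and the two "directions" are simply independent Gaussian draws whose mean differs between a "top" group and a "random" group.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def sample_group(rng, n, mean):
    """Draw n values for two directions that are independent within the group.

    Assumption for illustration: both directions share the same group mean,
    so the only link between them is which group a sample belongs to.
    """
    a = [rng.gauss(mean, 1.0) for _ in range(n)]
    b = [rng.gauss(mean, 1.0) for _ in range(n)]
    return a, b


def pooled_vs_within(n=2000, shift=5.0, seed=0):
    """Compare within-group and pooled correlation on synthetic data."""
    rng = random.Random(seed)
    top_a, top_b = sample_group(rng, n, shift)   # hypothetical "top" fragments
    rand_a, rand_b = sample_group(rng, n, 0.0)   # hypothetical "random" fragments
    return {
        "top": pearson(top_a, top_b),
        "random": pearson(rand_a, rand_b),
        "pooled": pearson(top_a + rand_a, top_b + rand_b),
    }


if __name__ == "__main__":
    for name, value in pooled_vs_within().items():
        print(f"{name:>6}: {value:+.3f}")
```

In this toy setup the pooled correlation comes entirely from the difference in group means, which is one possible reading of the chart; whether the same mechanism explains the real residual stream data is not something the sketch can establish.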
I think the final layer’s output determines the complexity of the prediction, which in turn determines the vocabulary the model generates its output from. I would vote for the random fragment chart (1st image) here.