For each token prediction we record the activation of the neuron and whether or not " an" has a greater logit than any other token (i.e. whether it was the top prediction).
We group the activations into buckets of width 0.2. For each bucket we plot:
(number of times " an" was the top prediction, for activations in this bucket) / (number of activations in this bucket)
Hi!
For each token prediction we record the activation of the neuron and whether or not " an" has a greater logit than any other token (i.e. whether it was the top prediction).
We group the activations into buckets of width 0.2. For each bucket we plot:
(number of times " an" was the top prediction, for activations in this bucket) / (number of activations in this bucket)
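In case it helps, here is a rough sketch of that bucketing in plain Python. It is not the code used for the plot; the function name, argument names, and the toy data are all made up for illustration, and it assumes you already have one activation and one "was \" an\" the top prediction?" flag per token prediction.

```python
import math
from collections import defaultdict

BUCKET_WIDTH = 0.2  # bucket width described in the text


def top_prediction_rate_by_bucket(activations, an_is_top, width=BUCKET_WIDTH):
    """Return {bucket_left_edge: fraction of predictions where " an" was top}.

    activations: one neuron activation per token prediction (hypothetical input).
    an_is_top:   matching booleans, True when " an" had the largest logit.
    """
    hits = defaultdict(int)    # times " an" was the top prediction, per bucket
    totals = defaultdict(int)  # number of activations, per bucket
    for act, is_top in zip(activations, an_is_top):
        bucket = math.floor(act / width)  # integer bucket index
        totals[bucket] += 1
        hits[bucket] += int(is_top)
    # Only buckets that contain at least one activation appear in the result.
    return {round(b * width, 10): hits[b] / totals[b] for b in sorted(totals)}


# Toy data, invented purely for illustration.
rates = top_prediction_rate_by_bucket(
    [0.05, 0.15, 0.25, 0.35, 1.05, 1.15],
    [False, False, False, True, True, True],
)
print(rates)
```

On this toy data the result is {0.0: 0.0, 0.2: 0.5, 1.0: 1.0}: each key is the left edge of a bucket and each value is the fraction that gets plotted for that bucket.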
Does that clarify things for you?