Hi!
For each token prediction, we record the activation of the neuron and whether or not " an" has a greater logit than any other token (that is, whether it was the top prediction).
We group the activations into buckets of width 0.2. For each bucket, we plot the fraction
$$\frac{\text{number of times `` an'' was the top prediction (for activations in this bucket)}}{\text{number of activations in this bucket}}$$
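In case code is easier to follow, here is a rough sketch of that bucketing in Python with NumPy. It is only an illustration and not necessarily the code behind the plot: the function name `top_prediction_rate_by_bucket` and its arguments are made up for this example, and it assumes you already have one activation and one "was ` an` the top prediction" flag per token prediction.

```python
import numpy as np

def top_prediction_rate_by_bucket(activations, is_top, width=0.2):
    """Return {bucket lower edge: fraction of predictions where " an" was top}.

    activations: one neuron activation per token prediction (hypothetical input)
    is_top:      one boolean per token prediction, True if " an" had the largest logit
    width:       bucket width (0.2 in the description above)
    """
    activations = np.asarray(activations, dtype=float)
    is_top = np.asarray(is_top, dtype=bool)

    # Bucket k covers activations in [k * width, (k + 1) * width).
    # Values sitting exactly on a boundary may land on either side
    # because of floating-point rounding.
    bucket_index = np.floor(activations / width).astype(int)

    rates = {}
    for k in np.unique(bucket_index):
        in_bucket = bucket_index == k
        lower_edge = round(float(k * width), 10)
        # Mean of booleans = (times " an" was top) / (activations in bucket)
        rates[lower_edge] = float(is_top[in_bucket].mean())
    return rates

# Toy data, invented purely to show the call
example_activations = [0.05, 0.1, 0.15, 0.25, 0.3, 1.1]
example_is_top = [False, False, True, True, True, True]
print(top_prediction_rate_by_bucket(example_activations, example_is_top))
```

Each value in the returned dictionary is the height of one point in the plot, and buckets with no activations are simply absent.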
Does that clarify things for you?