aaronsnoswell comments on We Found An Neuron in GPT-2

aaronsnoswell 23 Feb 2023 1:49 UTC
1 point
Hello! A great write-up and fascinating investigation. Well done with such a great result from a hackathon.
I’m trying to understand your plot titled ‘Proportion of Top Predictions that are “ an” by Layer 31 Neuron 892 Activation’. Can you explain what the y-axis is in this plot? It’s not clear what the y-axis is a proportion of.
I read through the code, but couldn’t quite follow the logic for this plot. It seems that the y-axis is computed with these lines:
```
neuron_act_top_pred_proportions = [dict(sorted([(k / bin_granularity, v["top_pred"] / v["count"])
                                                for k, v in logit_bins.items()])) for logit_bins in logit_diff_bins.values()]
```
But I’m not sure what the denominator v["count"] from within logit_bins corresponds to.
Thank you :)
Aaron
- Joseph Miller 23 Feb 2023 18:41 UTC
  1 point
  Hi!
  For each token prediction we record the activation of the neuron and whether or not “ an” has a greater logit than any other token (i.e. whether it was the top prediction).
  We group the activations into buckets of width $0.2$. For each bucket we plot
  $\frac{\text{Number of times “ an” was the top prediction (for activations in this bucket)}}{\text{Number of activations in this bucket}}$
  Does that clarify things for you?
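
The bucketing described in the reply can be sketched roughly as follows. This is a minimal illustration, not the code from the post: the function name `top_pred_proportions`, the `records` input format, and the use of `math.floor` to assign buckets are all assumptions made here, and `bin_granularity = 5` is inferred only from the stated bucket width of 0.2. The `"count"` and `"top_pred"` keys mirror the snippet quoted in the question.

```python
import math
from collections import defaultdict

# Assumption: 5 buckets per unit of activation, which gives a bucket width of 0.2.
BIN_GRANULARITY = 5


def top_pred_proportions(records, bin_granularity=BIN_GRANULARITY):
    """Return {bucket lower edge: fraction of predictions where " an" was top}.

    `records` is a hypothetical input: an iterable of
    (neuron_activation, an_was_top_prediction) pairs, one per token prediction.
    """
    bins = defaultdict(lambda: {"count": 0, "top_pred": 0})
    for activation, an_was_top in records:
        # Integer bucket index; flooring is an assumption about how buckets are assigned.
        k = math.floor(activation * bin_granularity)
        bins[k]["count"] += 1                   # denominator: activations in this bucket
        bins[k]["top_pred"] += int(an_was_top)  # numerator: times " an" was the top prediction
    # Same shape as the quoted snippet: bucket value -> top_pred / count, sorted by bucket.
    return dict(sorted((k / bin_granularity, v["top_pred"] / v["count"])
                       for k, v in bins.items()))


# Made-up example data: (activation, whether " an" was the top prediction).
records = [(0.05, False), (0.15, False), (0.25, True),
           (0.35, False), (1.85, True), (1.9, True)]
proportions = top_pred_proportions(records)
print(proportions)
```

Under these assumptions, `v["count"]` is the number of token predictions whose activation fell in a bucket, and `v["top_pred"]` is how many of those had “ an” as the top prediction, so each y-value is a fraction between 0 and 1.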