Comment by aaronsnoswell on We Found An Neuron in GPT-2 · 2023-02-23T01:49:17.956Z
Hello! A great write-up and fascinating investigation. Well done with such a great result from a hackathon.
I'm trying to understand your plot titled 'Proportion of Top Predictions that are " an" by Layer 31 Neuron 892 Activation'. Can you explain what the y-axis is in this plot? It's not clear what the y-axis is a proportion of.
I read through the code, but couldn't quite follow the logic for this plot. It seems that the y-axis is computed with these lines:
neuron_act_top_pred_proportions = [dict(sorted([(k / bin_granularity, v["top_pred"] / v["count"])
for k, v in logit_bins.items()])) for logit_bins in logit_diff_bins.values()]
But I'm not sure what the denominator v["count"] from within logit_bins corresponds to.
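To make my reading concrete, here is a minimal sketch that runs those two lines on made-up data. The nested layout follows from the comprehension itself, but the numbers, the value of bin_granularity, and my guess that "count" is the number of examples in an activation bin and "top_pred" is how many of those had " an" as the top prediction are all assumptions on my part, not something I could confirm from the code.

```python
# Toy data invented for illustration; the real values come from the post's code.
bin_granularity = 2  # assumption: bin keys are activations scaled by this factor

# Assumed layout: outer key -> {bin key -> {"count": ..., "top_pred": ...}}
logit_diff_bins = {
    "example": {
        4: {"count": 10, "top_pred": 7},
        0: {"count": 20, "top_pred": 1},
        2: {"count": 8, "top_pred": 2},
    }
}

# The same comprehension as quoted above: for each bin, rescale the key back
# to an activation value and divide top_pred by count, sorted by activation.
neuron_act_top_pred_proportions = [
    dict(sorted((k / bin_granularity, v["top_pred"] / v["count"])
                for k, v in logit_bins.items()))
    for logit_bins in logit_diff_bins.values()
]

print(neuron_act_top_pred_proportions)
```

If that guess is right, each y-value would be the fraction of examples in a given activation bin where " an" was the top prediction, but I would appreciate confirmation.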
Thank you :)
Aaron