Posts

Comments

Comment by aaronsnoswell on We Found An Neuron in GPT-2 · 2023-02-23T01:49:17.956Z · LW · GW

Hello! A great write-up and a fascinating investigation. Well done on getting such a great result from a hackathon.

I'm trying to understand your plot titled 'Proportion of Top Predictions that are " an" by Layer 31 Neuron 892 Activation'. Can you explain what the y-axis is in this plot? It's not clear what the y-axis is a proportion of.

I read through the code, but couldn't quite follow the logic for this plot. It seems that the y-axis is computed with these lines:

neuron_act_top_pred_proportions = [dict(sorted([(k / bin_granularity, v["top_pred"] / v["count"])
                                                for k, v in logit_bins.items()])) for logit_bins in logit_diff_bins.values()]

But I'm not sure what the denominator v["count"] from within logit_bins corresponds to.
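
To make the question concrete, here is a minimal sketch of how I'm reading that expression. The toy data, the value of bin_granularity, the "example" key, and my guess that "count" is the number of samples falling in an activation bin while "top_pred" is how many of those had " an" as the top prediction are all assumptions on my part, not taken from your code.

```python
# Toy reconstruction of the expression quoted above.
# ASSUMPTIONS (hypothetical, for illustration only):
#   - bin_granularity = 2
#   - logit_diff_bins maps some label to {bin_index: {"count": ..., "top_pred": ...}}
#   - "count" = samples in the bin; "top_pred" = samples in the bin whose top prediction was " an"
bin_granularity = 2

logit_diff_bins = {
    "example": {
        0: {"count": 4, "top_pred": 1},
        2: {"count": 10, "top_pred": 5},
        1: {"count": 8, "top_pred": 2},
    }
}

# Same structure as the quoted lines: one dict per entry of logit_diff_bins,
# keyed by the rescaled bin index, with top_pred / count as the value.
neuron_act_top_pred_proportions = [
    dict(sorted((k / bin_granularity, v["top_pred"] / v["count"])
                for k, v in logit_bins.items()))
    for logit_bins in logit_diff_bins.values()
]

print(neuron_act_top_pred_proportions)
```

If that reading is right, the y-axis would be the fraction of samples in each activation bin for which " an" was the top prediction, but I may well be misreading it.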

Thank you :)

Aaron