How well can the GPT architecture solve the parity task?
post by FactorialCode · 2020-07-11T19:02:07.730Z · LW · GW · 1 comment
This is a question post.
Contents: Answers (gwern, 26) · 1 comment
Suppose I give it pairs of strings and ask it to output 1 if the number of 1s in the string is odd and 0 if it's even.
e.g.
0 -> 0
1 -> 1
11 -> 0
101 -> 0
1101 -> 1
10101001 -> 0
111000101110 -> 1
How well does it do on this task? What if we finetune it on sample data?
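For reference, here is a minimal sketch of the target function, following the convention the listed examples use (1 for an odd count of 1s). The names `parity` and `make_samples` are hypothetical, and the sample generator is only one assumed way of producing labelled lines for finetuning.

```python
import random

def parity(bits: str) -> int:
    """Return 1 if the string has an odd number of '1' characters, else 0."""
    return bits.count("1") % 2

def make_samples(n: int, max_len: int = 12, seed: int = 0) -> list[str]:
    """Generate n random 'bits -> label' lines (hypothetical finetuning format)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        length = rng.randint(1, max_len)
        bits = "".join(rng.choice("01") for _ in range(length))
        samples.append(f"{bits} -> {parity(bits)}")
    return samples

# Reproduce the examples from the question.
for s in ["0", "1", "11", "101", "1101", "10101001", "111000101110"]:
    print(f"{s} -> {parity(s)}")
```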
Answers
It does not, sad to say. I tried space-separating the digits to work around the BPE issue, and its usual completion is simply to copy the previous line. The log probs of the two possible completions are generally about 50:50 for 0/1, which suggests it isn't doing any parity counting.
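A rough sketch of the setup described above, under assumptions: the exact prompt used is not given, so the few-shot layout here is a guess, and `space_digits`, `build_prompt`, and `normalized_p1` are hypothetical helper names. No model API is called; the last helper only shows how two log probs for the completions "0" and "1" could be compared.

```python
import math

def space_digits(bits: str) -> str:
    """Put a space between digits so each digit is less likely to be merged by BPE."""
    return " ".join(bits)

def build_prompt(examples, query: str) -> str:
    """Build a few-shot prompt from (bits, label) pairs, ending with an unlabelled query."""
    lines = [f"{space_digits(bits)} -> {label}" for bits, label in examples]
    lines.append(f"{space_digits(query)} ->")
    return "\n".join(lines)

def normalized_p1(logp0: float, logp1: float) -> float:
    """Probability of '1' after renormalizing over just the two completions '0' and '1'."""
    p0, p1 = math.exp(logp0), math.exp(logp1)
    return p1 / (p0 + p1)

prompt = build_prompt([("11", 0), ("101", 0), ("1101", 1)], "10101001")
print(prompt)
```

A value of `normalized_p1` near 0.5 across many queries would match the 50:50 behaviour reported here.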
↑ comment by gwern · 2020-07-20T21:39:40.534Z · LW(p) · GW(p)
One interesting update: we've been increasingly unlocking GPT-3 solutions by rewriting problems as multi-step procedures. So parity might be doable by somewhat cheating and writing out the series of steps for computing the parity of each example: https://twitter.com/bucketofkets/status/1285100951271952384 https://twitter.com/Malcolm_Ocean/status/1285099206781341696
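As an illustration of what "writing out a series of steps" could look like for parity, here is a sketch that expands each example into an explicit digit-by-digit trace. The wording of the steps is an assumption and is not taken from the linked tweets; `parity_trace` is a hypothetical name.

```python
def parity_trace(bits: str) -> str:
    """Write out parity computation as explicit steps, one line per digit."""
    state = 0
    steps = [f"Input: {' '.join(bits)}", "Start with parity 0."]
    for i, b in enumerate(bits, start=1):
        if b == "1":
            state ^= 1  # a 1 flips the running parity
            steps.append(f"Digit {i} is 1, flip parity to {state}.")
        else:
            steps.append(f"Digit {i} is 0, parity stays {state}.")
    steps.append(f"Answer: {state}")
    return "\n".join(steps)

print(parity_trace("1101"))
```

Each few-shot example in the prompt would then carry its full trace, so the model only has to continue a one-digit-at-a-time pattern rather than produce the answer in a single step.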
1 comment
comment by Gurkenglas · 2020-07-11T19:34:48.902Z · LW(p) · GW(p)
If you try this, reformat to work around the BPE problem as detailed in https://www.gwern.net/GPT-3#bpes