A brainteaser for language models
post by Adam Scherlis (adam-scherlis) · 2022-12-12T02:43:53.410Z · LW · GW · 3 comments
I came up with the following puzzle the other day:
Q: Solve the puzzle: 63 = x = 65536
A: x =
The intended answer is in the form of a number.
text-davinci-003 guesses my intended answer at 11.8% probability, which is the second-highest probability for any answer.
(This is somewhat cherry-picked; small changes to the phrasing give worse results. ChatGPT gave the intended answer the third time I asked it, but this appears to have been dumb luck. The true rate for ChatGPT is probably below 10%, and maybe below 5%.)
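(For reference, here's roughly how a probability like this can be read off. This is a minimal sketch using the legacy pre-1.0 openai Python package; text-davinci-003 has since been deprecated, and the exact prompt formatting, with no trailing space, is my assumption based on Lawrence's comment below.)

    import math
    import openai

    # Ask text-davinci-003 for its top candidate first tokens after the puzzle.
    prompt = "Q: Solve the puzzle: 63 = x = 65536\nA: x ="
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=1,
        temperature=0,
        logprobs=5,  # also return the top 5 tokens with their log probabilities
    )
    top = response["choices"][0]["logprobs"]["top_logprobs"][0]
    for token, logprob in sorted(top.items(), key=lambda kv: -kv[1]):
        print(repr(token), f"{math.exp(logprob):.1%}")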
So far, friends have found it fairly difficult. About two dozen people made at least one guess, and at least six spent a while on it. Two people have figured it out, in both cases after being told that GPT-3.5 could do it.
You may want to try to solve it yourself before reading on.
Here are some progressively-stronger hints:
The answer is a string of ordinary decimal digits; it's not "NaN" or "/" or "63; 65536".
The fact that 63 = 2^6 - 1 is a red herring and has no relevance. Sorry.
Number bases other than decimal are not relevant.
The answer depends on the digit strings in base ten as well as the value of the numbers.
The answer depends on an accidental feature of some of GPT's training data.
The accidental feature is a formatting bug.
Here's the answer without explanation:
x = 216
And here's the explanation: 6^3 = 216 and 2^16 = 65536. Render those exponents as superscripts and then lose the formatting, and they become "63 = 216" and "216 = 65536".
Why can GPT-3.5 solve this? My guess:
Superscripts are often lost when rich text is converted to plain text, producing sentences like "the Sun's mass is about 1030 kilograms" (demonstrated in the snippet below). This problem affects a lot of GPT-3's training data.
It appears that GPT is somewhat convinced, on some level, that "63 = 216" and "216 = 65536" are plausible facts.
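To make the bug concrete, here's a toy demonstration (my own illustration, not from the post) of how naive tag stripping turns superscripted exponents into the garbled "facts" above:

    import re

    # Naively delete <sup> tags from rich text; the exponent marker is lost.
    rich = "6<sup>3</sup> = 216 and 2<sup>16</sup> = 65536"
    plain = re.sub(r"</?sup>", "", rich)
    print(plain)  # "63 = 216 and 216 = 65536"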
Some successful attempts to double-check this understanding (a reproduction sketch follows the list):
Replacing 63 with nearby numbers causes GPT to give much lower probabilities for the answer "216". (Except for 62, which gives similar results to 63...)
Numbers near 216 get dramatically (20x-1000x) lower probabilities than 216 does.
Even without "= 65536", 216 is favored ~100x over nearby numbers.
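Here's a sketch of how the first of these comparisons could be reproduced, again assuming the legacy pre-1.0 openai package and my guess at the prompt format:

    import math
    import openai

    # How does the probability of " 216" change as 63 is replaced by neighbors?
    for lhs in (61, 62, 63, 64, 65):
        prompt = f"Q: Solve the puzzle: {lhs} = x = 65536\nA: x ="
        response = openai.Completion.create(
            model="text-davinci-003",
            prompt=prompt,
            max_tokens=1,
            temperature=0,
            logprobs=5,  # top 5 candidate tokens with log probabilities
        )
        top = response["choices"][0]["logprobs"]["top_logprobs"][0]
        p = math.exp(top[" 216"]) if " 216" in top else 0.0
        print(lhs, f"{p:.1%}")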
Some wrinkles / contrary evidence:
The top result, above "216", is the string "2^16 = 65536" -- but this is still the top output after changing 63 to other numbers, so it doesn't seem to be inspired by the number 216.
GPT gives the token "216" in the string "63 = 216" a very low probability, just as low as "215" or "217".
Replacing "63" with "62" in the prompt still gives "216" as an output with ~10% probability.
As mentioned above, this doesn't work as well if you tweak the prompt in irrelevant ways, or ask a different model.
Thanks to everyone who tried to solve my puzzle, congrats to Anton and Eli for solving it, and thanks to Georgia Ray for making the joke that inspired the puzzle.
3 comments
comment by LawrenceC (LawChan) · 2022-12-12T06:18:21.961Z · LW(p) · GW(p)
Here's additional evidence for your guess:
text-davinci-003 completes "There are 210 byte in a kilobyte. That means there are" with "1,024" ~65.7% of the time.

text-davinci-003 completes "There are about 8x109 people on earth. This implies that there are" with "approximately 8 billion" ~57.1% of the time.
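(The comment doesn't say how these frequencies were measured; one way to estimate them is to sample many completions and count matches. A sketch, again assuming the legacy pre-1.0 openai package:)

    import openai

    prompt = "There are 210 byte in a kilobyte. That means there are"
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=5,
        temperature=1.0,
        n=100,  # draw 100 independent samples
    )
    hits = sum(c["text"].strip().startswith("1,024") for c in response["choices"])
    print(f"{hits}/100 completions start with 1,024")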
comment by TekhneMakre · 2022-12-12T06:00:04.840Z · LW(p) · GW(p)
GPT gives the token "216" in the string "63 = 216" a very low probability, just as low as "215" or "217".
Replacing "63" with "62" in the prompt still gives "216" as an output with ~10% probability.
Would the tokenizer behave differently given "216" and "2^16", e.g. giving respectively the token "216" and some tokens like "**2" and "16*"? That would explain this: GPT knows, of course, that 216 isn't 63, but it's been forced to predict a relationship like "**2" + "16*" = "**63*".
comment by LawrenceC (LawChan) · 2022-12-12T06:10:51.852Z · LW(p) · GW(p)
The Codex tokenizer used by the GPT-3.5 models tokenizes them differently: "216" is 1 token, "2^16" is 3 ("2", "^", "16"). Note that " 216" (with a space) is a different token, and it's what text-davinci-003 actually really wants to predict (you'll often see 100x probability ratios between these two tokens).
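This is easy to check with the tiktoken package, which maps text-davinci-003 to the p50k_base encoding; the expected token counts below are per the claim above:

    import tiktoken

    enc = tiktoken.encoding_for_model("text-davinci-003")  # p50k_base
    print(len(enc.encode("216")))   # 1 token
    print(len(enc.encode("2^16")))  # 3 tokens: "2", "^", "16"
    print(len(enc.encode(" 216")))  # also 1 token, but a different one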
Here's the log probs of the two sequences using Adam's prompt above, with the trailing space removed (which is what he did in his actual setup; otherwise you get different probabilities):

    "216"  -> -15.91
    "2^16" -> -1.34
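Sequence log probs like these can be computed with the legacy Completions API's echo trick: score the prompt plus a candidate continuation with max_tokens=0 and sum the log probabilities of the continuation's tokens. A sketch (the exact spacing of the continuations in Lawrence's setup is my assumption):

    import openai

    def sequence_logprob(prompt: str, continuation: str) -> float:
        response = openai.Completion.create(
            model="text-davinci-003",
            prompt=prompt + continuation,
            max_tokens=0,  # generate nothing; just score the given text
            echo=True,     # return logprobs for the prompt's own tokens
            logprobs=0,
        )
        lp = response["choices"][0]["logprobs"]
        # Keep only tokens starting at or after the continuation boundary.
        return sum(
            tlp
            for off, tlp in zip(lp["text_offset"], lp["token_logprobs"])
            if off >= len(prompt)
        )

    prompt = "Q: Solve the puzzle: 63 = x = 65536\nA: x ="
    print(sequence_logprob(prompt, " 216"))
    print(sequence_logprob(prompt, " 2^16"))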