What are the most important papers/posts/resources to read to understand more about GPT-3?

post by adamShimi · 2020-08-02T20:53:30.913Z · score: 25 (12 votes) · LW · GW · No comments

This is a question post.

Contents

  Answers
    13 Peter Jin
    7 Juraj Vitko

I'm far more used to thinking about weird maths, distributed algorithms, or abstract philosophical problems than about concrete machine learning architectures. But based on everything I see about GPT-3, it seems like a good idea to learn more about it, if only to take part in the discussion without spouting nonsense.

So I'm asking what you think are the must-reads on GPT-3 specifically, and maybe any prerequisites for understanding them.

Answers

answer by Peter Jin · 2020-08-03T01:26:49.643Z · score: 13 (8 votes) · LW(p) · GW(p)

nostalgebraist's blog is a must-read regarding GPT-x, including GPT-3. Perhaps start here ("the transformer... 'explained'?"), which helps to contextualize GPT-x within the history of machine learning.

(Though, I should note that nostalgebraist holds a contrarian "bearish" position on GPT-3 in particular; for the "bullish" case instead, read Gwern.)

comment by adamShimi · 2020-08-03T18:39:01.546Z · score: 3 (2 votes) · LW(p) · GW(p)

Thanks for the answer! I knew about the "transformer explained" post, but I was not aware of its author's position on GPT-3.

answer by Juraj Vitko · 2020-08-04T10:51:54.248Z · score: 7 (3 votes) · LW(p) · GW(p)

Here's a list of resources that may be of use to you. The GPT-3 paper isn't too specific on implementation details because the changes that led to it were rather incremental (especially from GPT-2, and more so the farther back we look in the Transformer lineage). So the scope of what you need to read to understand GPT-3 is broader than one might expect.

comment by adamShimi · 2020-08-10T15:03:22.134Z · score: 1 (1 votes) · LW(p) · GW(p)

Thanks! I'll try to read that.
