Comment by Juraj Vitko (youurayy) on "What are the most important papers/post/resources to read to understand more of GPT-3?" · 2020-08-04T10:51:54.248Z
Here's a list of resources that may be of use to you. The GPT-3 paper isn't very specific about implementation details because the changes that led to it were rather incremental (especially from GPT-2, and more so the farther back we look at the Transformer lineage). As a result, understanding GPT-3 takes broader reading than one might expect.
- https://github.com/jalammar/jalammar.github.io/blob/master/notebooks/nlp/01_Exploring_Word_Embeddings.ipynb
- http://www.peterbloem.nl/blog/transformers
- http://jalammar.github.io/illustrated-transformer/
- https://amaarora.github.io/2020/02/18/annotatedGPT2.html
- http://jalammar.github.io/illustrated-gpt2/
- http://jalammar.github.io/how-gpt3-works-visualizations-animations/
- https://arxiv.org/pdf/1409.0473.pdf Attention (initial)
- https://arxiv.org/pdf/1706.03762.pdf Attention Is All You Need
- http://nlp.seas.harvard.edu/2018/04/03/attention.html (annotated)
- https://www.arxiv-vanity.com/papers/1904.02679/ Visualizing Attention
- https://stats.stackexchange.com/questions/421935/what-exactly-are-keys-queries-and-values-in-attention-mechanisms
- https://arxiv.org/pdf/1807.03819.pdf Universal Transformers
- https://arxiv.org/pdf/2007.14062.pdf Big Bird (see appendices)
- https://www.reddit.com/r/MachineLearning/comments/hxvts0/d_breaking_the_quadratic_attention_bottleneck_in/
- https://www.tensorflow.org/tutorials/text/transformer
- https://www.tensorflow.org/tutorials/text/nmt_with_attention
- https://cdn.openai.com/blocksparse/blocksparsepaper.pdf
- https://openai.com/blog/block-sparse-gpu-kernels/
- https://github.com/pbloem/former/blob/master/former/transformers.py
- https://github.com/openai/blocksparse/blob/master/examples/transformer/enwik8.py
- https://github.com/google/trax/blob/master/trax/models/transformer.py
- https://github.com/huggingface/transformers/blob/master/src/transformers/modeling_gpt2.py