Levelling Up in AI Safety Research Engineering

post by Gabe M (gabe-mukobi) · 2022-09-02T04:59:42.699Z · LW · GW · 9 comments

Contents

  Introduction
  Level 1: AI Safety Fundamentals
  Level 2: Software Engineering
  Level 3: Machine Learning
  Level 4: Deep Learning
  Level 5: Understanding Transformers
  Level 6: Reimplementing Papers
  Level 7: Original Experiments
  Epilogue: Risks
    Capabilities
    Difficulty
    Early Over-Specialization
    Late Over-Generalization
  Sources
9 comments

Summary: A level-based guide for independently up-skilling in AI Safety Research Engineering that aims to give concrete objectives, goals, and resources to help anyone go from zero to hero.

Cross-posted to the EA Forum [EA · GW]. View a pretty Google Docs version here.

Introduction

I think great [EA · GW] career guides are really useful for guiding and structuring the learning journey of people new to a technical field like AI Safety. I also like role-playing games. Here’s my attempt to use levelling frameworks and break up one possible path from zero to hero in Research Engineering for AI Safety (e.g. jobs with the “Research Engineer” title) through objectives, concrete goals, and resources. I hope this kind of framework makes it easier to see where one is on this journey, how far they have to go, and some options to get there.

I’m mostly making this to sort out my own thoughts about my career development and how I’ll support other students through Stanford AI Alignment [EA · GW], but hopefully, this is also useful to others! Note that I assume some interest in AI Safety Research Engineering—this guide is about how to up-skill in Research Engineering, not why (though working through it should be a great way to test your fit). Also note that there isn’t much abstract advice in this guide (see the end for links to guides with advice), and the goal is more to lay out concrete steps you can take to improve.

For each level, I describe the general capabilities of someone at the end of that level, some object-level goals to measure that capability, and some resources to choose from that would help get there. The categories of resources within a level are listed in the order you should progress, and resources within a category are roughly ordered by quality. There’s some redundancy, so I would recommend picking and choosing between the resources rather than doing all of them. Also, if you are a student and your university has a good class on one of the below topics, consider taking that instead of one of the online courses I listed.

As a very rough estimate, I think each level should take at least 100-200 hours of focused work, for a total of 700-1400 hours. At 10 hours/week (quarter-time), that comes to around 16-32 months of study but can definitely be shorter (e.g. if you already have some experience) or longer (if you dive more deeply into some topics)! I think each level is about evenly split between time spent reading/watching and time spent building/testing, with more reading earlier on and more building later.
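The arithmetic behind that estimate can be sketched in a few lines of Python. The 7 levels and the 100-200 hours per level come from this guide; the conversion of 52/12 weeks per month and the function name `months_of_study` are assumptions made for illustration.

```python
# Rough study-time estimate using the figures from this guide.
LEVELS = 7
HOURS_PER_LEVEL = (100, 200)  # low and high estimate per level
WEEKS_PER_MONTH = 52 / 12     # assumed conversion, about 4.33


def months_of_study(hours_per_week):
    """Return (low, high) months needed to work through all levels."""
    low, high = (
        LEVELS * hours / hours_per_week / WEEKS_PER_MONTH
        for hours in HOURS_PER_LEVEL
    )
    return low, high


low, high = months_of_study(hours_per_week=10)
print(f"{low:.0f}-{high:.0f} months")  # 16-32 months
```

Plugging in your own weekly hours gives a quick sense of how the timeline shifts, for example roughly half as long at 20 hours/week.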

Confidence: mid-to-high. I am not yet an AI Safety Research Engineer (but I plan to be)—this is mostly a distillation of what I’ve read from other career guides (linked at the end) and talked about with people working on AI Safety. I definitely haven’t done all these things, just seen them recommended. I don’t expect this to be the “perfect” way to prepare for a career in AI Safety Research Engineering, but I do think it’s a very solid way. 

Level 1: AI Safety Fundamentals

Objective

You are familiar with the basic arguments for existential risks due to advanced AI, models for forecasting AI advancements, and some of the past and current research directions within AI alignment/safety. You have an opinion on how much you buy these arguments and whether you want to keep exploring AI Safety Research Engineering.

Why this first? Exposing yourself to these fundamental arguments and ideas is useful for testing your fit for AI Safety generally, but that isn’t to say you should “finish” this Level first and move on. Rather, you should be coming back to these readings and keeping up to date with the latest work in AI Safety throughout your learning journey. It’s okay if you don’t understand everything on your first try—Level 1 kind of happens all the time.

Goals

Resources

  1. AI Safety Reading Curriculum (Choose 1)
    1. 2022 AGI Safety Fundamentals alignment curriculum - EA Cambridge
    2. CUEA Standard AI Safety Reading Group Syllabus - Columbia EA
    3. Intro to ML Safety Course - CAIS
    4. AGI safety from first principles - Richard Ngo [? · GW]
  2. Additional Resources
    1. AI Alignment Forum
    2. Alignment Newsletter - Rohin Shah
    3. The Alignment Problem - Brian Christian

Level 2: Software Engineering

Objective

You can program in Python at the level of an introductory university course. You also know some other general software engineering tools/skills like the command line, Git/GitHub, documentation, and unit testing.

Why Python? Modern Machine Learning work, and thus AI Safety work, is almost entirely written in Python. Python is also an easier language for beginners to pick up, and there are plenty of resources for learning it.
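To make the documentation and unit-testing parts of this objective concrete, here is a minimal sketch of a documented function together with tests for it. The function `moving_average` and the test class are made-up examples for illustration, not taken from any of the resources below; the tests use Python's standard `unittest` module.

```python
import unittest


def moving_average(values, window):
    """Return the simple moving average of `values`.

    Args:
        values: A sequence of numbers.
        window: Number of consecutive items to average; must be >= 1.

    Returns:
        A list with len(values) - window + 1 averages (empty if the
        window is longer than the input).

    Raises:
        ValueError: If `window` is smaller than 1.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


class MovingAverageTest(unittest.TestCase):
    def test_basic(self):
        self.assertEqual(moving_average([1, 2, 3, 4], 2), [1.5, 2.5, 3.5])

    def test_window_longer_than_input(self):
        self.assertEqual(moving_average([1, 2], 5), [])

    def test_invalid_window(self):
        with self.assertRaises(ValueError):
            moving_average([1, 2, 3], 0)
```

Saved under a hypothetical filename such as `test_moving_average.py`, the tests can be run from the command line with `python -m unittest test_moving_average.py`.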

Goals

Resources

  1. Python Programming (Choose 1-2)
    1. www.learnpython.org
    2. Learn Python 3 - Codecademy
    3. Scientific Computing with Python Certification - freeCodeCamp
    4. CS50's Introduction to Programming with Python - Harvard University
  2. Scientific Python (Choose 1-2)
    1. Data Analysis with Python Certification - freeCodeCamp
    2. Learn Python for Data Science, Structures, Algorithms, Interviews - Udemy
  3. Command Line (Choose 1-3)
    1. Learning the shell - LinuxCommand.org
    2. Linux Command Line Basics - Udacity
    3. The Unix Shell - Software Carpentry
  4. Git/GitHub (Choose 2+)
    1. GitHub Tutorial - Beginner's Training Guide - Anson Alexander
    2. Git Immersion
    3. GitHub Skills
    4. Version Control with Git - Software Carpentry
    5. first-contributions - contribute to open source projects
  5. Documentation (Choose 1-2)
    1. Documenting Python Code: A Complete Guide - Real Python
    2. Documenting Python Code: How to Guide - DataCamp
  6. Unit Testing (Choose 1-3)
    1. Getting Started With Testing in Python - Real Python
    2. A Gentle Introduction to Unit Testing in Python - Machine Learning Mastery
    3. Unit Testing in Python Tutorial - DataCamp
  7. Additional Resources
    1. Things I Wish Someone Had Told Me When I Was Learning How to Code
    2. The Hitchhiker’s Guide to Python!
    3. Cracking the Coding Interview - Gayle Laakmann McDowell
    4. Online Programming Learning Platform - LeetCode
    5. Challenging mathematical/computer programming problems - Project Euler
    6. 100_Numpy_exercises - rougier
    7. The Good Research Code Handbook - Good Research Code

Level 3: Machine Learning

Objective

You have the mathematical context necessary for understanding Machine Learning (ML). You know the differences between supervised and unsupervised learning and between classification and regression. You understand common models like linear regression, logistic regression, neural networks, decision trees, and clustering, and you can code some of them in a library like PyTorch or JAX. You grasp core ML concepts like loss functions, regularization, bias/variance, optimizers, metrics, and error analysis.

Why so much math? Machine learning at its core is basically applied statistics and multivariable calculus. It used to be that you needed to know this kind of math really well, but now with techniques like automatic differentiation, you can train neural networks without knowing much of what’s happening under the hood. These foundational resources are included for completeness, but you can probably spend a lot less time on math (e.g. the first few sections of each course) depending on what kind of engineering work you intend to do. You might want to come back and improve your math skills for understanding certain work in Levels 6-7, though, and if you find this math really interesting, you might be a good fit for theoretical AI alignment research.
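As a small illustration of how the concepts in this level fit together (a model, a loss function, regularization, and an optimizer), here is a sketch of linear regression trained by gradient descent in plain Python. The toy data, the function names `loss` and `fit`, and the hyperparameters are all arbitrary choices for this example. The gradients are derived by hand here; a library like PyTorch or JAX would compute them automatically.

```python
# Toy data generated from y = 2x + 1 (no noise).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]


def loss(w, b, l2=0.0):
    """Mean squared error plus an L2 penalty on the weight."""
    mse = sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    return mse + l2 * w ** 2


def fit(steps=2000, lr=0.05, l2=0.0):
    """Plain gradient descent on `loss`, with hand-derived gradients."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of the loss with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n + 2 * l2 * w
        grad_b = 2 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


w, b = fit()
print(round(w, 3), round(b, 3))  # 2.0 1.0
```

Calling `fit(l2=1.0)` pulls the learned weight below 2, which is the shrinking effect regularization is meant to have.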

Goals

Resources

  1. Basic Calculus (Choose 1)
    1. Essence of calculus - 3Blue1Brown
  2. Probability (Choose 1)
    1. Probability - The Science of Uncertainty and Data - MIT
    2. Introduction to Probability - Harvard
    3. Part I: The Fundamentals | Introduction to Probability - MIT OpenCourseWare
  3. Linear Algebra (Choose 1)
    1. Essence of linear algebra - 3Blue1Brown
    2. Linear Algebra - MIT OpenCourseWare
    3. Georgia Tech’s course (parts 1, 2, 3, 4)
    4. Linear Algebra - Foundations to Frontiers - edX
    5. Linear Algebra Done Right - Sheldon Axler
  4. Multivariable Calculus (Choose 1)
    1. Multivariable Calculus - MIT OpenCourseWare
    2. Multivariable Calculus - Khan Academy
    3. The Matrix Calculus You Need For Deep Learning - explained.ai
    4. Mathematics for Machine Learning - Imperial College London
  5. Introductory Machine Learning (Choose 1-2)
    1. Course 6.036 - MIT Open Learning Library
    2. Machine Learning by Stanford University - Coursera
    3. Introduction to Machine Learning - Udacity
    4. Introduction to Machine Learning - Google
    5. Advanced Introduction to Machine Learning - CMU
    6. Supervised Machine Learning: Regression and Classification - DeepLearning.AI
  6. Additional Resources
    1. pytorch_exercises - Kyubyong
    2. Writing better code with pytorch+einops - einops
    3. Contemporary ML for Physicists - Jared Kaplan

Level 4: Deep Learning

Objective

You’ve dived deeper into Deep Learning (DL) through the lens of at least one subfield such as Natural Language Processing (NLP), Computer Vision (CV), or Reinforcement Learning (RL). You now have a better understanding of ML fundamentals, and you’ve reimplemented some core ML algorithms “from scratch.” You’ve started to build a portfolio of DL projects you can show others.
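To give a flavor of what reimplementing a core algorithm "from scratch" can look like, here is a minimal sketch of reverse-mode automatic differentiation on scalars. It is written in the general spirit of small autograd projects and is not the code of any resource listed in this guide; the class name `Value` and its attributes are choices made for this example, and only addition and multiplication are supported.

```python
class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        """Apply the chain rule from this node back to every input."""
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)  # topological order: inputs first, output last
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


x, y = Value(3.0), Value(4.0)
z = x * y + x * x  # dz/dx = y + 2x = 10, dz/dy = x = 3
z.backward()
print(x.grad, y.grad)  # 10.0 3.0
```

Gradients are accumulated with `+=` so that a value used more than once (like `x` above) receives the sum of its contributions.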

Goals

Resources

  1. General Deep Learning (Choose 1)
    1. Practical Deep Learning for Coders - fast.ai
    2. Deep Learning by deeplearning.ai - Coursera
    3. Deep Learning - NYU
    4. PyTorch Tutorials
    5. Deep Learning for Computer Vision - UMich (Lectures 1-13 only)
    6. Deep Learning Online Training Course - Udacity
    7. Neural networks and deep learning - Michael Nielsen
  2. Advanced Machine Learning
    1. Studying (Choose 1-2)
      1. Backpropagation - CS231n Convolutional Neural Networks for Visual Recognition
      2. A Recipe for Training Neural Networks - Andrej Karpathy
    2. Implementing (Choose 1)
      1. MiniTorch (reimplement the core of PyTorch, self-study tips here)
      2. building micrograd - Andrej Karpathy
      3. Autodidax: JAX core from scratch
  3. Natural Language Processing (Choose 1 Or Another Sub-Field)
    1. Stanford CS 224N | Natural Language Processing with Deep Learning (lecture videos here)
    2. CS224U: Natural Language Understanding - Stanford University
    3. Week 12 Lecture: Deep Learning for Natural Language Processing (NLP) - NYU
    4. A Code-First Introduction to Natural Language Processing - fast.ai
  4. Computer Vision (Choose 1 Or Another Sub-Field)
    1. Deep Learning for Computer Vision - UMich
    2. CS231n: Deep Learning for Computer Vision - Stanford University (lecture videos here)
  5. Reinforcement Learning (Choose 1 Or Another Sub-Field)
    1. Spinning Up in Deep RL - OpenAI
    2. Deep Reinforcement Learning Class - Hugging Face
    3. Deep Reinforcement Learning: CS 285 Fall 2021 - UC Berkeley
    4. Deep Reinforcement Learning: Pong from Pixels - Andrej Karpathy
    5. DeepMind x UCL | Deep Learning Lecture Series 2021 - DeepMind
    6. Reinforcement Learning - University of Alberta
  6. Additional Resources
    1. Neural networks - 3Blue1Brown
    2. Cloud Computing Basics (Cloud 101) - Coursera
    3. Learn Cloud Computing with Online Courses, Classes, & Lessons - edX

Level 5: Understanding Transformers

Objective

You have a principled understanding of self-attention, cross-attention, and the general transformer architecture along with some of its variants. You are able to write a transformer like BERT or GPT-2 “from scratch” in PyTorch or JAX (a skill I believe Redwood Research looks for), and you can use resources like 🤗 Transformers to work with pre-trained transformer models. Through experimenting with deployed transformer models, you have a decent sense of what transformer-based language and vision models can and cannot do.

Why transformers? The transformer architecture is currently the foundation for state-of-the-art (SOTA) results on most deep learning benchmarks, and it doesn’t look like it’s going away soon. Much of the newest ML research involves transformers, so AI Safety organizations that focus on prosaic AI alignment [AF · GW] or conduct research on current models practically all work with transformers.
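As a rough sketch of the core operation behind self-attention, the following plain-Python function computes scaled dot-product attention, softmax(QK^T / sqrt(d_k))V, with an optional causal mask as used in GPT-style models. It is a simplification: it leaves out the learned query/key/value projections, multiple heads, and batching that a real transformer has, and the function names and toy inputs are choices made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values, causal=False):
    """Scaled dot-product attention over lists of vectors, one per token.

    With causal=True, token i only attends to tokens 0..i.
    """
    d_k = len(keys[0])
    outputs = []
    for i, q in enumerate(queries):
        visible = range(i + 1) if causal else range(len(keys))
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d_k)
            for j in visible
        ]
        weights = softmax(scores)
        # Each output is a weighted average of the visible value vectors.
        outputs.append([
            sum(w * values[j][c] for w, j in zip(weights, visible))
            for c in range(len(values[0]))
        ])
    return outputs


# Self-attention: queries, keys, and values all come from the same tokens.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = attention(tokens, tokens, tokens, causal=True)
print(out[0])  # [1.0, 0.0]: the first token can only attend to itself
```

Passing a different sequence as `keys` and `values` than as `queries` turns the same function into cross-attention.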

Goals

Resources

  1. Experiment With Deployed Transformers (Choose 1-3)
    1. OpenAI Playground (GPT-3) - OpenAI
    2. Elicit: The AI Research Assistant - Ought
    3. DALL·E 2 - OpenAI, Stable Diffusion, or Craiyon (see The DALL·E 2 prompt book)
    4. Codex - OpenAI
  2. Study The Transformer Architecture (Choose 2-3)
    1. Attention Is All You Need - Vaswani et al. (Sections 1-3)
    2. Lectures 8, 9, and optionally 10 from CS224N - Stanford University
    3. The Illustrated Transformer - Jay Alammar
    4. Formal Algorithms for Transformers - DeepMind
    5. The Illustrated GPT-2 - Jay Alammar
    6. the transformer ... “explained”? - nostalgebraist
    7. The Annotated Transformer - Harvard
  3. Using 🤗 Transformers (Choose 1-2)
    1. Hugging Face Course
    2. CS224U: Natural Language Understanding - Stanford University (Supervised Sentiment Analysis unit only)
  4. Implement Transformers From Scratch (Choose 1-2)
    1. MLAB-Transformers-From-Scratch - Redwood Research (refactored by Gabriel Mukobi)
    2. deep_learning_curriculum/1-Transformers - Jacob Hilton
  5. Compare Your Code With Other Implementations
    1. BERT (Choose 1-3)
      1. pytorchic-bert/models.py - dhlee347 (PyTorch)
      2. BERT - Google Research (TensorFlow)
      3. How to Code BERT Using PyTorch - neptune.ai (PyTorch)
      4. nlp-tutorial/BERT.py - graykode (PyTorch)
      5. Transformer-Architectures-From-Scratch/BERT.py - ShivamRajSharma (PyTorch)
    2. GPT-2 (Choose 1-3)
      1. Transformer-Architectures-From-Scratch/GPT_2.py - ShivamRajSharma (PyTorch)
      2. gpt-2/model.py - openai (TensorFlow)
      3. minGPT/model.py - Andrej Karpathy (PyTorch)
      4. The Annotated GPT-2 - Aman Arora (PyTorch)
  6. Additional Resources
    1. Study Transformers More
      1. How to sample from language models - Ben Mann
      2. Neural Scaling Laws and GPT-3 - Jared Kaplan - OpenAI
      3. Transformer-Models-from-Scratch - Hongbin Chen (PyTorch)
    2. Other Transformer Models You Could Implement
      1. Original Encoder-Decoder Transformer (impl. Transformers from scratch - Peter Bloem)
      2. ViT, PERFORMER (impl. Transformer-Architectures-From-Scratch - ShivamRajSharma)
      3. CLIP: Connecting Text and Images - OpenAI (impl. openai/CLIP)

Level 6: Reimplementing Papers

Objective

You can read a recently published AI research paper and efficiently implement the core technique it presents to validate its results or build upon the research. You also have a good sense of the latest ML/DL/AI Safety research. You’re pretty damn employable now, so if you haven’t started applying for Research Engineering jobs/internships, consider getting on that!

Why papers? I talked with research scientists or engineers from most of the empirical AI Safety organizations (i.e. Redwood Research, Anthropic, Conjecture, Ought, CAIS, Encultured AI, DeepMind), and they all said that being able to read a recent ML/AI research paper and efficiently implement it is both a signal of a strong engineering candidate and a good way to build useful skills for actual AI Safety work.

Goals

Resources

  1. How to Read Computer Science Papers (Choose 1-3)
    1. How to Read a Paper - S. Keshav
    2. How to Read Research Papers: A Pragmatic Approach for ML Practitioners - NVIDIA
    3. Career Advice / Reading Research Papers - Stanford CS230: Deep Learning - Andrew Ng
  2. How to Implement Papers (Choose 2-4)
    1. Lessons Learned Reproducing a Deep Reinforcement Learning Paper - Amid Fish
    2. Advice on paper replication - Richard Ngo [EA · GW]
    3. ML engineering for AI Safety & robustness: a Google Brain engineer's guide to entering the field - 80,000 Hours
    4. A Recipe for Training Neural Networks - Andrej Karpathy
  3. Implement Papers (Choose 5+, look beyond these)
    1. General Lists
      1. Machine Learning Reading List - Ought
      2. Some fun machine learning engineering projects that I would think are cool - Buck Shlegeris
    2. Interpretability
      1. Thread: Circuits - Olah et al.
      2. A Survey on Neural Network Interpretability - Zhang et al.
      3. Post-hoc Interpretability for Neural NLP: A Survey - Madsen et al.
      4. Locating and Editing Factual Associations in GPT (ROME) - Meng et al.
    3. Robustness/Anomaly Detection
      1. Agreement-on-the-Line: Predicting the Performance of Neural Networks under Distribution Shift - Baek et al. 
    4. Value/Preference Learning
      1. Deep Learning from Human Preferences - OpenAI
      2. Fine-Tuning Language Models from Human Preferences - OpenAI
    5. Reinforcement Learning
      1. Key Papers in Deep RL - OpenAI

Level 7: Original Experiments

Objective

You can now efficiently grasp the results of AI research papers and come up with novel research questions to ask as well as empirical ways to answer them. You might already have a job at an AI Safety organization and have picked up these skills as you got more Research Engineering experience. If you can generate and test these original experiments particularly well, you might consider Research Scientist roles, too. You might also want to apply for AI residencies or Ph.D. programs to explore some research directions further in a more structured academic setting.

Goals

Resources

  1. Research Advice
    1. Research Taste Exercises - Chris Olah
    2. How I Formed My Own Views About AI Safety - Neel Nanda
    3. An Opinionated Guide to ML Research - John Schulman
    4. Personal Rules of Productive Research - Eugene Vinitsky
  2. General Lists of Open Questions to Start Researching
    1. AI Safety Ideas - Apart Research
    2. Open Problems in AI X-Risk - CAIS [AF · GW]
    3. Random, Assorted AI Safety Ideas - Evan Hubinger [AF(p) · GW(p)]
    4. Some conceptual alignment research projects - Richard Ngo [LW · GW]
  3. Open Questions in Interpretability
    1. Ten experiments in modularity, which we'd like you to run! - TheMcDouglas, Lucius Bushnaq, Avery [LW · GW]
    2. A Mechanistic Interpretability Analysis of Grokking#Future Directions - Neel Nanda, Tom Lieberum [AF · GW]
  4. Open Questions in Robustness/Anomaly Detection
    1. Benchmark for successful concept extrapolation/avoiding goal misgeneralization - AlignedAI [LW · GW]
    2. Neural Trojan Attacks and How You Can Help - Sidney Hough
  5. Open Questions in Adversarial Training
    1. Final Project Guidelines - MLSS

Epilogue: Risks

Embarking on this quest brings with it a few risks. By keeping these in mind, you may be less likely to fail in these ways:

Capabilities

Difficulty

Early Over-Specialization

Late Over-Generalization

Sources

Here are some of the other great career guides and resources I used in the making of this. Most of the guides here also have good general advice that would be useful to read even if you don’t do the other things they suggest. Consider checking them out!

Many thanks to Jakub Nowak, Peter Chatain, Thomas Woodside, Erik Jenner, Jacy Reese Anthis, and Konstantin Pilz for review and suggestions!

9 comments

Comments sorted by top scores.

comment by Yonatan Cale (yonatan-cale-1) · 2022-09-10T18:47:07.846Z · LW(p) · GW(p)

This looks like a guide for [working in a company that already has a research agenda, and doing engineering work for them based on what they ask for] and not for [trying to come up with a new research direction that is better than what everyone else is doing], right?

Replies from: gabe-mukobi
comment by Gabe M (gabe-mukobi) · 2022-09-12T01:03:55.688Z · LW(p) · GW(p)

Mostly, yes, that's right. The exception is in Level 7: Original Experiments which suggests several resources for forming an inside view and coming up with new research directions, but I think many people could get hired as research engineers before doing that stuff (though maybe they do that stuff while working as a research engineer and that leads them to come up with new better research directions).

comment by zeshen · 2022-09-02T10:16:12.630Z · LW(p) · GW(p)

This is a great guide - thank you. However, in my experience as someone completely new to the field, 100-200 hours on each level is very optimistic. I've easily spent double/triple the duration on the first two levels and still haven't gotten to a comfortable level.

Replies from: gabe-mukobi
comment by Gabe M (gabe-mukobi) · 2022-09-02T16:37:38.191Z · LW(p) · GW(p)

Thanks, yeah that's a pretty fair sentiment. I've changed the wording to "at least 100-200 hours," but I guess the idea was more to present a very efficient way of learning things that maybe 80/20's some of the material. This does mean there will be more to learn—rather than these being strictly linear progression levels, I imagine someone continuously coming back to AI safety readings and software/ML engineering skills often throughout their journey, as it sounds like you have.

comment by Rohin Shah (rohinmshah) · 2022-09-03T14:33:43.031Z · LW(p) · GW(p)

I'd have Level 1 (AI Safety Fundamentals) be Level 4 or 5, probably. I'm pretty happy to hire engineers who have good ML skills but are rusty on "AI safety fundamentals"; I think they can pick that up on the job much more easily than the coding / ML skills.

Replies from: gabe-mukobi, Jay Bailey
comment by Gabe M (gabe-mukobi) · 2022-09-03T16:00:09.386Z · LW(p) · GW(p)

Interesting, that is the level that feels most like it doesn't have a solid place in a linear progression of skills. I wrote "Level 1 kind of happens all the time" to try to reflect this, but I ultimately decided to put it at the start because I feel that for people just starting out it can be a good way to test their fit for AI safety broadly (do they buy the arguments?) and decide whether they want to go down a more theoretical or empirical path. I just added some language to Level 1 to clarify this.

comment by Jay Bailey · 2022-09-03T15:59:37.991Z · LW(p) · GW(p)

My understanding is that Level 1 is supposed to happen in parallel with the others, but this might be clearer by separating it outside the numbered system entirely, like Level 0 or Level -1 or something.

As for why it's included as the first step, I think the reasoning is that if someone knows nothing about AI safety, their first question is going to be "Should I actually care about this problem", and to answer that question they do need to do a little bit of AI safety specific reading.

I agree this could be made clearer - one of the first bits of advice I got when I started asking around about this stuff was "Technical ML ability is harder and takes longer to learn than AI safety knowledge does, so spend most of your time on this as opposed to AI safety" and I remember this being a very unintuitive insight.

comment by Zian · 2022-12-26T03:52:08.182Z · LW(p) · GW(p)

I recognize many of the institutions you mentioned such as Nvidia and MIT. How confident are you that the more obscure ones like Hugging Face are trustworthy?

Replies from: gabe-mukobi
comment by Gabe M (gabe-mukobi) · 2022-12-26T04:03:24.039Z · LW(p) · GW(p)

I'm not sure what you particularly mean by trustworthy. If you mean a place with good attitudes and practices towards existential AI safety, then I'm not sure HF has demonstrated that.

If you mean a company I can instrumentally trust to build and host tools that make it easy to work with large transformer models, then yes, it seems like HF pretty much has a monopoly on that for the moment, and it's worth using their tools for a lot of empirical AI safety research.