Getting started with AI Alignment research: how to reproduce an experiment from a research paper

post by Alexander230 · 2024-05-30T14:51:49.337Z

Contents

  Setup
  Running the experiment with LLMs
  Running the experiment with vision models

This post gives technical instructions for reproducing an experiment from the Weak-to-strong generalization paper: https://openai.com/index/weak-to-strong-generalization/. It is aimed mostly at beginners in AI Alignment who want to start tinkering with models and are looking for examples of how to run experiments.

Weak-to-strong generalization is research showing that a strong model can be trained on data generated by a weaker model, generalize beyond that data, and surpass the weaker model on the task it was trained for. The paper comes with example code on GitHub, with experiments on both LLMs and vision models. However, running the experiments from this code is not a straightforward task, so here are detailed instructions for how to do it.
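
To make the setup concrete before diving into the repo, here is a minimal toy sketch of the weak-to-strong data flow using scikit-learn stand-ins instead of the paper's models; the models and dataset here are illustrative assumptions, not part of the paper's code. In the paper's setting the strong student often beats its weak teacher; this toy version just shows the structure of the experiment.

# Toy weak-to-strong sketch: a small "weak" model labels data for a larger
# "strong" model, and we check how the student compares to its teacher.
# All model/dataset choices here are illustrative, not from the paper.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=6000, n_features=20, random_state=0)
# Split: ground truth for the weak model, an unlabeled pool, and a test set.
X_gt, X_rest, y_gt, y_rest = train_test_split(X, y, train_size=1000, random_state=0)
X_pool, X_test, y_pool, y_test = train_test_split(X_rest, y_rest, test_size=1000, random_state=0)

weak = LogisticRegression().fit(X_gt, y_gt)                     # weak supervisor
weak_labels = weak.predict(X_pool)                              # weak labels, not ground truth
strong = GradientBoostingClassifier().fit(X_pool, weak_labels)  # strong student

print("weak accuracy:           ", weak.score(X_test, y_test))
print("weak-to-strong accuracy: ", strong.score(X_test, y_test))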

Setup

# Get the paper's example code.
git clone https://github.com/openai/weak-to-strong
cd weak-to-strong
# tmux lets long training runs survive a dropped SSH session.
apt-get install tmux
tmux
# Install the repo itself, then the extra packages the scripts import.
pip install .
pip install matplotlib seaborn tiktoken fire einops scipy
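
The experiments expect a CUDA-capable GPU. Before launching anything long-running, a quick sanity check (my addition, not from the post) that the installed torch sees it:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"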

Running the experiment with LLMs

Now everything is ready to run an experiment with LLMs. The code was apparently written for older versions of its dependencies, so it will fail with an error if run as-is on newer versions, but this is easy to fix.
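
The exact fix depends on the error you see. As one hedged example (the version numbers are my assumption, not from the post): errors coming from transformers or datasets can often be resolved by pinning them to releases from around when the repo was published:

# Assumption: versions roughly contemporary with the repo (early 2024);
# adjust based on the traceback you actually get.
pip install "transformers==4.36.2" "datasets==2.16.1"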

python sweep.py --model_sizes=gpt2,gpt2-medium

This trains each model on ground-truth labels, then runs each weak-to-strong pair, printing test accuracies like:

gpt2: 0.65
gpt2-medium: 0.699
weak gpt2 to strong gpt2: 0.652
weak gpt2 to strong gpt2-medium: 0.655
weak gpt2-medium to strong gpt2-medium: 0.689
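
The paper's headline metric is performance gap recovered (PGR): what fraction of the gap between the weak and strong ceilings the student closes. Computing it from the numbers above:

# Performance gap recovered (PGR) for the gpt2 -> gpt2-medium pair,
# using the accuracies printed above.
weak, strong, weak_to_strong = 0.65, 0.699, 0.655
pgr = (weak_to_strong - weak) / (strong - weak)
print(f"PGR: {pgr:.2f}")  # ~0.10: the student recovers about 10% of the gap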
You can then repeat the sweep with a wider range of model sizes:

python sweep.py --model_sizes=gpt2,gpt2-medium,gpt2-large,Qwen/Qwen-1_8B

To plot the results, point the repo's plotting code at the sweep output by setting these variables:

RESULTS_PATH = "/tmp/results/default"
MODELS_TO_PLOT = ["gpt2", "gpt2-medium", "gpt2-large", "Qwen/Qwen-1_8B"]
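
If you would rather inspect the numbers without the plotting code, here is a rough sketch that walks the results directory and prints any accuracy fields it finds. The file layout (JSON summaries with an "accuracy" field) is an assumption about how sweep.py saves its results, so adjust it to whatever you actually see under RESULTS_PATH:

# Rough results browser. Assumes sweep.py leaves JSON files with an
# "accuracy" field somewhere under RESULTS_PATH -- check your own tree.
import json
import pathlib

RESULTS_PATH = "/tmp/results/default"

for path in sorted(pathlib.Path(RESULTS_PATH).rglob("*.json")):
    try:
        summary = json.loads(path.read_text())
    except (json.JSONDecodeError, UnicodeDecodeError):
        continue  # skip non-summary files
    if isinstance(summary, dict) and "accuracy" in summary:
        print(path.relative_to(RESULTS_PATH), summary["accuracy"])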

Running the experiment with vision models

You can also reproduce the experiment with vision models. For this, you will need to download parts of the ImageNet dataset manually: the validation images and the devkit.

# Remember the repo location, then fetch the ImageNet validation data to $HOME.
WORKDIR=`pwd`
cd ~
wget https://image-net.org/data/ILSVRC/2012/ILSVRC2012_devkit_t12.tar.gz
wget https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_val.tar --no-check-certificate
# Run the vision experiment with a DINO ResNet-50 as the strong model.
cd $WORKDIR/vision
python run_weak_strong.py --strong_model_name resnet50_dino --n_epochs 20
After training, the script reports accuracies like:

Weak label accuracy: 0.566
Weak_Strong accuracy: 0.618
Strong accuracy: 0.644
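
For intuition about what run_weak_strong.py is doing: the vision experiment trains linear probes on frozen features from a pretrained backbone, with weak labels coming from a smaller model (AlexNet in the paper). Below is a rough sketch of the feature-extraction side, using the public DINO hub entry point; the hub name and the random stand-in batch are my assumptions for illustration, not the repo's code.

# Sketch: extract frozen DINO ResNet-50 features, the backbone the strong
# model's linear probe sits on. Random input stands in for ImageNet batches.
import torch

# Public checkpoint from facebookresearch/dino (downloads on first use).
backbone = torch.hub.load("facebookresearch/dino:main", "dino_resnet50")
backbone.eval()

images = torch.randn(4, 3, 224, 224)  # stand-in for a preprocessed batch
with torch.no_grad():
    features = backbone(images)       # frozen features, one vector per image
print(features.shape)                 # e.g. torch.Size([4, 2048])

# A linear probe is then just one nn.Linear trained on (features, weak_labels).
probe = torch.nn.Linear(features.shape[1], 1000)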

Once you have all the scripts working and producing measurements and charts, you can use them later as examples for building your own experiments. Happy tinkering!
