An Opinionated Evals Reading List

post by Marius Hobbhahn (marius-hobbhahn), Jérémy Scheurer (JerrySch) · 2024-10-15T14:38:58.778Z

This is a link post for https://www.apolloresearch.ai/blog/an-opinionated-evals-reading-list

Contents

  Our favorite papers
  Other evals-related publications
    LM agents
      Core:
      Other:
    Benchmarks
      Core:
      Other:
    Science of evals
      Core:
      Other:
    Software
      Core:
      Other:
    Miscellaneous
      Core:
      Other:
  Related papers from other fields
    Red teaming
      Core:
      Other:
    Scalable oversight
      Core:
      Other:
    Scaling laws & emergent behaviors
      Core:
      Other:
    Science tutorials
      Core:
      Other:
    LLM capabilities
      Core:
      Other:
    LLM steering
      RLHF
        Core:
        Other:
      Supervised Finetuning/Training & Prompting
        Core:
        Other:
    Fairness, bias, and accountability
    AI Governance
      Core:
      Other:
  Contributions

While you can make a lot of progress in evals by tinkering and paying little attention to the literature, we found that various papers have saved us many months of research effort. The Apollo Research evals team has therefore compiled a list of the evals-related papers we consider important. We have likely missed some relevant papers, and our recommendations reflect our personal opinions.

Our favorite papers

Other evals-related publications

LM agents

Core:

Other:

Benchmarks

Core:

 

Other:

Science of evals

Core:

 

Other:

Software

Core:

 

Other:

Miscellaneous

Core:

 

Other:

Related papers from other fields

Red teaming

Core:

 

Other:

Scalable oversight

Core:

 

Other:

Scaling laws & emergent behaviors

Core:

 

Other:

Science tutorials

Core:

 

Other:

LLM capabilities

Core:

 

Other:

LLM steering

RLHF

 

Core:

 

Other:

Supervised Finetuning/Training & Prompting

Core:

Other:

Fairness, bias, and accountability

AI Governance

Core: 

 

Other:

 

Contributions

The first draft of the list was based on a combination of various other reading lists that Marius Hobbhahn and Jérémy Scheurer had previously written. Marius wrote most of the final draft with detailed input from Jérémy and high-level input from Mikita Balesni, Rusheb Shah, and Alex Meinke. 
