Machine Learning Projects on IDA
post by Owain_Evans, William_S, stuhlmueller · 2019-06-24T18:38:18.873Z · score: 51 (18 votes) · LW · GW · 3 comments
Contents: TLDR · What is IDA? · ML Projects on IDA · Project 1: Amplifying Mathematical Reasoning · Project 2: IDA for Neural Program Interpretation · Project 3: Adaptive Computation
TLDR
We wrote a 20-page document that explains IDA and outlines potential Machine Learning projects about IDA. This post gives an overview of the document.
What is IDA?
Iterated Distillation and Amplification (IDA) is a method for training ML systems to solve challenging tasks. It was introduced by Paul Christiano. IDA is intended for tasks where:

The goal is to outperform humans at the task or to solve instances that are too hard for humans.

It is not feasible to provide demonstrations or reward signals sufficient for superhuman performance at the task.

Humans have a high-level understanding of how to approach the task and can reliably solve easy instances.
The idea behind IDA is to bootstrap using an approach similar to AlphaZero, but with a learned model of steps of human reasoning instead of the fixed game simulator.
Our document provides a self-contained technical description of IDA. For broader discussion of IDA and its relevance to value alignment, see Ought's presentation, Christiano's blog post, and the Debate paper. There is also a technical ML paper applying IDA to algorithmic problems (e.g. shortest path in a graph).
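To make the bootstrapping idea concrete, here is a minimal toy sketch of the amplify/distill loop. It is not the algorithm from our document: the task (summing integers), the function names, and the use of a lookup table as the "distilled model" are all assumptions made for illustration. A real system would distill into an ML model that generalizes, and the decomposition step would come from humans or from a learned model of human reasoning steps.

```python
from typing import Dict, Optional, Set, Tuple

Question = Tuple[int, ...]   # toy task: "what is the sum of these integers?"
Model = Dict[Question, int]  # lookup table standing in for a trained ML model


def amplify(question: Question, model: Model) -> Optional[int]:
    """Answer with one easy 'human-like' step plus calls to the current model."""
    if len(question) <= 1:
        return sum(question)  # easy instance: solved directly
    mid = len(question) // 2
    left = model.get(question[:mid])   # subquestions are delegated to the model
    right = model.get(question[mid:])
    if left is None or right is None:
        return None  # the model cannot yet answer a subquestion
    return left + right


def distill(targets: Model) -> Model:
    """Stand-in for supervised training on the amplified answers."""
    return dict(targets)


def all_subquestions(question: Question) -> Set[Question]:
    """The question plus everything reachable by repeated halving."""
    found = {question}
    if len(question) > 1:
        mid = len(question) // 2
        found |= all_subquestions(question[:mid])
        found |= all_subquestions(question[mid:])
    return found


def run_ida(questions: Set[Question], rounds: int) -> Model:
    model: Model = {}
    for _ in range(rounds):
        targets = dict(model)
        for q in questions:
            answer = amplify(q, model)  # uses only the previous round's model
            if answer is not None:
                targets[q] = answer
        model = distill(targets)
    return model


hard = (3, 1, 4, 1, 5, 9, 2, 6)
questions = all_subquestions(hard)
print(run_ida(questions, rounds=3).get(hard))  # None: only lengths up to 4 are covered
print(run_ida(questions, rounds=4).get(hard))  # 31
```

In this toy, each round doubles the size of the instances the model can answer, even though the direct step only ever handles instances of length one.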
ML Projects on IDA
Our document outlines three Machine Learning projects on IDA. Our goal in outlining these projects is to generate discussion and encourage research on IDA. We are not (as of June 2019) working on these projects, but we are interested in collaboration. The project descriptions are “high-level” and leave many choices undetermined. If you took on a project, part of the work would be refining the project and fixing a concrete objective, dataset and model.
Project 1: Amplifying Mathematical Reasoning
This project is about applying IDA to problems in mathematics. This would involve learning to solve math problems by breaking them down into easier subproblems. The problems could be represented in a formal language (as in this paper) or in natural language. We discuss a recent dataset of high-school problems in natural language, which was introduced in this paper. Here are some examples from the dataset:
Question: Let u(n) = -n^3 - n^2. Let e(c) = -2*c^3 + c. Let f(j) = -118*e(j) + 54*u(j). What is the derivative of f(a)?
Answer: 546*a^2 - 108*a - 118
Question: Three letters picked without replacement from qqqkkklkqkkk. Give probability of sequence qql.
Answer: 1/110
The paper showed impressive results on the dataset for a Transformer model trained by supervised learning (sequence-to-sequence). This suggests that a similar model could do well at learning to solve these problems by decomposition.
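As an illustration of what "solving by decomposition" could look like, the probability example above can be split into one easy subproblem (the chance of drawing the first letter) and one smaller problem of the same kind (the rest of the sequence, drawn from the remaining letters). The sketch below hard-codes that decomposition in Python; the function name is our own invention, and in IDA the decomposition would come from humans or a learned model, not from hand-written code.

```python
from collections import Counter
from fractions import Fraction


def sequence_probability(bag: str, sequence: str) -> Fraction:
    """P(drawing `sequence` in order, without replacement, from the letters in `bag`)."""
    if not sequence:
        return Fraction(1)  # nothing left to draw
    first = sequence[0]
    count = Counter(bag)[first]
    if count == 0:
        return Fraction(0)  # the required letter is not available
    # Easy subproblem: probability that a single draw gives the first letter.
    p_first = Fraction(count, len(bag))
    # Smaller problem of the same kind: the rest of the sequence, one letter removed.
    remaining = bag.replace(first, "", 1)
    return p_first * sequence_probability(remaining, sequence[1:])


print(sequence_probability("qqqkkklkqkkk", "qql"))  # 1/110
```

The three factors are 4/12, 3/11 and 1/10, whose product matches the dataset's answer.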
Project 2: IDA for Neural Program Interpretation
There’s a research program in Machine Learning on “Neural Program Interpretation” (NPI). Work on NPI focuses on learning to reproduce the behavior of computer programs. One possible approach is to train end-to-end on input-output behavior. However, in NPI, a model is trained to mimic the program’s internal behavior, including all the low-level operations and the high-level procedures which invoke them.
NPI has some similar motivations to IDA. This project applies IDA to the kinds of tasks explored in NPI and compares IDA to existing approaches. Tasks could include standard algorithms (e.g. sorting), algorithms that operate with databases, and algorithms that operate on human-readable inputs (e.g. text, images).
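The difference between the two kinds of supervision can be seen in a small sketch. End-to-end training would only see the (input, output) pair of a sort, while NPI-style training would also see a trace of the operations performed along the way. The operation names below (`BUBBLE_PASS`, `COMPARE`, `SWAP`) and the trace format are hypothetical, chosen for illustration and not taken from any particular NPI paper.

```python
from typing import List, Tuple

Step = Tuple[str, Tuple[int, ...]]  # (operation name, arguments); names are made up


def bubble_sort_with_trace(values: List[int]) -> Tuple[List[int], List[Step]]:
    """Sort a copy of `values` and record every high- and low-level step taken."""
    data = list(values)
    trace: List[Step] = []
    for end in range(len(data) - 1, 0, -1):
        trace.append(("BUBBLE_PASS", (end,)))       # high-level procedure call
        for i in range(end):
            trace.append(("COMPARE", (i, i + 1)))   # low-level operation
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                trace.append(("SWAP", (i, i + 1)))  # low-level operation
    return data, trace


output, trace = bubble_sort_with_trace([3, 1, 2])
print(output)      # end-to-end supervision sees only this: [1, 2, 3]
print(len(trace))  # trace supervision sees all 7 intermediate steps
```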
Project 3: Adaptive Computation
The idea of “adaptive computation” is to vary the amount of computation you perform for different inputs. You want to apply more computation to inputs that are hard but solvable.
Adaptive computation seems important for the kinds of problems IDA is intended to solve, including some of the problems in Projects 1 and 2. This project would investigate different approaches to adaptive computation for IDA. The basic idea is to decide whether to rely only on the distilled model (which is fast but approximate) or to additionally use amplification (which is more accurate but slower). This decision could be based on a calibrated model or based on a learned policy for choosing whether to use amplification.
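A minimal sketch of the first option (a confidence threshold on a calibrated model) might look like the following. Everything here is an assumption for illustration: the function names, the threshold value, and the toy "models", which simply sum a list of integers. The learned-policy variant would replace the fixed threshold with a trained decision rule.

```python
from typing import Callable, List, Tuple

Question = List[int]  # toy task: sum a list of integers


def answer_adaptively(
    question: Question,
    distilled: Callable[[Question], Tuple[int, float]],
    amplified: Callable[[Question], int],
    threshold: float = 0.9,
) -> Tuple[int, str]:
    """Use the fast model when it is confident enough, otherwise pay for amplification."""
    answer, confidence = distilled(question)
    if confidence >= threshold:
        return answer, "distilled"
    return amplified(question), "amplified"


def toy_distilled(question: Question) -> Tuple[int, float]:
    # Hypothetical fast model: reliable only on short inputs, and it reports that.
    if len(question) <= 2:
        return sum(question), 0.99
    return 0, 0.2  # low-confidence guess


def toy_amplified(question: Question) -> int:
    # Hypothetical slow but accurate procedure.
    return sum(question)


print(answer_adaptively([1, 2], toy_distilled, toy_amplified))        # (3, 'distilled')
print(answer_adaptively([1, 2, 3, 4], toy_distilled, toy_amplified))  # (10, 'amplified')
```

The threshold only helps if the reported confidence is well calibrated; an overconfident distilled model would skip amplification on exactly the inputs that need it.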
3 comments
I'm excited about this. If you get any substantive feedback from people who take on these projects or decide not to, I'd be very interested to see a follow-up post.
Regarding the following passage from the document:
What kind of built-in operations and environments should we use?
In existing work on NPI, the neural net is given outputs that correspond to basic operations on data. This makes it easier to learn algorithms that depend on those basic operations. For IDA, it would be ideal to learn these operations from examples. (If we were learning from human decompositions, we might not know about these “basic operations on data” ahead of time).
Do you have ideas/intuitions about how "basic operations" in the human brain can be learned? Also, how basic are the "basic operations" you're thinking about here? (Are we talking about something like the activity of an individual biological neuron? How active is a particular area in the prefrontal cortex? Symbolic-level stuff?)
Generally, do you consider imitating human cognition at the level of "basic operations" to be part of the IDA agenda? (As opposed to, say, training a model to "directly" predict the output of a human-working-for-10-minutes).