What kinds of algorithms do multi-human imitators learn?

post by Chris van Merwijk (chrisvm), Joar Skalse (Logical_Lunatic) · 2022-05-22T14:27:31.430Z


Epistemic status: Speculation. The actual proposals are idealized and not meant to be exactly right. We have thought about this for less than an hour.


In an earlier post, I stated a speculative hypothesis about the algorithm that a single imitator would learn when trained to imitate collections of multiple humans. Here, Joar Skalse joined me and we made a list of further hypotheses, all very speculative and each probably wrong individually.

The point is that if we have an imitator that imitates a single human’s text, we might (very dubiously) expect that imitator to learn basically a copy of that human. What, then, would an imitator learn if it is trained to imitate content generated by vast collections of humans? We can then ask what this implies for how it generalizes and what you can get from finetuning.

Here is our set of idealized and almost certainly not exactly correct hypotheses:


 
