[Linkpost] Scaling Laws for Generative Mixed-Modal Language Models

post by Amal (asta-vista) · 2023-01-12T14:24:00.921Z · LW · GW · 2 comments

This is a link post for https://arxiv.org/pdf/2301.03728.pdf

In this paper, the authors explore the scaling properties of mixed-modal generative models and discover new scaling laws that unify the contributions of individual modalities and the interactions between them. What I find most interesting is the so-called competition barrier: when training on multiple modalities, past a certain model size and amount of data, the loss is lower than if the modalities were trained independently. This seems to predict the cross-modal transfer that was sought but not (yet) found with Gato.

2 comments

comment by Quintin Pope (quintin-pope) · 2023-01-12T14:28:29.032Z · LW(p) · GW(p)

Your link seems broken.

Replies from: asta-vista
comment by Amal (asta-vista) · 2023-01-12T14:34:06.594Z · LW(p) · GW(p)

it is fixed now, thanks!