[Linkpost] Scaling Laws for Generative Mixed-Modal Language Models
post by Amal (asta-vista) · 2023-01-12T14:24:00.921Z · LW · GW · 2 comments
This is a link post for https://arxiv.org/pdf/2301.03728.pdf
In this paper, the authors explore the scaling properties of mixed-modal generative models and discover new scaling laws that unify the contributions of individual modalities and the interactions between them. What I find most interesting is the so-called competition barrier: when training on multiple modalities, beyond a certain model size and amount of data, the loss is lower than if the modalities had been trained independently. This seems to predict the cross-modal transfer that was sought after but not (yet) found with GATO.
2 comments
comment by Quintin Pope (quintin-pope) · 2023-01-12T14:28:29.032Z · LW(p) · GW(p)
Your link seems broken.
comment by Amal (asta-vista) · 2023-01-12T14:34:06.594Z · LW(p) · GW(p)
it is fixed now, thanks!