Comment by Melaenis Crito on New Scaling Laws for Large Language Models · 2024-01-17T15:02:26.699Z
> So until wafer-scale chips decrease the cost of compute ten times, and Google also decides all it really needs for AGI is to put ten times as much money into LMs, we've seen the largest LMs we're likely to see. However long that may be.
The numbers in the DeepMind figure indicate an exponential increase in FLOPs. With compute getting cheaper along Moore's law, and compute usage in AI growing even faster than that, why would larger models be unlikely? Based on these trends, it seems very reasonable to me that the trend toward larger models will continue.
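To put rough numbers on this (my own sketch, not taken from the figure): if you assume the Chinchilla-style compute-optimal rule, where optimal parameter count and training tokens each scale roughly as C^0.5, together with the common approximations C ≈ 6·N·D and D ≈ 20·N, a few lines of Python show what a 10x compute budget actually buys. The specific budgets and constants below are assumptions for illustration.

```python
def compute_optimal(c_flops: float, tokens_per_param: float = 20.0) -> tuple[float, float]:
    """Rough (params, tokens) for a compute-optimal training run.

    Assumes the approximations C ~ 6 * N * D and D ~ 20 * N from the
    Chinchilla analysis; the exact constants are rough, not exact.
    """
    n = (c_flops / (6.0 * tokens_per_param)) ** 0.5  # N_opt scales as C^0.5
    d = tokens_per_param * n                         # D_opt also scales as C^0.5
    return n, d

# Hypothetical budgets: each step is a 10x increase in training FLOPs.
for budget in (1e23, 1e24, 1e25):
    n, d = compute_optimal(budget)
    print(f"C = {budget:.0e} FLOPs -> ~{n:.2e} params, ~{d:.2e} tokens")
```

Under these assumptions, a 10x compute budget buys only a ~3.2x (√10) larger optimal model, which is exactly why compute growth has to stay exponential for model sizes to keep climbing, and why, given the trends above, I expect they will.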