Comment by Melaenis Crito on New Scaling Laws for Large Language Models · 2024-01-17T15:02:26.699Z

> So until wafer-scale chips decrease the cost of compute ten times, and Google also decides all it really needs for AGI is to put ten times as much money into LMs, we've seen the largest LMs we're likely to see. However long that may be.


The numbers in the DeepMind figure indicate an exponential increase in FLOPs. With compute growing along Moore's law, and compute usage in AI growing even faster, why would larger models be unlikely? Based on these trends, it seems very reasonable to me that the trend toward larger models will continue.
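To make the growth-rate comparison concrete, here is a minimal sketch of the arithmetic. The doubling times are illustrative assumptions, not figures from the comment or the post: roughly 24 months for a Moore's-law-style trend, and a much shorter doubling time (here ~6 months) for AI training compute usage.

```python
# Illustrative sketch: multiplicative growth over a horizon given a fixed
# doubling time. The specific doubling times below are assumptions for
# illustration, not claims from the source.

def growth_factor(years: float, doubling_time_years: float) -> float:
    """Multiplicative growth after `years` given a fixed doubling time."""
    return 2 ** (years / doubling_time_years)

# Moore's-law-style hardware trend: ~2-year doubling.
moore_10y = growth_factor(10, 2.0)      # 2**5 = 32x over a decade

# Faster assumed doubling for AI training compute: ~6 months.
ai_usage_10y = growth_factor(10, 0.5)   # 2**20 ≈ 1,048,576x over a decade

print(f"Moore's law (2y doubling), 10 years: {moore_10y:.0f}x")
print(f"AI compute usage (0.5y doubling), 10 years: {ai_usage_10y:,.0f}x")
```

Under these assumptions, the gap compounds quickly: a decade of Moore's law alone gives ~32x, while the faster usage trend gives about six orders of magnitude, which is why a continued scaling of model size does not look implausible from the trend lines.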