[Linkpost] Growth in FLOPS used to train ML models
post by Derek M. Jones (Derek-Jones) · 2022-03-14T11:28:33.418Z · LW · GW · 3 comments
This is a linkpost for https://shape-of-code.com/2022/03/13/growth-in-flops-used-to-train-ml-models/
Given the ongoing history of continually increasing compute power, what is the maximum compute power that might be available to train ML models in the coming years?
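As an illustration of what answering this question involves, here is a minimal Python sketch of a constant-doubling-time extrapolation of training compute; the baseline FLOPS and doubling time below are illustrative assumptions, not figures from the linked post.

```python
# Minimal sketch: project training compute forward assuming it keeps doubling
# at a fixed rate. Both constants are illustrative assumptions, not data from
# the linked post.
baseline_flops = 1e24        # assumed: rough order of magnitude of a large 2022 training run
doubling_time_years = 0.5    # assumed: doubling time; substitute your own estimate

def projected_flops(years_ahead: float) -> float:
    """Project training compute forward under constant exponential growth."""
    return baseline_flops * 2 ** (years_ahead / doubling_time_years)

for years in (1, 3, 5, 10):
    print(f"{years:>2} years out: ~{projected_flops(years):.1e} FLOPS")
```

Whether such an extrapolation holds depends on the constraints the post and comments discuss, e.g. how long the historical compute growth rate can continue.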
3 comments
comment by gwern · 2022-03-14T15:57:27.954Z · LW(p) · GW(p)
Speaking of compute and experience curves, Karpathy just posted about replicating Le Cun's 1989 pre-MNIST digit classifying results and what difference compute & methods make: https://karpathy.github.io/2022/03/14/lecun1989/
Replies from: Derek-Jones
↑ comment by Derek M. Jones (Derek-Jones) · 2022-03-14T17:24:55.351Z · LW(p) · GW(p)
Thanks, an interesting read until the author peers into the future. Moore's law is on its last legs, so the historical speed-ups will soon be just that: something that once happened. There are some performance improvements still to come from special-purpose CPUs, and half-precision floating-point will reduce memory traffic (which can then be traded for CPU performance).
comment by wunan · 2022-03-14T15:20:01.928Z · LW(p) · GW(p)
Thanks for writing! I don't see an actual answer to the question asked in the beginning -- "Given the ongoing history of continually increasing compute power, what is the maximum compute power that might be available to train ML models in the coming years?" Did I miss it?