What should an Einstein-like figure in Machine Learning do?

post by Razied · 2020-08-05T23:52:14.539Z · LW · GW · No comments

This is a question post.


Suppose you are an extremely talented but still unknown machine learning researcher (like Einstein in his patent-clerk days), and you discover something very important: say, a few tricks that let you train neural nets 4 orders of magnitude faster than anyone else. You can train something at GPT-3's level on your personal rig in a few weeks, and the tricks also generalize to larger networks. What should you do with this information to benefit humanity?
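For scale, a minimal back-of-the-envelope sketch of what a 4-orders-of-magnitude speedup buys. The numbers are assumptions: GPT-3's reported ~3.14e23 training FLOPs, and a hypothetical personal rig sustaining ~1e14 FLOP/s (~100 TFLOP/s) of effective throughput.

```python
# Back-of-the-envelope: what a 10^4 training speedup implies.
# Assumptions: GPT-3 reportedly took ~3.14e23 FLOPs to train;
# the rig's ~1e14 FLOP/s effective throughput is a round guess.
GPT3_FLOPS = 3.14e23
RIG_FLOPS_PER_SEC = 1e14
SPEEDUP = 1e4  # the hypothetical 4-orders-of-magnitude trick

naive_days = GPT3_FLOPS / RIG_FLOPS_PER_SEC / 86400
with_trick_days = naive_days / SPEEDUP

print(f"without trick: {naive_days:,.0f} days (~{naive_days / 365:.0f} years)")
print(f"with trick:    {with_trick_days:.1f} days")
```

Under these assumptions the training run goes from roughly a century to a few days, which is what makes the hypothetical alarming.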

Publishing seems out of the question, as it would probably put AGI within reach of a large project. But you personally cannot yet produce AGI: you can match GPT-3-like performance, but nothing more. You could do nothing at all, keeping silent and hoping that your discoveries are not replicated by the main players. Or you could try to covertly help OpenAI, which would then have a massive advantage and enough lead time to treat safety properly. What would you do in this scenario?

Answers

answer by lsusr · 2020-08-06T01:02:03.237Z · LW(p) · GW(p)

This is not a hypothetical to me. I currently have, running on my computer, a few tricks that allow me to train small data systems much faster than anyone else. The first thing I did was try to figure out whether they could bring AGI within reach of a large project. Can they? I don't know. AGI faces bigger obstacles than a mere 4 orders of magnitude of compute power.

The second thing I did was chat with machine learning engineers from the big tech companies. They are not using these tricks and are not interested in adopting them. This will not surprise the Lisp hackers reading this post.

There might be a hedge fund out there that can use my tech. But a cursory investigation suggests that the majority of hedge funds are not doing anything sophisticated enough to require it.

So now I'm using my ML system to train an IMU-based gesture detection algorithm to track how much people eat in order to help prevent obesity-related diseases.

comment by Gurkenglas · 2020-08-07T22:30:43.658Z · LW(p) · GW(p)

Can you locally replicate GPT? For example, can GPT-you compress WebText better than GPT-2?
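(The compression framing here is the standard one: a language model's average cross-entropy loss is a code length, so "compressing WebText better" means achieving lower bits per byte. A minimal sketch of the conversion; the loss and bytes-per-token values below are made up for illustration, not GPT-2's actual numbers:)

```python
import math

# A language model's cross-entropy loss (in nats/token) converts directly
# to a compression rate: an arithmetic coder driven by the model needs
# loss / ln(2) bits per token.
def bits_per_byte(loss_nats_per_token: float, bytes_per_token: float) -> float:
    bits_per_token = loss_nats_per_token / math.log(2)
    return bits_per_token / bytes_per_token

# e.g. an illustrative model with 3.0 nats/token on text averaging
# 4 bytes per token:
print(bits_per_byte(3.0, 4.0))  # ~1.08 bits/byte, vs 8 for raw ASCII
```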

Replies from: lsusr
comment by lsusr · 2020-08-08T00:20:37.357Z · LW(p) · GW(p)

I am not yet prepared to release the details of what the system can and cannot do.

comment by mishka · 2023-07-13T14:38:09.013Z · LW(p) · GW(p)

> The second thing I did was chat with machine learning engineers from the big tech companies. They are not using these tricks and are not interested in adopting them. This will not surprise the Lisp hackers reading this post.

Right...

A good post, relevant to a number of people...

(But their inventions are all different, and it is not all that easy for them to find each other and join forces together, especially in a safety-conscious manner.)
