post by [deleted]
comment by Bjartur Tómas · 2021-09-25T16:06:46.518Z
Question: Why can't we use MuZero on Core War to train models to code? MuZero uses self-play to master games, and Core War is a programming game amenable to self-play. Why has no one tried this? Or, if they have, why doesn't it work?
Why my research failed: Googling this question, I get results that describe how MuZero works, DeepMind's blog post on the topic, and various Hacker News threads that do not address my question.
↑ comment by Frederik · 2021-09-25T19:41:08.826Z
Background: AI master's student, some practice in RL
I don't think there is a fundamental reason we can't; rather, no one has done it. I don't know a definitive answer as to why, but here are some possibilities:
- too obscure ('no one has thought of it', or 'no one thought it was a good idea, it's only assembly-like code after all')
- high barrier to entry (you need to write an RL environment that you can query fast, and you need a lot of compute)
  - this makes it harder for individuals or small teams to do this, and larger players like DeepMind and OpenAI might have different priorities
- now that we have Codex (and soon™ its newer, allegedly much better version), there might not be any (economic or scientific) reason to do this
- What value would you get out of a Core War expert model? I glanced at the Wikipedia page. How would you convert its outputs into a useful program that does more than gobble up all your memory?
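To make the "RL environment that you can query fast" point a bit more concrete, here is a minimal sketch of what the simulator side might look like. It is a toy, not real Core War: the class `ToyCoreEnv`, its `run` method, the three-opcode instruction set, and the single-process, relative-addressing-only rules are all assumptions made up for illustration, and it does not follow the actual Redcode standard. It also only covers the match simulator and the win/loss reward; how a MuZero-style agent would emit a program token by token is left open.

```python
from typing import List, Tuple

# Hypothetical toy instruction format: (opcode, a_field, b_field).
Instr = Tuple[str, int, int]
DAT: Instr = ("DAT", 0, 0)  # executing a DAT kills the warrior


class ToyCoreEnv:
    """Toy two-warrior battle in a circular memory array.

    Loosely inspired by Core War, heavily simplified (assumptions):
    one process per warrior, relative direct addressing only,
    and just three opcodes: DAT, MOV, JMP.
    """

    def __init__(self, core_size: int = 64, max_cycles: int = 1000):
        self.core_size = core_size
        self.max_cycles = max_cycles

    def run(self, warrior_a: List[Instr], warrior_b: List[Instr]) -> int:
        """Play one match. Returns +1 if A wins, -1 if B wins, 0 on a draw."""
        n = self.core_size
        core: List[Instr] = [DAT] * n
        starts = [0, n // 2]
        # Load both programs into the shared core.
        for start, prog in zip(starts, (warrior_a, warrior_b)):
            for i, ins in enumerate(prog):
                core[(start + i) % n] = ins
        pcs = list(starts)
        for _ in range(self.max_cycles):
            for player in (0, 1):  # warriors alternate, A moves first
                pc = pcs[player]
                op, a, b = core[pc]
                if op == "DAT":
                    # The warrior that executes DAT loses.
                    return -1 if player == 0 else 1
                if op == "MOV":
                    # Copy the instruction at pc+a to pc+b, then advance.
                    core[(pc + b) % n] = core[(pc + a) % n]
                    pcs[player] = (pc + 1) % n
                elif op == "JMP":
                    pcs[player] = (pc + a) % n
                else:
                    raise ValueError(f"unknown opcode: {op}")
        return 0  # nobody died within the cycle budget


env = ToyCoreEnv()
imp = [("MOV", 0, 1)]          # copies itself one cell ahead, forever
print(env.run(imp, [DAT]))     # 1: the opponent executes DAT immediately
print(env.run(imp, imp))       # 0: two imps chase each other until timeout
```

Even in this toy, the reward is a single win/loss/draw signal at the end of a match, which hints at why self-play here would be sparse-reward and compute-hungry.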