Comments

Comment by Howuhh on Proposal: Scaling laws for RL generalization · 2023-12-11T11:05:08.124Z · LW · GW

For exactly these kinds of needs and ideas, we recently released XLand-MiniGrid, an extremely efficient reimplementation of XLand based on MiniGrid. We rewrote it entirely in JAX, so even a simple PPO can reach millions of FPS during training. On 8 A100 GPUs, users can expect to run 1 trillion steps in under 40 hours. Have fun experimenting!
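As a rough sanity check on those figures, here is a minimal back-of-the-envelope sketch. It assumes exactly 1 trillion steps, exactly 40 hours, and an even split across the 8 GPUs; since the run is quoted as taking under 40 hours, the result is a lower bound on the implied throughput, not a measured benchmark.

```python
# Back-of-the-envelope throughput implied by the quoted figures.
# Assumptions (not measurements): exactly 1e12 steps, exactly 40 hours,
# and work split evenly across 8 GPUs.
TOTAL_STEPS = 1e12
HOURS = 40
NUM_GPUS = 8

seconds = HOURS * 3600                 # wall-clock budget in seconds
total_sps = TOTAL_STEPS / seconds      # steps per second across all GPUs
per_gpu_sps = total_sps / NUM_GPUS     # steps per second per GPU

print(f"{total_sps:,.0f} steps/s overall")
print(f"{per_gpu_sps:,.0f} steps/s per GPU")
```

Under these assumptions the implied rate is roughly 6.9 million steps per second overall, or a bit under 0.9 million per GPU, which is consistent with the "millions of FPS" figure.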

Twitter announcement: https://twitter.com/vladkurenkov/status/1731709425524543550
Source code: https://github.com/corl-team/xland-minigrid