Comment by Howuhh on Proposal: Scaling laws for RL generalization · 2023-12-11T11:05:08.124Z
For exactly these kinds of needs and ideas, we recently released XLand-MiniGrid, an extremely efficient reimplementation of XLand based on MiniGrid. We rewrote it entirely in JAX, so a simple PPO can reach millions of FPS during training. On 8 A100 GPUs, users can expect to get 1 trillion steps in under 40 hours. Have fun experimenting!
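As a rough sanity check on those numbers, here is a back-of-the-envelope sketch of the throughput they imply. It uses only the figures quoted above; the even split across GPUs is an assumption made for illustration, and the function name is hypothetical. Since the run is quoted as taking under 40 hours, the results are lower bounds.

```python
# Back-of-the-envelope throughput implied by the quoted figures.
TOTAL_STEPS = 1_000_000_000_000  # 1 trillion environment steps
HOURS = 40                       # quoted upper bound on wall-clock time
NUM_GPUS = 8                     # A100 GPUs


def implied_throughput(total_steps, hours, num_gpus):
    """Return (overall steps/s, steps/s per GPU).

    Assumes the work is split evenly across GPUs (an illustrative
    assumption, not something stated in the comment).
    """
    seconds = hours * 3600
    overall = total_steps / seconds
    return overall, overall / num_gpus


total_sps, per_gpu_sps = implied_throughput(TOTAL_STEPS, HOURS, NUM_GPUS)
print(f"{total_sps:,.0f} steps/s overall, {per_gpu_sps:,.0f} steps/s per GPU")
```

So the claim works out to roughly 6.9 million steps per second in total, or a bit under 0.9 million per GPU, which is consistent with "millions of FPS".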
Twitter announcement: https://twitter.com/vladkurenkov/status/1731709425524543550
Source code: https://github.com/corl-team/xland-minigrid