Posts

Comments

Comment by Edward Guo (edward-guo) on EfficientZero: How It Works · 2021-11-28T09:45:15.349Z · LW · GW

I would be careful about using reinforcement learning to gauge how close we are to the theoretical maximum of information extracted from training data, since agents generally do not start out with 0 bits of information about the environment. Even the shape of the input data and the action space is useful information.

Even in designing the agent itself, it seems to me that general knowledge of human-related systems could be introduced into the architecture.

Selecting the architecture that gives us the highest upper bound on information utilization in a system is also, in some sense, inserting extra data.