Deep Reinforcement Learning at the Edge of the Statistical Precipice
Reinforcement learning, Model evaluation
In this paper, we argue that reliable evaluation in the few-run deep RL regime cannot ignore the uncertainty in results without running the risk of slowing down progress in the field.