Paper Title
Hyperparameters in Contextual RL are Highly Situational
Paper Authors
Paper Abstract
Although Reinforcement Learning (RL) has shown impressive results in games and simulation, real-world applications of RL suffer from instability under changing environmental conditions and hyperparameters. We give a first impression of the extent of this instability by showing that the hyperparameters found by automatic hyperparameter optimization (HPO) methods are not only dependent on the problem at hand, but even on how well the state describes the environment dynamics. Specifically, we show that agents in contextual RL require different hyperparameters if they are shown how environmental factors change. In addition, finding adequate hyperparameter configurations is not equally easy for both settings, further highlighting the need for research into how hyperparameters influence learning and generalization in RL.
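
To make the hidden-context versus visible-context distinction concrete, the following is a minimal sketch, not the paper's actual experimental setup: a hypothetical ContextObservationWrapper for Gymnasium that applies a context feature (gravity on CartPole, chosen purely for illustration) to the underlying dynamics and either conceals the context from the agent or appends it to the observation. The wrapper name, the context dictionary, and the choice of CartPole and gravity are assumptions made for this example only.

import numpy as np
import gymnasium as gym


class ContextObservationWrapper(gym.ObservationWrapper):
    """Hypothetical wrapper illustrating contextual RL: the context
    (e.g. gravity) changes the dynamics, and `hide_context` controls
    whether the agent is shown those environmental factors."""

    def __init__(self, env, context, hide_context=True):
        super().__init__(env)
        self.context = np.array(list(context.values()), dtype=np.float32)
        self.hide_context = hide_context
        # Apply the context to the underlying dynamics where the
        # attribute exists (CartPole exposes `gravity`, `length`, ...).
        for name, value in context.items():
            if hasattr(env.unwrapped, name):
                setattr(env.unwrapped, name, value)
        if not hide_context:
            # Visible-context setting: extend the observation space so
            # the context values can be appended to each observation.
            low = np.concatenate(
                [env.observation_space.low, np.full(self.context.shape, -np.inf)]
            ).astype(np.float32)
            high = np.concatenate(
                [env.observation_space.high, np.full(self.context.shape, np.inf)]
            ).astype(np.float32)
            self.observation_space = gym.spaces.Box(low=low, high=high, dtype=np.float32)

    def observation(self, obs):
        if self.hide_context:
            return obs  # Hidden-context setting: the plain state only.
        return np.concatenate([obs.astype(np.float32), self.context])


# Same dynamics (gravity = 12.0), different amount of state information:
hidden = ContextObservationWrapper(gym.make("CartPole-v1"), {"gravity": 12.0}, hide_context=True)
visible = ContextObservationWrapper(gym.make("CartPole-v1"), {"gravity": 12.0}, hide_context=False)

obs, _ = visible.reset(seed=0)
print(obs.shape)  # 4 state dimensions + 1 context dimension

Running HPO separately on the hidden and visible variants would mirror the comparison described in the abstract: identical underlying dynamics, but a different amount of context information in the state, which is exactly the factor the paper reports as changing which hyperparameters are adequate.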