Paper Title
Imitation Is Not Enough: Robustifying Imitation with Reinforcement Learning for Challenging Driving Scenarios
Paper Authors
Paper Abstract
Imitation learning (IL) is a simple and powerful way to use high-quality human driving data, which can be collected at scale, to produce human-like behavior. However, policies based on imitation learning alone often fail to sufficiently account for safety and reliability concerns. In this paper, we show how imitation learning combined with reinforcement learning using simple rewards can substantially improve the safety and reliability of driving policies over those learned from imitation alone. In particular, we train a policy on over 100k miles of urban driving data, and measure its effectiveness in test scenarios grouped by different levels of collision likelihood. Our analysis shows that while imitation can perform well in low-difficulty scenarios that are well-covered by the demonstration data, our proposed approach significantly improves robustness on the most challenging scenarios (over 38% reduction in failures). To our knowledge, this is the first application of a combined imitation and reinforcement learning approach in autonomous driving that utilizes large amounts of real-world human driving data.
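The core idea described in the abstract, combining an imitation signal with a reinforcement signal driven by simple rewards, can be sketched as a single training objective. The Python sketch below assumes a BC-regularized actor-critic update; the class names (Policy, Critic), the network architectures, and the trade-off weight bc_weight are illustrative assumptions for exposition, not the paper's actual implementation.

# Minimal sketch: an RL objective anchored to human demonstrations
# by a behavior-cloning (BC) term. All names and hyperparameters here
# are illustrative assumptions, not the paper's exact formulation.
import torch
import torch.nn as nn

class Policy(nn.Module):
    """Maps an observation vector to a continuous driving action."""
    def __init__(self, obs_dim: int, act_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 256), nn.ReLU(),
            nn.Linear(256, act_dim), nn.Tanh(),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

class Critic(nn.Module):
    """Scores (observation, action) pairs. In the paper's setting it
    would be trained with simple rewards (e.g., penalizing collisions);
    that training loop is omitted here."""
    def __init__(self, obs_dim: int, act_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, act], dim=-1))

def combined_loss(policy, critic, obs, expert_action, bc_weight=0.1):
    """RL term (maximize the critic's value of the policy's action)
    plus a BC term keeping the policy close to human demonstrations."""
    action = policy(obs)
    rl_loss = -critic(obs, action).mean()             # reinforcement signal
    bc_loss = ((action - expert_action) ** 2).mean()  # imitation signal
    return rl_loss + bc_weight * bc_loss

Under this reading, the BC term keeps the policy human-like in the low-difficulty scenarios that demonstrations cover well, while the RL term, driven by simple safety rewards, supplies a learning signal in the rare, challenging scenarios where imitation alone fails.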