Paper title
Learning hierarchical behavior and motion planning for autonomous driving
Paper authors
Paper abstract
Learning-based driving solutions, a new branch of autonomous driving, are expected to simplify the modeling of driving by learning the underlying mechanisms from data. To improve tactical decision-making in learning-based driving solutions, we introduce hierarchical behavior and motion planning (HBMP), which explicitly models behavior within a learning-based solution. Because the action spaces of behavior and motion are coupled, it is challenging to solve the HBMP problem with reinforcement learning (RL) for long-horizon driving tasks. We transform the HBMP problem by integrating a classical sampling-based motion planner, whose optimal cost is used as the reward for high-level behavior learning. As a result, this formulation reduces the action space and diversifies the rewards without losing the optimality of HBMP. In addition, we propose a sharable representation of the input sensory data across simulation platforms and the real-world environment, so that models trained in a fast event-based simulator, SUMO, can be used to initialize and accelerate RL training in a dynamics-based simulator, CARLA. Experimental results demonstrate the effectiveness of the method. Moreover, the model is successfully transferred to the real world, validating its generalization capability.
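The central idea of the abstract, using the optimal cost of a sampling-based motion planner as the reward for high-level behavior learning, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the behavior set, the lateral offsets, the cost terms, the obstacle penalty, and the choice to negate the cost so that a cheaper plan yields a higher reward.

```python
import random

# Hypothetical behavior set: each behavior maps to a target lateral offset (m).
# The abstract does not list the actual behaviors; these are placeholders.
BEHAVIORS = {"keep_lane": 0.0, "change_left": 3.5, "change_right": -3.5}


def trajectory_cost(lateral_end, speed, target_offset, obstacle_offset,
                    desired_speed=10.0):
    """Illustrative cost: tracking error + speed error + obstacle penalty."""
    cost = (lateral_end - target_offset) ** 2
    cost += 0.1 * (speed - desired_speed) ** 2
    # Assumed penalty: ending within 1.5 m (laterally) of the obstacle.
    if obstacle_offset is not None and abs(lateral_end - obstacle_offset) < 1.5:
        cost += 100.0
    return cost


def plan_motion(behavior, obstacle_offset=None, n_samples=50, rng=None):
    """Sample candidate end states for one behavior; return (best, best_cost)."""
    rng = rng or random.Random(0)
    target = BEHAVIORS[behavior]
    best, best_cost = None, float("inf")
    for _ in range(n_samples):
        # Candidate = (lateral end offset near the target, end speed).
        cand = (target + rng.uniform(-1.0, 1.0), rng.uniform(0.0, 15.0))
        cost = trajectory_cost(cand[0], cand[1], target, obstacle_offset)
        if cost < best_cost:
            best, best_cost = cand, cost
    return best, best_cost


def behavior_reward(behavior, obstacle_offset=None, rng=None):
    """Reward for the high-level learner: negated optimal planner cost
    (the sign convention is an assumption of this sketch)."""
    _, best_cost = plan_motion(behavior, obstacle_offset, rng=rng)
    return -best_cost


if __name__ == "__main__":
    # An obstacle blocks the current lane (lateral offset 0.0).
    for name in BEHAVIORS:
        print(name, behavior_reward(name, obstacle_offset=0.0))
```

In this sketch the high-level learner only chooses among three discrete behaviors, while the continuous motion choice is delegated to the planner. This mirrors the reduction of the action space described in the abstract, and the planner cost gives each behavior a graded reward rather than a sparse success or failure signal.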