How to Learn from Risk: Explicit Risk-Utility Reinforcement Learning for Efficient and Safe Driving Strategies

Paper Title

How to Learn from Risk: Explicit Risk-Utility Reinforcement Learning for Efficient and Safe Driving Strategies

Authors

Schmidt, Lukas M.; Rietsch, Sebastian; Plinge, Axel; Eskofier, Bjoern M.; Mutschler, Christopher

Abstract

Autonomous driving has the potential to revolutionize mobility and is hence an active area of research. In practice, the behavior of autonomous vehicles must be acceptable, i.e., efficient, safe, and interpretable. While vanilla reinforcement learning (RL) finds performant behavioral strategies, these strategies are often unsafe and uninterpretable. Safe RL approaches introduce safety, but they mostly remain uninterpretable because the learned behavior is jointly optimized for safety and performance without modeling the two separately. Interpretable machine learning is rarely applied to RL. This paper proposes SafeDQN, which makes the behavior of autonomous vehicles safe and interpretable while remaining efficient. SafeDQN offers an understandable, semantic trade-off between the expected risk and the utility of actions while being algorithmically transparent. We show that SafeDQN finds interpretable and safe driving policies for a variety of scenarios and demonstrate how state-of-the-art saliency techniques can help to assess both risk and utility.

Paper Title