Lifelong Inverse Reinforcement Learning

Paper Title

Lifelong Inverse Reinforcement Learning

Authors

Jorge A. Mendez, Shashank Shivkumar, Eric Eaton

Abstract

Methods for learning from demonstration (LfD) have shown success in acquiring behavior policies by imitating a user. However, even for a single task, LfD may require numerous demonstrations. For versatile agents that must learn many tasks via demonstration, this process would substantially burden the user if each task were learned in isolation. To address this challenge, we introduce the novel problem of lifelong learning from demonstration, which allows the agent to continually build upon knowledge learned from previously demonstrated tasks to accelerate the learning of new tasks, reducing the number of demonstrations required. As one solution to this problem, we propose the first lifelong learning approach to inverse reinforcement learning, which learns consecutive tasks via demonstration, continually transferring knowledge between tasks to improve performance.
