Paper Title
Provably Efficient Third-Person Imitation from Offline Observation
Paper Authors
Paper Abstract
Domain adaptation in imitation learning represents an essential step towards improving generalizability. However, even in the restricted setting of third-person imitation where transfer is between isomorphic Markov Decision Processes, there are no strong guarantees on the performance of transferred policies. We present problem-dependent, statistical learning guarantees for third-person imitation from observation in an offline setting, and a lower bound on performance in the online setting.