An Adversarial Objective for Scalable Exploration

Paper Title

An Adversarial Objective for Scalable Exploration

Authors

Bernadette Bucher, Karl Schmeckpeper, Nikolai Matni, Kostas Daniilidis

Abstract

Model-based curiosity combines active learning approaches to optimal sampling with the information-gain-based incentives for exploration presented in the curiosity literature. Existing model-based curiosity methods approximate prediction uncertainty with approaches that struggle to scale to many of the prediction-planning pipelines used in robotics tasks. We address these scalability issues with an adversarial curiosity method that minimizes a score given by a discriminator network. This discriminator is optimized jointly with a prediction model and enables our active learning approach to sample sequences of observations and actions which result in predictions considered the least realistic by the discriminator. In simulated environments, we demonstrate that the advantage of our adversarial curiosity approach over leading model-based exploration strategies grows progressively as compute is restricted. We further demonstrate the ability of our adversarial curiosity method to scale to a robotic manipulation prediction-planning pipeline, where we improve sample efficiency and prediction performance for a domain transfer problem.
