Paper title
Build generally reusable agent-environment interaction models
Paper authors
Paper abstract
This paper tackles the problem of how to pre-train a model and make it a generally reusable backbone for downstream task learning. In pre-training, we propose a method that builds an agent-environment interaction model by learning domain-invariant successor features from the agent's vast experience covering various tasks, then discretizing them into behavior prototypes, which yields an embodied set structure. To make the model generally reusable for downstream task learning, we propose (1) embodied feature projection, which retains previous knowledge by projecting the new task's observation-action pair onto the embodied set structure, and (2) projected Bellman updates, which add learning plasticity for the new task setting. We provide preliminary results showing that downstream task learning based on a pre-trained embodied set structure can handle unseen changes in task objectives, environmental dynamics, and sensor modalities.
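The pipeline described in the abstract can be illustrated with a minimal tabular sketch. It is not the authors' implementation: the abstract does not specify the feature learner, the discretization method, or the exact form of the update. The sketch assumes a standard TD(0) rule for successor features, substitutes k-means with farthest-point initialization for the discretization step, treats "projection" as nearest-prototype lookup, and treats the "projected Bellman update" as a Q-learning-style backup over prototype indices. All function names (`sf_td_update`, `kmeans_prototypes`, `project`, `projected_q_update`) and the toy data are hypothetical.

```python
import numpy as np


def sf_td_update(psi, sa, phi, next_sa, alpha=0.1, gamma=0.9):
    """TD(0) update of a tabular successor-feature table psi (n_pairs x d).

    psi[sa] is moved toward phi + gamma * psi[next_sa], where phi is the
    feature vector observed for the current observation-action pair.
    """
    target = phi + gamma * psi[next_sa]
    psi[sa] += alpha * (target - psi[sa])
    return psi


def kmeans_prototypes(features, k, iters=20):
    """Discretize feature vectors into k prototypes (assumption: k-means)."""
    # Deterministic farthest-point initialization.
    centers = [features[0]]
    for _ in range(1, k):
        d = np.min([((features - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(features[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        # Squared distance from every feature to every center: shape (n, k).
        d = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return centers


def project(feature, prototypes):
    """Map a new task's feature to the index of its nearest prototype."""
    return int(((prototypes - feature) ** 2).sum(axis=1).argmin())


def projected_q_update(q, idx, reward, next_idxs, alpha=0.1, gamma=0.9):
    """One Bellman backup on values indexed by prototype (assumed form).

    next_idxs holds the prototype indices reachable at the next step,
    one per candidate action; an empty list means a terminal step.
    """
    bootstrap = max(q[i] for i in next_idxs) if len(next_idxs) > 0 else 0.0
    q[idx] += alpha * (reward + gamma * bootstrap - q[idx])
    return q


# Toy usage: two well-separated clusters stand in for learned successor features.
feats = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                  [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
protos = kmeans_prototypes(feats, k=2)
q = np.zeros(len(protos))
i = project(np.array([0.05, 0.05]), protos)  # feature from the new task
j = project(np.array([4.9, 5.0]), protos)    # feature at the next step
q = projected_q_update(q, i, reward=1.0, next_idxs=[j])
```

In this sketch the prototypes stay fixed after pre-training, which corresponds to retaining previous knowledge, while only the prototype-indexed values `q` change on the new task, which corresponds to the added plasticity.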