Noisy Agents: Self-supervised Exploration by Predicting Auditory Events

Paper Title

Noisy Agents: Self-supervised Exploration by Predicting Auditory Events

Authors

Chuang Gan, Xiaoyu Chen, Phillip Isola, Antonio Torralba, Joshua B. Tenenbaum

Abstract

Humans integrate multiple sensory modalities (e.g. visual and audio) to build a causal understanding of the physical world. In this work, we propose a novel type of intrinsic motivation for Reinforcement Learning (RL) that encourages the agent to understand the causal effect of its actions through auditory event prediction. First, we allow the agent to collect a small amount of acoustic data and use K-means to discover underlying auditory event clusters. We then train a neural network to predict the auditory events and use the prediction errors as intrinsic rewards to guide RL exploration. Experimental results on Atari games show that our new intrinsic motivation significantly outperforms several state-of-the-art baselines. We further visualize our noisy agents' behavior in a physics environment and demonstrate that our newly designed intrinsic reward leads to the emergence of physical interaction behaviors (e.g. contact with objects).
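The pipeline described in the abstract (cluster a small set of audio features with K-means, train a predictor of the resulting event class, and use its prediction error as an intrinsic reward) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract specifies a neural network, whereas this sketch substitutes a linear softmax classifier to stay short, and it assumes audio and state-action inputs have already been turned into fixed-length feature vectors. The names `discover_events`, `assign_event`, and `EventPredictor` are hypothetical.

```python
import numpy as np


def discover_events(audio_features, k, iters=20, seed=0):
    """Plain Lloyd's K-means over audio feature vectors (assumed precomputed).

    Returns (centroids, labels); each centroid stands for one auditory event class.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(audio_features), size=k, replace=False)
    centroids = audio_features[idx].astype(float)
    labels = np.zeros(len(audio_features), dtype=int)
    for _ in range(iters):
        # Squared distance from every sample to every centroid, shape (n, k).
        diffs = audio_features[:, None, :] - centroids[None, :, :]
        labels = (diffs ** 2).sum(axis=2).argmin(axis=1)
        for j in range(k):
            members = audio_features[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids, labels


def assign_event(centroids, audio_feature):
    """Map one audio feature vector to the index of its nearest event cluster."""
    return int(((centroids - audio_feature) ** 2).sum(axis=1).argmin())


class EventPredictor:
    """Linear softmax stand-in for the paper's neural network (an assumption)."""

    def __init__(self, in_dim, num_events, lr=0.1):
        self.W = np.zeros((in_dim, num_events))
        self.lr = lr

    def probs(self, x):
        logits = x @ self.W
        logits = logits - logits.max()  # numerical stability
        e = np.exp(logits)
        return e / e.sum()

    def intrinsic_reward(self, x, event):
        # Prediction error (cross-entropy) serves as the intrinsic reward.
        return float(-np.log(self.probs(x)[event] + 1e-12))

    def update(self, x, event):
        # One SGD step on the cross-entropy loss for a single sample.
        grad_logits = self.probs(x)
        grad_logits[event] -= 1.0
        self.W -= self.lr * np.outer(x, grad_logits)
```

In an exploration loop, the agent would call `intrinsic_reward` on each transition's features and the event class heard after acting, add that value to the reward used by the RL algorithm, and then call `update`. Transitions whose sounds the predictor already anticipates yield a small reward, so the agent is pushed toward actions whose auditory consequences it cannot yet predict.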
