Paper Title
Disentangling Successor Features for Coordination in Multi-agent Reinforcement Learning
Paper Authors
Paper Abstract
Multi-agent reinforcement learning (MARL) is a promising framework for solving complex tasks with many agents. However, a key challenge in MARL is defining private utility functions that ensure coordination when training decentralized agents. This challenge is especially prevalent in unstructured tasks with sparse rewards and many agents. We show that successor features can help address this challenge by disentangling an individual agent's impact on the global value function from that of all other agents. We use this disentanglement to compactly represent private utilities that support stable training of decentralized agents in unstructured tasks. We implement our approach using a centralized training, decentralized execution architecture and test it in a variety of multi-agent environments. Our results show improved performance and training time relative to existing methods and suggest that disentanglement of successor features offers a promising approach to coordination in MARL.
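
The abstract's central mechanism can be made concrete with the standard successor-feature identities from the single-agent literature (Barreto et al.). The sketch below is background plus one illustrative assumption: the additive per-agent feature decomposition phi = sum_i phi_i is hypothetical and not necessarily the paper's exact formulation.

% Successor features: if the reward is linear in features, r(s,a) = \phi(s,a)^\top w, then
\psi^\pi(s,a) = \mathbb{E}^\pi \Big[ \sum_{t=0}^{\infty} \gamma^t \phi(s_t, a_t) \,\Big|\, s_0 = s,\ a_0 = a \Big],
\qquad Q^\pi(s,a) = \psi^\pi(s,a)^\top w.

% Illustrative assumption: the global features decompose additively over agents,
% \phi(s, \mathbf{a}) = \sum_i \phi_i(s, a_i). Linearity of expectation then splits
% the global value into agent i's share and that of all other agents:
Q^\pi(s,\mathbf{a}) = \psi_i^\pi(s,\mathbf{a})^\top w \;+\; \sum_{j \neq i} \psi_j^\pi(s,\mathbf{a})^\top w.

Under this assumption, the first term is a compact private utility for agent i: it isolates that agent's discounted feature contributions, while the second term captures the rest of the team and can be held fixed when training agent i in a centralized-training, decentralized-execution setup.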