Paper Title


Recursive Reasoning Graph for Multi-Agent Reinforcement Learning

Authors

Xiaobai Ma, David Isele, Jayesh K. Gupta, Kikuo Fujimura, Mykel J. Kochenderfer

Abstract


Multi-agent reinforcement learning (MARL) provides an efficient way for simultaneously learning policies for multiple agents interacting with each other. However, in scenarios requiring complex interactions, existing algorithms can suffer from an inability to accurately anticipate the influence of self-actions on other agents. Incorporating an ability to reason about other agents' potential responses can allow an agent to formulate more effective strategies. This paper adopts a recursive reasoning model in a centralized-training-decentralized-execution framework to help learning agents better cooperate with or compete against others. The proposed algorithm, referred to as the Recursive Reasoning Graph (R2G), shows state-of-the-art performance on multiple multi-agent particle and robotics games.
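The core idea of reasoning about other agents' potential responses can be illustrated with a minimal sketch. The example below is not the paper's R2G algorithm (whose details are beyond this abstract); it only shows one-step ("level-1") recursive reasoning in a two-player matrix game: an agent evaluates each of its own actions by first predicting the other agent's best response to it, then scoring the resulting joint outcome. All names and the payoff matrices are hypothetical.

```python
# Illustrative sketch of one-step recursive reasoning in a 2-player
# matrix game. NOT the paper's R2G algorithm; it only demonstrates the
# idea of anticipating the other agent's response to one's own action.

# Hypothetical coordination game: payoff[a][b] = reward when
# agent 0 plays row a and agent 1 plays column b.
PAYOFF_0 = [[4, 0],
            [0, 2]]
PAYOFF_1 = [[4, 0],
            [0, 2]]

def best_response(payoff, my_axis, other_action):
    """Best action for one agent given the other's fixed action."""
    if my_axis == 0:   # this agent chooses the row
        scores = [payoff[a][other_action] for a in range(len(payoff))]
    else:              # this agent chooses the column
        scores = [payoff[other_action][b] for b in range(len(payoff[0]))]
    return max(range(len(scores)), key=scores.__getitem__)

def level1_action(payoff_self, payoff_other):
    """Agent 0 picks the action whose anticipated joint outcome,
    after agent 1's best response to it, maximizes agent 0's payoff."""
    best_a, best_val = None, float("-inf")
    for a in range(len(payoff_self)):
        b = best_response(payoff_other, my_axis=1, other_action=a)
        if payoff_self[a][b] > best_val:
            best_a, best_val = a, payoff_self[a][b]
    return best_a

print(level1_action(PAYOFF_0, PAYOFF_1))  # → 0 (the high-reward joint action)
```

A level-0 agent would ignore the other player entirely; nesting `best_response` calls deeper yields higher reasoning levels, which is the intuition the paper builds on within a centralized-training-decentralized-execution setup.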
