Paper Title
Newton Optimization on Helmholtz Decomposition for Continuous Games
Paper Authors
Paper Abstract
Many learning problems involve multiple agents optimizing different, interacting objective functions. In these problems, standard policy gradient algorithms fail because the setting is non-stationary and the agents have different interests. Algorithms must therefore account for the complex dynamics of these systems to guarantee rapid convergence to a (local) Nash equilibrium. In this paper, we propose NOHD (Newton Optimization on Helmholtz Decomposition), a Newton-like algorithm for multi-agent learning problems based on the decomposition of the system dynamics into its irrotational (potential) and solenoidal (Hamiltonian) components. This method ensures quadratic convergence in purely irrotational systems and in purely solenoidal systems. Furthermore, we show that NOHD is attracted to stable fixed points in general multi-agent systems and repelled by strict saddle points. Finally, we empirically compare NOHD's performance with that of state-of-the-art algorithms on some bimatrix games and in a continuous Gridworld environment.
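The decomposition into irrotational and solenoidal components mentioned in the abstract can be illustrated on a toy game. The sketch below is not the NOHD algorithm, and the abstract does not spell out the paper's exact construction. It assumes the split commonly used for differentiable games, in which the Jacobian of the simultaneous gradient is separated into a symmetric (potential) part and an antisymmetric (Hamiltonian) part. The two-player quadratic game, its coefficients `a, b, c, d`, and all function names are hypothetical choices made for illustration.

```python
import numpy as np

# Hypothetical two-player game with scalar strategies x and y:
#   player 1 minimizes f1(x, y) = 0.5 * a * x**2 + b * x * y
#   player 2 minimizes f2(x, y) = 0.5 * c * y**2 + d * x * y
# Simultaneous gradient: xi(x, y) = (df1/dx, df2/dy) = (a*x + b*y, d*x + c*y)


def game_jacobian(a, b, c, d):
    """Jacobian of the simultaneous gradient (constant for this quadratic game)."""
    return np.array([[a, b], [d, c]], dtype=float)


def helmholtz_split(J):
    """Split J into a symmetric (potential) and an antisymmetric (Hamiltonian) part."""
    S = 0.5 * (J + J.T)
    A = 0.5 * (J - J.T)
    return S, A


def classify(J, tol=1e-12):
    """Label the game by which component of the Jacobian vanishes."""
    S, A = helmholtz_split(J)
    if np.allclose(A, 0.0, atol=tol):
        return "potential"      # purely irrotational dynamics
    if np.allclose(S, 0.0, atol=tol):
        return "hamiltonian"    # purely solenoidal dynamics
    return "general"            # both components are present


if __name__ == "__main__":
    print(classify(game_jacobian(1.0, 2.0, 3.0, 2.0)))   # symmetric cross terms
    print(classify(game_jacobian(0.0, 1.0, 0.0, -1.0)))  # zero-sum bilinear game
    print(classify(game_jacobian(1.0, 1.0, 1.0, -1.0)))  # mixed case
```

Under these assumptions, a game with `b == d` has a symmetric Jacobian and is labelled potential. The bilinear zero-sum game with `a = c = 0` and `b = -d` has an antisymmetric Jacobian and is labelled Hamiltonian. These correspond to the two pure cases for which the abstract states quadratic convergence.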