Paper Title

Fictitious Play in Markov Games with Single Controller

Authors

Muhammed O. Sayin, Kaiqing Zhang, Asuman Ozdaglar

Abstract

Certain but important classes of strategic-form games, including zero-sum and identical-interest games, have the fictitious-play-property (FPP), i.e., beliefs formed in fictitious play dynamics always converge to a Nash equilibrium (NE) in the repeated play of these games. Such convergence results are seen as a (behavioral) justification for the game-theoretical equilibrium analysis. Markov games (MGs), also known as stochastic games, generalize the repeated play of strategic-form games to dynamic multi-state settings with Markovian state transitions. In particular, MGs are standard models for multi-agent reinforcement learning -- a reviving research area in learning and games, and their game-theoretical equilibrium analyses have also been conducted extensively. However, whether certain classes of MGs have the FPP or not (i.e., whether there is a behavioral justification for equilibrium analysis or not) remains largely elusive. In this paper, we study a new variant of fictitious play dynamics for MGs and show its convergence to an NE in n-player identical-interest MGs in which a single player controls the state transitions. Such games are of interest in communications, control, and economics applications. Our result together with the recent results in [Sayin et al. 2020] establishes the FPP of two-player zero-sum MGs and n-player identical-interest MGs with a single controller (standing at two different ends of the MG spectrum from fully competitive to fully cooperative).
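As a rough illustration of the belief dynamics the abstract refers to, the sketch below runs classical fictitious play in a single-state, two-player identical-interest matrix game. It is not the new variant for Markov games studied in the paper; the payoff matrix, the pseudo-count initialization, and the names `fictitious_play` and `best_response` are all assumptions made for this example.

```python
def best_response(values):
    """Index of the largest value (ties go to the lowest index)."""
    return max(range(len(values)), key=lambda a: values[a])


def fictitious_play(payoff, steps=200):
    """Classical fictitious play in a two-player identical-interest game.

    payoff[i][j] is the common payoff when player 1 plays row i and
    player 2 plays column j. Returns the final empirical beliefs.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    # One pseudo-count per action, i.e. a uniform initial belief (assumption).
    counts_1 = [1.0] * n_rows
    counts_2 = [1.0] * n_cols
    for _ in range(steps):
        total_1, total_2 = sum(counts_1), sum(counts_2)
        belief_1 = [c / total_1 for c in counts_1]  # empirical play of player 1
        belief_2 = [c / total_2 for c in counts_2]  # empirical play of player 2
        # Each player best-responds to the belief about the other player.
        row_values = [sum(payoff[i][j] * belief_2[j] for j in range(n_cols))
                      for i in range(n_rows)]
        col_values = [sum(payoff[i][j] * belief_1[i] for i in range(n_rows))
                      for j in range(n_cols)]
        counts_1[best_response(row_values)] += 1
        counts_2[best_response(col_values)] += 1
    total_1, total_2 = sum(counts_1), sum(counts_2)
    return ([c / total_1 for c in counts_1],
            [c / total_2 for c in counts_2])


# Hypothetical coordination game: both players prefer the (0, 0) outcome.
payoff = [[2.0, 0.0],
          [0.0, 1.0]]
belief_1, belief_2 = fictitious_play(payoff)
```

In this toy game both beliefs concentrate on action 0, which is a pure Nash equilibrium of the matrix game. The Markov-game setting in the paper additionally involves state transitions controlled by a single player, which this sketch does not model.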
