折扣游戏和能源游戏的多面价值迭代

论文标题

折扣游戏和能源游戏的多面价值迭代

Polyhedral value iteration for discounted games and energy games

论文作者

Kozachinskiy, Alexander

论文摘要

我们提出了一种确定性算法，以$ n^{o（1）} \ cdot（2 + \ sqrt {2}）^n $ - time以$ n $ nodes求解折扣游戏。对于双分折扣游戏，我们的算法以$ n^{o（1）} \ cdot 2^n $ - 时间运行。在我们的工作之前，无论折现因子的折扣系数如何，都没有时间运行的确定性算法$ 2^{o（n \ log n）} $。我们称我们的方法是多面体价值迭代。我们依靠一个众所周知的事实，即可以从所谓的最优方程中找到折扣游戏的值。在算法中，我们考虑通过放松最优方程获得的多面体。我们通过尽可能地沿着精心选择的移动移动来迭代该多面体的边界。这一直持续到当前点满足最优方程。我们的方法受到了Dorfman等人的最新算法的极大启发。（ICALP 2019）用于能源游戏。为了完整性，我们介绍了他们的算法，以多面体价值迭代。与原始算法不同，我们的博览会不需要边缘权重作为整数，而是为任意的实际权重工作。

We present a deterministic algorithm, solving discounted games with $n$ nodes in $n^{O(1)}\cdot (2 + \sqrt{2})^n$-time. For bipartite discounted games our algorithm runs in $n^{O(1)}\cdot 2^n$-time. Prior to our work no deterministic algorithm running in time $2^{o(n\log n)}$ regardless of the discount factor was known. We call our approach polyhedral value iteration. We rely on a well-known fact that the values of a discounted game can be found from the so-called optimality equations. In the algorithm we consider a polyhedron obtained by relaxing optimality equations. We iterate points on the border of this polyhedron by moving each time along a carefully chosen shift as far as possible. This continues until the current point satisfies optimality equations. Our approach is heavily inspired by a recent algorithm of Dorfman et al. (ICALP 2019) for energy games. For completeness, we present their algorithm in terms of polyhedral value iteration. Our exposition, unlike the original algorithm, does not require edge weights to be integers and works for arbitrary real weights.

下载PDF全文

下载文献需遵守相关版权规定

论文标题