Probabilistic design of optimal sequential decision-making algorithms in learning and control

Paper title

Probabilistic design of optimal sequential decision-making algorithms in learning and control

Authors

Emiland Garrabe, Giovanni Russo

Abstract

This survey focuses on certain sequential decision-making problems that involve optimizing over probability functions. We discuss the relevance of these problems for learning and control. The survey is organized around a framework that combines a problem formulation and a set of resolution methods. The formulation consists of an infinite-dimensional optimization problem. The methods come from approaches to searching for optimal solutions in the space of probability functions. Through the lens of this overarching framework, we revisit popular learning and control algorithms, showing that these naturally arise from suitable variations on the formulation mixed with different resolution methods. A running example, for which we make the code available, complements the survey. Finally, a number of challenges arising from the survey are also outlined.
