A Direct Approximation of AIXI Using Logical State Abstractions

Paper Title

A Direct Approximation of AIXI Using Logical State Abstractions

Authors

Yang-Zhao, Samuel; Wang, Tianyu; Ng, Kee Siong

Abstract

We propose a practical integration of logical state abstraction with AIXI, a Bayesian optimality notion for reinforcement learning agents, to significantly expand the model class over which AIXI agents can be approximated to include complex history-dependent and structured environments. The state representation and reasoning framework is based on higher-order logic, which can be used to define and enumerate complex features on non-Markovian and structured environments. We address the problem of selecting the right subset of features to form state abstractions by adapting the $Φ$-MDP optimisation criterion from state abstraction theory. Exact Bayesian model learning is then achieved using a suitable generalisation of Context Tree Weighting over abstract state sequences. The resultant architecture can be integrated with different planning algorithms. Experimental results on controlling epidemics on large-scale contact networks validate the agent's performance.
