Online Actuator Selection and Controller Design for Linear Quadratic Regulation with Unknown System Model

Paper title

Online Actuator Selection and Controller Design for Linear Quadratic Regulation with Unknown System Model

Authors

Lintao Ye, Ming Chi, Zhi-Wei Liu, Vijay Gupta

Abstract

We study the simultaneous actuator selection and controller design problem for linear quadratic regulation with Gaussian noise over a finite horizon of length $T$ and unknown system model. We consider both episodic and non-episodic settings of the problem and propose online algorithms that specify both the sets of actuators to be utilized under a cardinality constraint and the controls corresponding to the sets of selected actuators. In the episodic setting, the interaction with the system breaks into $N$ episodes, each of which restarts from a given initial condition and has length $T$. In the non-episodic setting, the interaction goes on continuously. Our online algorithms leverage a multiarmed bandit algorithm to select the sets of actuators and a certainty equivalence approach to design the corresponding controls. We show that our online algorithms yield $\sqrt{N}$-regret for the episodic setting and $T^{2/3}$-regret for the non-episodic setting. We extend our algorithm design and analysis to show scalability with respect to both the total number of candidate actuators and the cardinality constraint. We numerically validate our theoretical results.
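The episodic scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm. The abstract does not name the specific bandit method, so Exp3 is used here as a stand-in. The model estimates `A_hat` and `B_hat` are assumed to be given (the estimation step is omitted), and the certainty equivalence step simply treats them as the true model when computing finite-horizon LQR gains. The function names, the identity input cost `R`, the noise level, and the cost normalization constant `cost_cap` are all hypothetical choices made for illustration.

```python
import itertools
import numpy as np

def lqr_gains(A, B, Q, R, T):
    """Backward Riccati recursion; returns K_0..K_{T-1} for u_t = -K_t x_t."""
    P = Q.copy()  # terminal cost matrix (assumed equal to Q)
    gains = []
    for _ in range(T):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1]  # computed backward in time, so reverse

def run_episode(A, B, gains, Q, R, x0, rng, noise_std=0.1):
    """Simulate one episode on the true system and return its quadratic cost."""
    x, cost = x0.copy(), 0.0
    for K in gains:
        u = -K @ x
        cost += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u + noise_std * rng.standard_normal(x.shape)
    return cost + x @ Q @ x

def select_actuators_exp3(A, B, A_hat, B_hat, Q, k, T, N, x0,
                          cost_cap=100.0, seed=0):
    """Exp3 over all size-k actuator subsets (stand-in bandit, illustrative)."""
    rng = np.random.default_rng(seed)
    arms = list(itertools.combinations(range(B.shape[1]), k))
    n_arms = len(arms)
    gamma = min(1.0, np.sqrt(n_arms * np.log(n_arms) / ((np.e - 1) * N)))
    weights = np.ones(n_arms)
    counts = np.zeros(n_arms, dtype=int)
    R = np.eye(k)  # assumed input cost
    for _ in range(N):
        probs = (1 - gamma) * weights / weights.sum() + gamma / n_arms
        i = rng.choice(n_arms, p=probs)
        cols = list(arms[i])
        # Certainty equivalence: design gains as if the estimates were exact.
        gains = lqr_gains(A_hat, B_hat[:, cols], Q, R, T)
        cost = run_episode(A, B[:, cols], gains, Q, R, x0, rng)
        reward = 1.0 - min(cost / cost_cap, 1.0)  # map cost into [0, 1]
        weights[i] *= np.exp(gamma * reward / (probs[i] * n_arms))
        weights /= weights.max()  # rescale for numerical stability
        counts[i] += 1
    return arms, counts
```

Enumerating every subset, as done here, grows combinatorially with the number of candidate actuators. The abstract states that the authors extend their design to scale with both that number and the cardinality constraint; this sketch does not attempt that extension.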
