Paper Title


Reinforcement-learning-based control of convectively-unstable flows

Paper Authors

Da Xu, Mengqi Zhang

Paper Abstract


This work reports the application of a model-free deep-reinforcement-learning-based (DRL) flow control strategy to suppress perturbations evolving in the 1-D linearised Kuramoto-Sivashinsky (KS) equation and 2-D boundary layer flows. The former is commonly used to model the disturbances developing in flat-plate boundary layer flows. These flow systems are convectively unstable, amplifying upstream disturbances as they travel downstream, and are thus difficult to control. The control action is implemented through a volumetric force at a fixed position, and the control performance is evaluated by the reduction of perturbation amplitude downstream. We first demonstrate the effectiveness of the DRL-based control in the KS system subjected to random upstream noise. The amplitude of the perturbation monitored downstream is significantly reduced, and the learnt policy is shown to be robust to both measurement and external noise. One of our focuses is to optimally place sensors for the DRL control using the gradient-free particle swarm optimisation algorithm. After the optimisation process for different numbers of sensors, a specific eight-sensor placement is found to yield the best control performance. The sensor placement optimised in the KS equation is applied directly to control 2-D Blasius boundary layer flows and can efficiently reduce the downstream perturbation energy. Flow analyses show that the control mechanism found by DRL is opposition control. In addition, it is found that when flow instability information is embedded in the reward function of DRL to penalise the instability, the control performance can be further improved in this convectively unstable flow.
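The sensor-placement search described in the abstract relies on a gradient-free particle swarm optimiser, which only needs black-box evaluations of a control cost. The sketch below illustrates that component under stated assumptions: the function name `pso_minimise`, the hyper-parameters, and the stand-in objective are illustrative, not the authors' implementation; in the paper's setting the objective would be the downstream perturbation amplitude achieved by a DRL policy retrained for each candidate sensor placement in the KS domain.

```python
import numpy as np

def pso_minimise(objective, dim, n_particles=20, n_iters=100,
                 bounds=(0.0, 1.0), w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise a black-box `objective` over [lo, hi]^dim with a basic
    particle swarm (no gradients required).

    Here `dim` plays the role of the number of sensors and each particle
    is a candidate vector of normalised sensor coordinates.
    """
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    # Initialise particle positions uniformly and velocities at zero.
    pos = rng.uniform(lo, hi, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    # Per-particle best and global (swarm) best so far.
    pbest = pos.copy()
    pbest_val = np.array([objective(p) for p in pos])
    g = pbest[np.argmin(pbest_val)].copy()
    g_val = pbest_val.min()
    for _ in range(n_iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Standard velocity update: inertia + cognitive + social terms.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
        pos = np.clip(pos + vel, lo, hi)  # keep sensors inside the domain
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        if vals.min() < g_val:
            g_val = vals.min()
            g = pos[np.argmin(vals)].copy()
    return g, g_val
```

As a quick sanity check, running it on a hypothetical quadratic cost with a known optimum, e.g. `pso_minimise(lambda x: float(np.sum((x - 0.3) ** 2)), dim=8)`, recovers an eight-component placement close to 0.3 in each coordinate. In the actual control problem each objective evaluation is expensive (a full DRL training run), which is why the abstract repeats the optimisation separately for each sensor count.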
