Paper Title
Automated Adversary Emulation for Cyber-Physical Systems via Reinforcement Learning
Paper Authors
Paper Abstract
Adversary emulation is an offensive exercise that provides a comprehensive assessment of a system's resilience against cyber attacks. However, adversary emulation is typically a manual process, making it costly and hard to deploy in cyber-physical systems (CPS) with complex dynamics, vulnerabilities, and operational uncertainties. In this paper, we develop an automated, domain-aware approach to adversary emulation for CPS. We formulate a Markov Decision Process (MDP) model to determine an optimal attack sequence over a hybrid attack graph with cyber (discrete) and physical (continuous) components and related physical dynamics. We apply model-based and model-free reinforcement learning (RL) methods to solve the discrete-continuous MDP in a tractable fashion. As a baseline, we also develop a greedy attack algorithm and compare it with the RL procedures. We summarize our findings through a numerical study on sensor deception attacks in buildings to compare the performance and solution quality of the proposed algorithms.
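To make the abstract's formulation concrete, the sketch below shows one plausible way an attack-sequencing problem over a hybrid attack graph could be cast as a discrete-continuous MDP and attacked with a greedy one-step baseline. This is not the authors' implementation: the graph, success probabilities, physical dynamics, and reward in this example are hypothetical placeholders chosen only to illustrate the structure (discrete cyber state = set of compromised nodes, continuous physical state = a scalar plant variable perturbed by spoofed sensors).

```python
# Illustrative sketch only: all dynamics, probabilities, and rewards are assumptions,
# not the model from the paper.
import copy
import numpy as np


class HybridAttackMDP:
    """Toy discrete-continuous MDP: the discrete (cyber) state is the set of
    compromised attack-graph nodes; the continuous (physical) state is a scalar
    process (e.g. a zone temperature) driven by simple assumed linear dynamics."""

    def __init__(self, attack_graph, horizon=20, seed=0):
        self.graph = attack_graph          # {node: set of nodes reachable from it}
        self.horizon = horizon
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        self.compromised = {"entry"}       # cyber (discrete) component
        self.x = np.array([22.0])          # physical (continuous) component
        self.t = 0
        return self.compromised, self.x

    def available_actions(self):
        # An exploit is available if its target is adjacent to a compromised node.
        return [v for u in self.compromised for v in self.graph.get(u, set())
                if v not in self.compromised]

    def step(self, action):
        # Exploit succeeds with an assumed probability (operational uncertainty).
        if self.rng.random() < 0.8:
            self.compromised.add(action)
        # Assumed dynamics: spoofed sensors push the plant away from its nominal value,
        # with impact growing with the attacker's foothold.
        u = 0.5 * len(self.compromised)
        self.x = 0.95 * self.x + np.array([u]) + 0.1 * self.rng.standard_normal(1)
        self.t += 1
        reward = float(abs(self.x[0] - 22.0))   # deviation from nominal = attacker's gain
        done = self.t >= self.horizon or not self.available_actions()
        return (self.compromised, self.x), reward, done


def one_step_value(env, action, n=5):
    # Crude score: average immediate reward over n independently re-seeded rollouts.
    vals = []
    for _ in range(n):
        sim = copy.deepcopy(env)
        sim.rng = np.random.default_rng()
        _, r, _ = sim.step(action)
        vals.append(r)
    return float(np.mean(vals))


def greedy_attack(env):
    """Baseline in the spirit of the greedy algorithm mentioned in the abstract:
    at each step pick the action with the best estimated one-step reward,
    with no long-horizon planning."""
    env.reset()
    total, done = 0.0, False
    while not done:
        actions = env.available_actions()
        if not actions:
            break
        best = max(actions, key=lambda a: one_step_value(env, a))
        _, r, done = env.step(best)
        total += r
    return total


if __name__ == "__main__":
    # Hypothetical four-node attack graph: entry point -> HMI -> PLC -> sensor.
    graph = {"entry": {"hmi"}, "hmi": {"plc"}, "plc": {"sensor"}}
    env = HybridAttackMDP(graph)
    print("greedy baseline return:", greedy_attack(env))
```

In this sketch the greedy policy only maximizes immediate physical deviation, which is exactly the kind of myopic behavior the paper's RL procedures (model-based and model-free) are compared against.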