Paper title
Markov Decision Process for Automatic Cyber Defense
Paper authors
Paper abstract
It is challenging for a security analyst to detect or defend against cyber-attacks. Moreover, traditional defense deployment methods require the analyst to enforce defenses manually while uncertain about which defense to deploy. It is therefore essential to develop an automated and resilient defense deployment mechanism to thwart the new generation of attacks. In this paper, we propose a framework based on a Markov Decision Process (MDP) and Q-learning to automatically generate optimal defense solutions for networked system states. The framework consists of four phases: model initialization, model generation, Q-learning, and conclusion. The proposed model collects real network information as input and organizes it into structured data. We implement a Q-learning process in the model to learn the quality of a defense action in a particular state. To investigate the feasibility of the proposed model, we perform simulation experiments; the results show that the model can reduce the risk that cyber-attacks pose to network systems. The experiments also show that the model retains a certain level of flexibility when different Q-learning parameters are used.
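As a rough illustration of how Q-learning can learn the quality of a defense action in a given state, the following minimal sketch runs tabular Q-learning on a toy MDP. It is not the framework from the paper: the states, actions, transitions, rewards and parameter values are all hypothetical assumptions chosen for readability. The update used is the standard rule Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).

```python
import random

# Hypothetical toy model: the states, actions, transitions and rewards below
# are illustrative assumptions, not values taken from the paper.
STATES = ["scanned", "exploited", "compromised", "secured"]
ACTIONS = ["ignore", "defend"]
TERMINAL = {2, 3}  # "compromised" and "secured" end an episode

# (state, action) -> (next_state, reward); deterministic for simplicity.
TRANSITIONS = {
    (0, 0): (1, 0.0),    # ignoring a scan lets the attacker exploit a host
    (0, 1): (3, -1.0),   # defending early has a small cost
    (1, 0): (2, -10.0),  # ignoring an exploit leads to compromise
    (1, 1): (3, -3.0),   # defending late costs more
}


def q_learning(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in STATES]
    for _ in range(episodes):
        state = 0  # every episode starts in the "scanned" state
        while state not in TERMINAL:
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))  # explore
            else:
                action = max(range(len(ACTIONS)), key=lambda a: q[state][a])  # exploit
            next_state, reward = TRANSITIONS[(state, action)]
            # Terminal states contribute no future value.
            future = 0.0 if next_state in TERMINAL else max(q[next_state])
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return {
        STATES[s]: ACTIONS[max(range(len(ACTIONS)), key=lambda a: q[s][a])]
        for s in range(len(STATES))
        if s not in TERMINAL
    }


q_table = q_learning()
policy = greedy_policy(q_table)
print(policy)
```

Changing `alpha`, `gamma` or `epsilon` in this sketch is one way to see how the learned Q-values respond to different Q-learning parameters, in the spirit of the flexibility experiment mentioned in the abstract.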