Paper title

Investigation and rectification of NIDS datasets and standardized feature set derivation for network attack detection with graph neural networks

Authors

Anton Raskovalov, Nikita Gabdullin, Vasily Dolmatov

Abstract

Network Intrusion Detection Systems (NIDS) are essential for detecting malicious traffic and cyberattacks in modern networks. Artificial intelligence-based NIDS are powerful tools that can learn complex data correlations for accurate attack prediction. Graph Neural Networks (GNNs) make it possible to analyze network topology along with flow features, which makes them particularly suitable for NIDS applications. However, successful application of such tools requires large amounts of carefully collected and labeled data for training and testing. In this paper we inspect different versions of the ToN-IoT dataset and point out inconsistencies in some versions. We filter the full version of ToN-IoT and present a new version labeled ToN-IoT-R. To ensure generalization, we propose a new standardized and compact set of flow features derived solely from NetFlow v5-compatible data. We separate numeric data and flags into different categories and propose a new dataset-agnostic normalization approach for numeric features. This preserves the meaning of flow flags, and we propose conducting targeted analysis based on, for instance, network protocols. For flow classification we use the E-GraphSAGE algorithm with a modified node initialization technique that allows us to add node degree to node features. We achieve high classification accuracy on ToN-IoT-R and compare it with previously published results for ToN-IoT, NF-ToN-IoT, and NF-ToN-IoT-v2. We highlight the importance of careful data collection and labeling and of an appropriate choice of data preprocessing, and conclude that the proposed feature set is more applicable to real NIDS because it is less demanding of traffic monitoring equipment while preserving high flow classification accuracy.
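The abstract mentions adding node degree to node features but does not say how. As a rough illustration only, the sketch below counts how many flow records touch each endpoint and stores that count next to a constant placeholder value. The function name `node_degree_features`, the `(src, dst)` record layout, and the `[1.0, degree]` feature layout are all assumptions made for this example; they are not taken from the paper.

```python
from collections import Counter


def node_degree_features(flows):
    """Build illustrative initial node features from flow records.

    flows: iterable of (src, dst) endpoint pairs, one pair per flow.
    Returns a dict mapping each endpoint to [1.0, degree], where degree
    is the number of flows that endpoint takes part in.
    The constant 1.0 is a hypothetical placeholder feature.
    """
    degree = Counter()
    for src, dst in flows:
        # Each flow is one edge, so it adds one to both endpoints.
        degree[src] += 1
        degree[dst] += 1
    return {node: [1.0, float(count)] for node, count in degree.items()}


# Hypothetical flow records: repeated pairs count as separate edges.
flows = [
    ("10.0.0.1", "10.0.0.2"),
    ("10.0.0.1", "10.0.0.3"),
    ("10.0.0.2", "10.0.0.3"),
    ("10.0.0.1", "10.0.0.2"),
]
features = node_degree_features(flows)
```

Counting every flow as its own edge treats the traffic as a multigraph. Counting distinct neighbors instead would be an equally plausible reading, and the abstract does not say which one the authors use.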