Paper title
Cross-project Classification of Security-related Requirements
Authors
Abstract
We investigate the feasibility of training a classifier for security-related requirements on requirement specifications that are available online. Such a classifier is helpful when a large existing requirement specification does not differentiate between requirement types. Our work is motivated by the need to identify security requirements when creating security assurance cases, which are becoming a necessity for many organizations under new and upcoming standards such as GDPR and HIPAA. We base our investigation on ten requirement specifications, randomly selected from a Google search and partially pre-labeled. To validate the model, we run 10-fold cross-validation in which each specification constitutes a group. Our results indicate that it is feasible to train a model on a heterogeneous data set that includes specifications from multiple domains and in different styles. However, performance benefits from revising the pre-labeled data for consistency. Additionally, we show that classifiers trained only on a specific specification type fare worse, and that the way requirements are written has no impact on classifier accuracy.
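The grouped validation mentioned in the abstract can be illustrated with a minimal sketch. It assumes that every requirement carries the identifier of the specification it came from; with ten specifications and ten folds, each fold then holds out exactly one whole specification. The function name, the identifiers, and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def leave_one_spec_out(spec_ids):
    """Yield (held_out_spec, train_idx, test_idx), one fold per specification.

    spec_ids[i] is the (hypothetical) ID of the specification that
    requirement i belongs to. All requirements of one specification
    always land in the same fold, so no specification is split between
    training and test data.
    """
    by_spec = defaultdict(list)
    for idx, spec in enumerate(spec_ids):
        by_spec[spec].append(idx)
    for spec, test_idx in by_spec.items():
        train_idx = [i for i, s in enumerate(spec_ids) if s != spec]
        yield spec, train_idx, test_idx


# Toy example (assumed data): six requirements from three specifications.
spec_ids = ["spec_a", "spec_a", "spec_b", "spec_c", "spec_c", "spec_c"]
folds = list(leave_one_spec_out(spec_ids))
for spec, train_idx, test_idx in folds:
    print(spec, "train:", train_idx, "test:", test_idx)
```

Splitting by specification rather than by individual requirement means the classifier is always evaluated on a document it has never seen, which matches the cross-project setting the abstract describes.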