Paper title
Sparsity-based Feature Selection for Anomalous Subgroup Discovery
Paper authors
Paper abstract
Anomalous pattern detection aims to identify instances that deviate from normal behavior, and it is widely applicable across domains. Multiple anomaly detection techniques have been proposed in the literature. However, a principled and scalable feature selection method for efficient discovery is still commonly lacking. Existing feature selection techniques typically optimize the performance of outcome prediction rather than capture systemic deviations of the outcome from what is expected. In this paper, we propose a sparsity-based automated feature selection (SAFS) framework, which encodes systemic outcome deviations via the sparsity of feature-driven odds ratios. SAFS is a model-agnostic approach that can be used with different discovery techniques. When validated on a publicly available critical care dataset, SAFS achieves more than a $3\times$ reduction in computation time while maintaining detection performance. SAFS also outperforms multiple feature selection baselines.
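To make the phrase "sparsity of feature-driven odds ratios" concrete, the following is a minimal sketch of one possible reading, not the SAFS implementation. The abstract does not say how odds ratios are computed or which sparsity measure is used, so everything here is an assumption: the one-versus-rest odds ratio per feature value, the 0.5 smoothing constant, the Gini index as the sparsity measure, and the function names `odds_ratios`, `gini_sparsity`, and `rank_features`.

```python
from collections import defaultdict


def odds_ratios(values, outcomes, smoothing=0.5):
    """Odds ratio of a positive outcome for each value of one categorical
    feature, comparing rows with that value against all other rows.
    The smoothing constant (an assumption) avoids division by zero."""
    pos = defaultdict(int)
    tot = defaultdict(int)
    for v, y in zip(values, outcomes):
        tot[v] += 1
        pos[v] += int(y)
    total_n = len(outcomes)
    total_pos = sum(int(y) for y in outcomes)
    ratios = {}
    for v in tot:
        a = pos[v] + smoothing                    # positives with value v
        b = tot[v] - pos[v] + smoothing           # negatives with value v
        c = total_pos - pos[v] + smoothing        # positives without value v
        d = (total_n - tot[v]) - (total_pos - pos[v]) + smoothing  # negatives without v
        ratios[v] = (a * d) / (b * c)
    return ratios


def gini_sparsity(xs):
    """Gini index of non-negative numbers: 0 when all values are equal,
    larger when a few values dominate (assumed sparsity measure)."""
    xs = sorted(xs)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * weighted / (n * total) - (n + 1) / n


def rank_features(rows, outcomes):
    """Rank feature names by the sparsity of their per-value odds ratios,
    most sparse first. `rows` is a list of dicts with identical keys."""
    scores = {}
    for name in rows[0]:
        column = [row[name] for row in rows]
        scores[name] = gini_sparsity(list(odds_ratios(column, outcomes).values()))
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (invented for illustration): the outcome depends on "ward" only.
rows = [
    {"ward": w, "shift": s}
    for w in ("A", "B", "C")
    for s in ("day", "night", "day", "night")
]
outcomes = [1 if row["ward"] == "C" else 0 for row in rows]
ranking = rank_features(rows, outcomes)
```

In this toy data the odds ratios for `shift` are identical across its values, so its Gini index is zero, while `ward` has one value with a much larger odds ratio than the others and therefore ranks first. A subgroup discovery method could then be run on only the top-ranked features, which is the kind of search-space reduction the abstract describes.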