Paper Title

Predicting feature imputability in the absence of ground truth

Authors

Niamh McCombe, Xuemei Ding, Girijesh Prasad, David P. Finn, Stephen Todd, Paula L. McClean, KongFatt Wong-Lin

Abstract

Data imputation is the most popular method of dealing with missing values, but in most real-life applications, large amounts of data can be missing, and it is difficult or impossible to evaluate whether data have been imputed accurately (lack of ground truth). This paper addresses these issues by proposing an effective and simple principal-component-based method for determining whether individual data features can be accurately imputed, a property termed feature imputability. In particular, we establish a strong linear relationship between principal component loadings and feature imputability, even in the presence of extreme missingness and lack of ground truth. This work will have important implications for practical data imputation strategies.
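The abstract does not spell out the procedure, so the following is only a minimal sketch of the general idea: compute per-feature principal component loading magnitudes directly from incomplete data and treat them as a rough indicator of which features are likely to be imputable. The function name `loading_magnitudes`, the choice to fill missing entries with the column mean before the decomposition, and the synthetic data are all assumptions made for illustration, not the authors' published method.

```python
import numpy as np

def loading_magnitudes(X, n_components=1):
    """Per-feature loading magnitude on the leading principal components.

    X is an (n_samples, n_features) array that may contain NaN.
    Hypothetical helper; not taken from the paper.
    """
    # Standardize each column using only its observed entries.
    mu = np.nanmean(X, axis=0)
    sd = np.nanstd(X, axis=0)
    Z = (X - mu) / sd
    # Assumption: missing entries are replaced by the column mean,
    # which is 0 after standardization.
    Z = np.where(np.isnan(Z), 0.0, Z)
    # Rows of Vt are the principal axes of the filled, standardized data.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    # Loadings: principal axes scaled by the component standard deviations.
    scale = s[:n_components] / np.sqrt(Z.shape[0] - 1)
    loadings = Vt[:n_components].T * scale
    # Combine the retained components into one magnitude per feature.
    return np.sqrt((loadings ** 2).sum(axis=1))

# Synthetic illustration: three features share a latent factor,
# three are independent noise, and 30% of all entries are missing.
rng = np.random.default_rng(0)
n = 500
factor = rng.normal(size=(n, 1))
shared = factor + 0.3 * rng.normal(size=(n, 3))
noise = rng.normal(size=(n, 3))
X = np.hstack([shared, noise])
X[rng.random(X.shape) < 0.3] = np.nan

mags = loading_magnitudes(X)
print(np.round(mags, 2))
```

In this toy setting the three correlated features receive clearly larger loading magnitudes than the three noise features, which matches the intuition that features sharing structure with the rest of the data are the ones an imputation method can recover. Whether the relationship is linear, as the abstract reports, would have to be checked against measured imputation error on real data.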
