Paper Title

Simultaneous imputation and disease classification in incomplete medical datasets using Multigraph Geometric Matrix Completion (MGMC)

Authors

Gerome Vivar, Anees Kazi, Hendrik Burwinkel, Andreas Zwergal, Nassir Navab, Seyed-Ahmad Ahmadi

Abstract

Large-scale population-based studies in medicine are a key resource for better diagnosis, monitoring, and treatment of diseases. They also serve as enablers of clinical decision support systems, in particular Computer-Aided Diagnosis (CADx) using machine learning (ML). Numerous ML approaches for CADx have been proposed in the literature. However, these approaches assume full data availability, which is not always feasible in clinical data. To account for missing data, incomplete data samples are either removed or imputed, which could lead to data bias and may negatively affect classification performance. As a solution, we propose end-to-end learning of imputation and disease prediction on incomplete medical datasets via Multigraph Geometric Matrix Completion (MGMC). MGMC uses multiple recurrent graph convolutional networks, where each graph represents an independent population model based on a key clinical meta-feature such as age, sex, or cognitive function. Graph signal aggregation from local patient neighborhoods, combined with multigraph signal fusion via self-attention, has a regularizing effect on both matrix reconstruction and classification performance. Our proposed approach is able to impute class-relevant features as well as perform accurate classification on two publicly available medical datasets. We empirically show the superiority of our proposed approach in terms of classification and imputation performance when compared with state-of-the-art approaches. MGMC enables disease prediction in multimodal and incomplete medical datasets. These findings could serve as a baseline for future CADx approaches that utilize incomplete datasets.
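To make the pipeline in the abstract more concrete, the following is a minimal sketch of its three ideas: one patient graph per meta-feature, aggregation of feature values from each patient's graph neighborhood, and fusion of the per-graph signals into a single imputed value. It is not the authors' implementation. It assumes plain mean aggregation in place of recurrent graph convolutions, fixed placeholder scores in place of learned self-attention, and it omits the classification branch entirely. All function names, thresholds, and data values are hypothetical.

```python
import math

def build_graph(meta, threshold):
    """Adjacency matrix (with self-loops) linking patients whose
    meta-feature values differ by at most `threshold` (assumed rule)."""
    n = len(meta)
    return [[1.0 if abs(meta[i] - meta[j]) <= threshold else 0.0
             for j in range(n)] for i in range(n)]

def aggregate(adj, X, mask):
    """Per patient and feature, average the observed values of graph
    neighbors. Stand-in for a graph convolution; missing entries are skipped."""
    n, d = len(X), len(X[0])
    out = [[0.0] * d for _ in range(n)]
    for i in range(n):
        for k in range(d):
            vals = [X[j][k] for j in range(n) if adj[i][j] and mask[j][k]]
            out[i][k] = sum(vals) / len(vals) if vals else 0.0
    return out

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(signals, scores):
    """Weighted sum of per-graph signals. In MGMC the weights come from
    learned self-attention; here `scores` are fixed placeholders."""
    w = softmax(scores)
    n, d = len(signals[0]), len(signals[0][0])
    return [[sum(w[g] * signals[g][i][k] for g in range(len(signals)))
             for k in range(d)] for i in range(n)]

def impute(X, mask, fused):
    """Keep observed entries and fill missing ones with the fused signal."""
    return [[X[i][k] if mask[i][k] else fused[i][k]
             for k in range(len(X[0]))] for i in range(len(X))]

# Toy data: 4 patients, 2 features; patient 1 is missing feature 1.
X = [[1.0, 2.0], [3.0, 0.0], [5.0, 6.0], [7.0, 8.0]]
mask = [[1, 1], [1, 0], [1, 1], [1, 1]]
age = [60, 62, 80, 81]   # hypothetical meta-feature
sex = [0, 1, 1, 0]       # hypothetical meta-feature

graphs = [build_graph(age, 5), build_graph(sex, 0)]
signals = [aggregate(adj, X, mask) for adj in graphs]
completed = impute(X, mask, fuse(signals, scores=[0.0, 0.0]))
print(completed[1][1])  # 4.0
```

With equal scores, the missing entry is the average of what the age graph suggests (2.0, from patient 0) and what the sex graph suggests (6.0, from patient 2). A learned attention mechanism would instead shift weight toward whichever graph is more informative for the task.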
