Paper Title
Survival Prediction of Brain Cancer with Incomplete Radiology, Pathology, Genomics, and Demographic Data
Authors
Abstract
Integrating cross-department multi-modal data (e.g., radiological, pathological, genomic, and clinical data) is ubiquitous in brain cancer diagnosis and survival prediction. To date, such integration is typically conducted by human physicians (and panels of experts), which can be subjective and semi-quantitative. Recent advances in multi-modal deep learning, however, have opened the door to carrying out this process in a more objective and quantitative manner. Unfortunately, prior work using four modalities for brain cancer survival prediction is limited to a "complete modalities" setting (i.e., with all modalities available). Thus, there are still open questions on how to effectively predict brain cancer survival from incomplete radiological, pathological, genomic, and demographic data (e.g., one or more modalities might not be collected for a patient). For instance, should we use both complete and incomplete data, and more importantly, how should we use those data? To answer these questions, we generalize multi-modal learning on cross-department multi-modal data to a missing data setting. Our contribution is three-fold: 1) we introduce an optimal multi-modal learning with missing data (MMD) pipeline with optimized hardware consumption and computational efficiency; 2) we extend multi-modal learning on radiological, pathological, genomic, and demographic data to missing data scenarios; 3) we collect a large-scale public dataset (with 962 patients) to systematically evaluate glioma tumor survival prediction using four modalities. The proposed method improved the C-index of survival prediction from 0.7624 to 0.8053.
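For readers unfamiliar with the reported metric, the sketch below shows one common way a C-index (concordance index) is computed for right-censored survival data. The abstract does not say which variant the authors used, so this assumes Harrell's formulation; the function name `concordance_index` and the toy inputs are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    times  -- observed follow-up time per patient
    events -- 1 if death was observed, 0 if the patient was censored
    risks  -- predicted risk score; higher means shorter expected survival
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        # Let `a` be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # earlier patient was censored, so the pair is not comparable
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # model ranked the earlier death as higher risk
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data (assumed for illustration): four patients, one censored.
    times = [5, 10, 15, 20]
    events = [1, 1, 0, 1]
    risks = [0.9, 0.4, 0.6, 0.1]
    print(concordance_index(times, events, risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a sense of scale for the reported change from 0.7624 to 0.8053.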