Paper title
Self-supervised Representation Learning on Electronic Health Records with Graph Kernel Infomax
Paper authors
Paper abstract
Learning representations of electronic health records (EHRs) is an important yet under-explored research topic. It benefits various clinical decision support applications, e.g., medication outcome prediction or patient similarity search. Current approaches focus on task-specific label supervision on vectorized sequential EHRs, which is not applicable to large-scale unsupervised scenarios. Recently, contrastive learning has shown great success on self-supervised representation learning problems. However, complex temporality often degrades performance. We propose Graph Kernel Infomax, a self-supervised graph kernel learning approach on the graphical representation of EHRs, to overcome these problems. Unlike state-of-the-art methods, we do not change the graph structure to construct augmented views. Instead, we use Kernel Subspace Augmentation to embed nodes into two geometrically different manifold views. The entire framework is trained by contrasting node and graph representations on those two manifold views through commonly used contrastive objectives. Empirically, using publicly available benchmark EHR datasets, our approach yields performance on clinical downstream tasks that exceeds the state of the art. Theoretically, the variation in distance metrics naturally creates different views as data augmentation without changing graph structures.
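The idea of building two views from different distance metrics, rather than from perturbed graphs, can be illustrated with a small sketch. It is not the paper's Kernel Subspace Augmentation, whose details the abstract does not give. As a stand-in, it assumes random Fourier features for an RBF kernel at two bandwidths, a fixed neighbor-mean aggregation, and a node-level InfoNCE loss. The toy graph, the function names (`kernel_view`, `aggregate`, `info_nce`) and all parameter values are hypothetical.

```python
import math
import random


def kernel_view(feats, dim, bandwidth, seed=0):
    """Approximate RBF-kernel feature map (random Fourier features).

    Assumption: both views share the same random directions (same seed);
    only the bandwidth, i.e. the distance metric, differs between views.
    """
    rng = random.Random(seed)
    d = len(feats[0])
    w = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(dim)]
    b = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(dim)]
    scale = math.sqrt(2.0 / dim)
    return [
        [scale * math.cos(sum(wk * xk for wk, xk in zip(w_row, x)) / bandwidth + b_k)
         for w_row, b_k in zip(w, b)]
        for x in feats
    ]


def aggregate(adj, feats):
    """Mean over each node and its neighbors; the graph itself is never altered."""
    out = []
    for i, nbrs in enumerate(adj):
        group = [feats[i]] + [feats[j] for j in nbrs]
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm + 1e-12)


def info_nce(view_a, view_b, temperature=0.5):
    """Node-level InfoNCE: node i in view A is the positive for node i in view B."""
    n = len(view_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / n


# Hypothetical toy graph: 4 nodes (e.g. codes in one record), adjacency lists.
adj = [[1, 2], [0, 2], [0, 1, 3], [2]]
rng = random.Random(42)
feats = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in adj]

# Two views from two bandwidths, with the same graph structure for both.
view_a = aggregate(adj, kernel_view(feats, dim=16, bandwidth=0.5))
view_b = aggregate(adj, kernel_view(feats, dim=16, bandwidth=2.0))
loss = info_nce(view_a, view_b)
print(f"contrastive loss between the two kernel views: {loss:.4f}")
```

In a trainable setting, the loss would be minimized with respect to the encoder parameters, and a graph-level readout would be contrasted alongside the node embeddings. This sketch only evaluates the objective once on fixed features.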