Paper Title
PanRep: Graph neural networks for extracting universal node embeddings in heterogeneous graphs
Authors
Abstract
Learning unsupervised node embeddings facilitates several downstream tasks, such as node classification and link prediction. A node embedding is universal if it is designed to be used by, and to benefit, various downstream tasks. This work introduces PanRep, a graph neural network (GNN) model for unsupervised learning of universal node representations for heterogeneous graphs. PanRep consists of a GNN encoder that obtains node embeddings and four decoders, each capturing different topological and node feature properties. By abiding by these properties, the novel unsupervised framework learns universal embeddings applicable to different downstream tasks. PanRep can be further fine-tuned to account for possibly limited labels. In this operational setting, PanRep serves as a pretrained model for extracting node embeddings of heterogeneous graph data. PanRep outperforms all unsupervised and certain supervised methods in node classification and link prediction, especially when the labeled data available to the supervised methods is scarce. PanRep-FT (with fine-tuning) outperforms all other supervised approaches, which corroborates the merits of pretraining models. Finally, we apply PanRep-FT to discover novel drugs for Covid-19. We showcase the advantage of universal embeddings in drug repurposing and identify several drugs used in clinical trials as possible drug candidates.
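The encoder-plus-decoders design described in the abstract can be illustrated with a minimal sketch: one shared encoder produces node embeddings, and several decoder losses computed on those embeddings are summed into a single unsupervised objective. Everything in the code is an assumption made for illustration, not the paper's method. The abstract does not define the four decoders, so two hypothetical ones (`feature_loss`, `link_loss`) stand in for them, and `encode` is a simple relation-weighted mean aggregation rather than the actual PanRep encoder.

```python
from typing import Callable, Dict, List, Tuple

Vector = List[float]
# Heterogeneous graph: relation name -> list of (source, destination) node ids.
Edges = Dict[str, List[Tuple[int, int]]]


def encode(features: List[Vector], edges: Edges,
           rel_weight: Dict[str, float]) -> List[Vector]:
    """Hypothetical encoder: own features plus, for each relation,
    the weighted mean of the features of incoming neighbours."""
    dim = len(features[0])
    out = [list(x) for x in features]
    for rel, pairs in edges.items():
        sums = [[0.0] * dim for _ in features]
        counts = [0] * len(features)
        for src, dst in pairs:
            counts[dst] += 1
            for k in range(dim):
                sums[dst][k] += features[src][k]
        for v, c in enumerate(counts):
            if c:  # nodes without neighbours under this relation are unchanged
                for k in range(dim):
                    out[v][k] += rel_weight[rel] * sums[v][k] / c
    return out


def feature_loss(emb: List[Vector], features: List[Vector], edges: Edges) -> float:
    """Hypothetical feature decoder: mean squared error to the raw features."""
    n = len(emb) * len(emb[0])
    return sum((e - x) ** 2
               for ev, xv in zip(emb, features)
               for e, x in zip(ev, xv)) / n


def link_loss(emb: List[Vector], features: List[Vector], edges: Edges) -> float:
    """Hypothetical topology decoder: mean squared distance between the
    embeddings of the two endpoints of every observed edge."""
    pairs = [p for ps in edges.values() for p in ps]
    if not pairs:
        return 0.0
    return sum(sum((a - b) ** 2 for a, b in zip(emb[s], emb[d]))
               for s, d in pairs) / len(pairs)


Decoder = Callable[[List[Vector], List[Vector], Edges], float]


def total_loss(features: List[Vector], edges: Edges,
               rel_weight: Dict[str, float], decoders: List[Decoder]) -> float:
    """Shared embeddings feed every decoder; the losses are summed."""
    emb = encode(features, edges, rel_weight)
    return sum(dec(emb, features, edges) for dec in decoders)


# Toy graph with three nodes and two relation types (made-up data).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = {"cites": [(0, 2), (1, 2)], "writes": [(2, 0)]}
rel_weight = {"cites": 0.5, "writes": 0.5}
loss = total_loss(features, edges, rel_weight, [feature_loss, link_loss])
```

In a real training loop the encoder weights would be updated to minimise `loss`, and the resulting embeddings could then be reused or fine-tuned for a downstream task. The sketch only evaluates the objective once.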