Title

SCGC: Self-Supervised Contrastive Graph Clustering

Authors

Kulatilleke, Gayan K., Portmann, Marius, Chandra, Shekhar S.

Abstract

Graph clustering discovers groups or communities within networks. Deep learning methods such as autoencoders (AE) extract effective clustering and downstream representations but cannot incorporate rich structural information. While Graph Neural Networks (GNNs) have shown great success in encoding graph structure, typical GNNs based on convolution or attention variants suffer from over-smoothing, noise, and heterophily, are computationally expensive, and typically require the complete graph to be present. Instead, we propose Self-Supervised Contrastive Graph Clustering (SCGC), which imposes graph structure via contrastive loss signals to learn discriminative node representations and iteratively refined soft cluster labels. We also propose SCGC*, with a more effective, novel Influence Augmented Contrastive (IAC) loss that fuses richer structural information using half the original model parameters. SCGC(*) is faster, built from simple linear units that completely eliminate the convolutions and attention of traditional GNNs, yet it efficiently incorporates structure. It is impervious to layer depth and robust to over-smoothing, incorrect edges, and heterophily. It is scalable by batching, a limitation of many prior GNN models, and trivially parallelizable. We obtain significant improvements over the state of the art on a wide range of benchmark graph datasets, including images, sensor data, text, and citation networks: specifically, 20% on ARI and 18% on NMI for DBLP, along with an overall 55% reduction in training time and an overall 81% reduction in inference time. Our code is available at: https://github.com/gayanku/SCGC
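The abstract's central idea is to impose graph structure through a contrastive loss rather than through convolution or attention layers. The toy sketch below illustrates one generic way such a signal can work: each edge (i, j) is treated as a positive pair, and an InfoNCE-style softmax over cosine similarities pulls linked nodes together while pushing the anchor away from all other nodes. This is only an illustration of the general technique, not the paper's actual SCGC or IAC loss; the function name `neighbour_contrastive_loss` and the temperature `tau` are hypothetical.

```python
import math
import random

def neighbour_contrastive_loss(z, edges, tau=0.5):
    """InfoNCE-style toy loss: for each edge (i, j), treat z[j] as the
    positive for anchor z[i] and every other node as a negative."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb)

    n = len(z)
    total = 0.0
    for i, j in edges:
        # temperature-scaled similarity of anchor i to every other node
        denom = sum(math.exp(cos(z[i], z[k]) / tau) for k in range(n) if k != i)
        pos = math.exp(cos(z[i], z[j]) / tau)
        total += -math.log(pos / denom)
    return total / len(edges)

# Random 4-dimensional embeddings for 6 nodes and a tiny edge list.
random.seed(0)
z = [[random.gauss(0, 1) for _ in range(4)] for _ in range(6)]
edges = [(0, 1), (2, 3), (4, 5)]
print(neighbour_contrastive_loss(z, edges))
```

Because the structural signal enters only through the loss, the encoder producing `z` can be a plain linear map trained in mini-batches, which is the property the abstract credits for SCGC's speed and scalability.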
