Paper Title
Integrating Network Embedding and Community Outlier Detection via Multiclass Graph Description
Paper Authors
Paper Abstract
Network (or graph) embedding is the task of mapping the nodes of a graph to a lower-dimensional vector space such that the mapping preserves the graph's properties and facilitates downstream network mining tasks. Real-world networks often contain (community) outlier nodes, which behave differently from the regular nodes of their community. If not handled carefully, these outlier nodes can distort the embeddings of the regular nodes. In this paper, we propose a novel unsupervised graph embedding approach, called DMGD, which integrates outlier detection and community detection with node embedding. We extend the idea of deep support vector data description to graph embedding in the setting where the given network contains multiple communities and an outlier is characterized relative to its community. We also derive theoretical bounds on the number of outliers detected by DMGD. Our formulation reduces to a minimax game between the outliers, the community assignments, and the node embedding function, and we propose an efficient algorithm to solve this optimization problem. Experimental results on both synthetic and real-world networks show the merit of our approach compared to state-of-the-art methods.
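To make the idea of characterizing an outlier relative to its community more concrete, here is a minimal sketch in the spirit of support vector data description, where each community is enclosed by a hypersphere in the embedding space and a node lying outside its community's sphere is flagged. This is an illustration only, not the DMGD objective or algorithm: the function names, the nearest-center assignment rule, the fixed centers and radii, and the toy embeddings are all assumptions, whereas DMGD learns embeddings, community assignments, and outliers jointly.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def assign_and_score(embeddings, centers, radii):
    """Assign each node to its nearest community center and compute a slack.

    The slack is max(0, d^2 - R^2): zero when the node lies inside its
    community's hypersphere, positive when it lies outside.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    results = []
    for z in embeddings:
        dists = [squared_distance(z, c) for c in centers]
        k = min(range(len(centers)), key=lambda i: dists[i])
        slack = max(0.0, dists[k] - radii[k] ** 2)
        results.append((k, slack))
    return results


# Toy 2-D embeddings (assumed values): two tight groups and one stray node.
embeddings = [(0.1, 0.0), (0.0, 0.2), (5.0, 5.1), (5.2, 4.9), (2.5, 2.5)]
centers = [(0.0, 0.0), (5.0, 5.0)]  # assumed community centers
radii = [0.5, 0.5]                  # assumed community radii

results = assign_and_score(embeddings, centers, radii)
outliers = [i for i, (_, slack) in enumerate(results) if slack > 0]
print(outliers)
```

In this toy setup only the stray node at (2.5, 2.5) falls outside every sphere, so it is the only one reported as a community outlier.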