论文标题

Lazyfox:大图中的快速和并行重叠的社区检测

LazyFox: Fast and parallelized overlapping community detection in large graphs

论文作者

Garrels, Tim, Khodabakhsh, Athar, Renard, Bernhard Y., Baum, Katharina

论文摘要

图数据集中社区的检测提供了有关图形基础结构的见解,并且是社会科学,营销,流量预测和药物发现等各种领域的重要工具。尽管大多数现有的算法都提供了快速的社区检测方法,但它们的结果通常包含严格分离的社区。但是,大多数数据集在语义上允许甚至需要重叠的社区,这些社区只能以更高的计算成本确定。我们基于一种有效的算法Fox,该算法检测到这种重叠的社区。福克斯通过近似与该社区形成的三角形的计数来衡量一个节点与社区的亲密关系。我们提出了LazyFox,这是FOX算法的多线程版本,它提供了更快的检测,而不会影响社区质量。这允许分析明显更大,更复杂的数据集。 Lazyfox在复杂的图形数据集上具有重叠的社区检测,这些数据集在几天而不是几周内具有数百万个节点和数十亿个边缘。作为这项工作的一部分,LazyFox的实施已发布,可在https://github.com/timgarrels/lazyfox的MIT许可下作为工具提供。

The detection of communities in graph datasets provides insight about a graph's underlying structure and is an important tool for various domains such as social sciences, marketing, traffic forecast, and drug discovery. While most existing algorithms provide fast approaches for community detection, their results usually contain strictly separated communities. However, most datasets would semantically allow for or even require overlapping communities that can only be determined at much higher computational cost. We build on an efficient algorithm, Fox, that detects such overlapping communities. Fox measures the closeness of a node to a community by approximating the count of triangles which that node forms with that community. We propose LazyFox, a multi-threaded version of the Fox algorithm, which provides even faster detection without an impact on community quality. This allows for the analyses of significantly larger and more complex datasets. LazyFox enables overlapping community detection on complex graph datasets with millions of nodes and billions of edges in days instead of weeks. As part of this work, LazyFox's implementation was published and is available as a tool under an MIT licence at https://github.com/TimGarrels/LazyFox.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源