渐进无损图摘要

论文标题

渐进无损图摘要

Incremental Lossless Graph Summarization

论文作者

Ko, Jihoon, Kook, Yunbum, Shin, Kijung

论文摘要

给定一个完全动态的图，表示为边缘插入和删除的流，我们如何获得并逐步更新其当前快照的无损摘要？由于大规模的图表很普遍，因此不可避免地代表它们以进行有效的存储和分析。无损图摘要是具有许多理想特性的有效图形压缩技术。它的目的是将输入图表示为（a）由超节点（即节点的集合）和超级（即超节点之间的边缘）组成的摘要图，这些图提供了粗略的描述，并且（b）边缘更正固定了由粗略描述诱导的误差。尽管已经开发了许多适合静态图的批次算法用于快速和紧凑的图形摘要，但在动态图的时间和空间方面，它们在实践中很常见。在这项工作中，我们提出了Mosso，这是第一个用于完全动态图的无损汇总的增量算法。为了响应输入图中的每个更改，Mosso通过在超节点之间反复移动节点来更新输出表示。 Mosso决定要搬家，并根据几个新颖的想法仔细但迅速的目的地。 Through extensive experiments on 10 real graphs, we show MoSSo is (a) Fast and 'any time': processing each change in near-constant time (less than 0.1 millisecond), up to 7 orders of magnitude faster than running state-of-the-art batch methods, (b) Scalable: summarizing graphs with hundreds of millions of edges, requiring sub-linear memory during the process, and (c) Effective: achieving comparable compression比率甚至是最先进的批处理方法。

Given a fully dynamic graph, represented as a stream of edge insertions and deletions, how can we obtain and incrementally update a lossless summary of its current snapshot? As large-scale graphs are prevalent, concisely representing them is inevitable for efficient storage and analysis. Lossless graph summarization is an effective graph-compression technique with many desirable properties. It aims to compactly represent the input graph as (a) a summary graph consisting of supernodes (i.e., sets of nodes) and superedges (i.e., edges between supernodes), which provide a rough description, and (b) edge corrections which fix errors induced by the rough description. While a number of batch algorithms, suited for static graphs, have been developed for rapid and compact graph summarization, they are highly inefficient in terms of time and space for dynamic graphs, which are common in practice. In this work, we propose MoSSo, the first incremental algorithm for lossless summarization of fully dynamic graphs. In response to each change in the input graph, MoSSo updates the output representation by repeatedly moving nodes among supernodes. MoSSo decides nodes to be moved and their destinations carefully but rapidly based on several novel ideas. Through extensive experiments on 10 real graphs, we show MoSSo is (a) Fast and 'any time': processing each change in near-constant time (less than 0.1 millisecond), up to 7 orders of magnitude faster than running state-of-the-art batch methods, (b) Scalable: summarizing graphs with hundreds of millions of edges, requiring sub-linear memory during the process, and (c) Effective: achieving comparable compression ratios even to state-of-the-art batch methods.

下载PDF全文

下载文献需遵守相关版权规定

论文标题