论文标题
增强学习增强了加权抽样,以在完全动态的图流上进行准确的子图计算
Reinforcement Learning Enhanced Weighted Sampling for Accurate Subgraph Counting on Fully Dynamic Graph Streams
论文作者
论文摘要
随着图形数据的普及,对于各种应用程序,越来越需要计算关注的子图模式的发生。许多图的规模庞大,并且完全动态(带有边缘的插入和缺失),使这些计数的精确计算是不可行的。相反,常见的做法是使用一小部分边缘作为样本来估计计数。完全动态图的现有采样算法以均匀的概率对边进行了样品。在本文中,我们表明,如果我们根据其各个属性进行采样,我们可以做得更好。具体而言,我们提出了一种称为WSD的加权采样算法,用于在完全动态的图流中估算子图计数,该算法根据其权重采样边缘,以表明其重要性并反映其特性。我们使用基于强化学习的新方法以数据驱动的方式确定边缘的权重。我们进行了广泛的实验,以验证与现有算法相比,我们的技术是否可以产生较小的错误,而运行速度通常更快。
As the popularity of graph data increases, there is a growing need to count the occurrences of subgraph patterns of interest, for a variety of applications. Many graphs are massive in scale and also fully dynamic (with insertions and deletions of edges), rendering exact computation of these counts to be infeasible. Common practice is, instead, to use a small set of edges as a sample to estimate the counts. Existing sampling algorithms for fully dynamic graphs sample the edges with uniform probability. In this paper, we show that we can do much better if we sample edges based on their individual properties. Specifically, we propose a weighted sampling algorithm called WSD for estimating the subgraph count in a fully dynamic graph stream, which samples the edges based on their weights that indicate their importance and reflect their properties. We determine the weights of edges in a data-driven fashion, using a novel method based on reinforcement learning. We conduct extensive experiments to verify that our technique can produce estimates with smaller errors while often running faster compared with existing algorithms.