论文标题
链接预测中拓扑特征的最大能力
The maximum capability of a topological feature in link prediction
论文作者
论文摘要
网络通过表示基础的成对交互作用,为建模复杂系统提供了强大的方法。链接预测是预测无法直接可见的网络的链接的任务,并在生物,社会和其他复杂系统中采用了深刻的应用。尽管该任务中拓扑特征的大量利用,但尚不清楚可以在多大程度上利用一项功能来推断丢失的链接。在这里,我们的目标是通过确定其预测性能上限,以揭示拓扑特征在链接预测中的能力。我们引入了一个与不同索引兼容的理论框架,以衡量特征,使用该功能的不同预测方法以及不同的指标来量化预测性能。拓扑特征的最大功能遵循一个简单但经过理论上验证的表达式,这仅取决于在缺失和不存在的链接中保留该功能的程度。由于基于同一特征的索引家族具有相同的上限,因此所有其他索引的潜力都可以从一个单个索引中估算出来。此外,在监督预测中提高了功能的功能,该预测可以在数学上进行量化,从而使我们能够估计应用机器学习算法的好处。 550个结构上多样化的网络在经验上验证了所发现的模式的普遍性。这些发现在功能和方法选择中具有应用,并阐明了网络特征,使拓扑特征有效地有效。
Networks offer a powerful approach to modeling complex systems by representing the underlying set of pairwise interactions. Link prediction is the task that predicts links of a network that are not directly visible, with profound applications in biological, social, and other complex systems. Despite intensive utilization of the topological feature in this task, it is unclear to what extent a feature can be leveraged to infer missing links. Here, we aim to unveil the capability of a topological feature in link prediction by identifying its prediction performance upper bound. We introduce a theoretical framework that is compatible with different indexes to gauge the feature, different prediction approaches to utilize the feature, and different metrics to quantify the prediction performance. The maximum capability of a topological feature follows a simple yet theoretically validated expression, which only depends on the extent to which the feature is held in missing and nonexistent links. Because a family of indexes based on the same feature shares the same upper bound, the potential of all others can be estimated from one single index. Furthermore, a feature's capability is lifted in the supervised prediction, which can be mathematically quantified, allowing us to estimate the benefit of applying machine learning algorithms. The universality of the pattern uncovered is empirically verified by 550 structurally diverse networks. The findings have applications in feature and method selection, and shed light on network characteristics that make a topological feature effective in link prediction.