Paper title
$k$-Means Clustering for Persistent Homology
Paper authors
Paper abstract
Persistent homology is a methodology central to topological data analysis that extracts and summarizes the topological features within a dataset as a persistence diagram; it has recently gained much popularity through its many successful applications across domains. However, its algebraic construction induces a metric space of persistence diagrams with a highly complex geometry. In this paper, we prove convergence of the $k$-means clustering algorithm on persistence diagram space and establish theoretical properties of the solution to the optimization problem in the Karush--Kuhn--Tucker framework. Additionally, we perform numerical experiments on various representations of persistent homology, including embeddings of persistence diagrams as well as diagrams themselves and their generalizations as persistence measures; we find that $k$-means clustering applied directly to persistence diagrams and measures outperforms clustering on their vectorized representations.