Paper Title

$k$-Means Clustering for Persistent Homology

Authors

Yueqi Cao, Prudence Leung, Anthea Monod

Abstract

Persistent homology is a methodology central to topological data analysis that extracts and summarizes the topological features within a dataset as a persistence diagram; it has recently gained much popularity from its myriad successful applications to many domains. However, its algebraic construction induces a metric space of persistence diagrams with a highly complex geometry. In this paper, we prove convergence of the $k$-means clustering algorithm on persistence diagram space and establish theoretical properties of the solution to the optimization problem in the Karush--Kuhn--Tucker framework. Additionally, we perform numerical experiments on various representations of persistent homology, including embeddings of persistence diagrams as well as diagrams themselves and their generalizations as persistence measures; we find that $k$-means clustering applied directly to persistence diagrams and measures outperforms clustering on their vectorized representations.
