论文标题

拓扑数据分析的入门,以支持环境科学中的图像分析任务

A Primer on Topological Data Analysis to Support Image Analysis Tasks in Environmental Science

论文作者

Hoef, Lander Ver, Adams, Henry, King, Emily J., Ebert-Uphoff, Imme

论文摘要

拓扑数据分析(TDA)是来自数据科学和数学的工具,它开始在环境科学领域引起波浪。在这项工作中,我们寻求对TDA工具的直观且可理解的介绍,该工具对于分析图像(即持续存在同源性)特别有用。我们简要讨论理论背景,但主要关注理解该工具的输出并讨论它可以收集的信息。为此,我们围绕着一个指导示例进行讨论,该指导示例是对RASP等人研究的糖,鱼类,花朵和砾石数据集进行分类。 al。 2020年(Arxiv:1906:01906)。我们证明了如何使用简单的机器学习算法来获得良好的结果,并详细探讨如何通过图像级特征来解释这种行为,并探索如何使用简单的机器学习算法来使用持久的同源性及其矢量化,持久性景观。持续同源性的核心优势之一是它的解释性可解释,因此在本文中,我们不仅讨论了我们发现的模式,而且要考虑到为什么我们对持久同源性理论的了解,因此可以期待这些结果。我们的目标是,本文的读者将更好地了解TDA和持续的同源性,能够确定自己的问题和数据集,为此持续的同源性可能会有所帮助,并从应用随附的GitHub示例代码中获得对结果的理解。

Topological data analysis (TDA) is a tool from data science and mathematics that is beginning to make waves in environmental science. In this work, we seek to provide an intuitive and understandable introduction to a tool from TDA that is particularly useful for the analysis of imagery, namely persistent homology. We briefly discuss the theoretical background but focus primarily on understanding the output of this tool and discussing what information it can glean. To this end, we frame our discussion around a guiding example of classifying satellite images from the Sugar, Fish, Flower, and Gravel Dataset produced for the study of mesocale organization of clouds by Rasp et. al. in 2020 (arXiv:1906:01906). We demonstrate how persistent homology and its vectorization, persistence landscapes, can be used in a workflow with a simple machine learning algorithm to obtain good results, and explore in detail how we can explain this behavior in terms of image-level features. One of the core strengths of persistent homology is how interpretable it can be, so throughout this paper we discuss not just the patterns we find, but why those results are to be expected given what we know about the theory of persistent homology. Our goal is that a reader of this paper will leave with a better understanding of TDA and persistent homology, be able to identify problems and datasets of their own for which persistent homology could be helpful, and gain an understanding of results they obtain from applying the included GitHub example code.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源