Paper Title

Galaxy Image Classification using Hierarchical Data Learning with Weighted Sampling and Label Smoothing

Authors

Ma, Xiaohua; Li, Xiangru; Luo, Ali; Zhang, Jinqu; Li, Hui

Abstract

With the development of a series of galaxy sky surveys in recent years, the volume of observations has increased rapidly, making machine learning methods for galaxy image recognition an active research topic. Existing automatic galaxy image recognition methods are hampered by three problems: large differences in similarity between categories, imbalance in the amount of data across classes, and the discrepancy between the discrete representation of galaxy classes and the essentially gradual change from one morphological class to the adjacent class (DDRGC). These limitations have motivated several astronomers and machine learning experts to design methods with improved galaxy image recognition capabilities. This paper therefore proposes a novel learning method, "Hierarchical Imbalanced data learning with Weighted sampling and Label smoothing" (HIWL). HIWL consists of three key techniques, each addressing one of the three problems above: (1) a hierarchical galaxy classification model based on an efficient backbone network; (2) a weighted sampling scheme to handle the class imbalance; and (3) a label smoothing technique to alleviate the DDRGC problem. We applied this method to galaxy photometric images from the Galaxy Zoo-The Galaxy Challenge, covering the recognition of five classes: completely round smooth, in between smooth, cigar-shaped, edge-on, and spiral. The overall classification accuracy is 96.32%, and comparisons with related works in terms of recall, precision, and F1-score show some advantages of HIWL. In addition, we explored visualizations of the galaxy image features and of the model's attention to understand the basis of the proposed scheme.
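To make techniques (2) and (3) concrete, here is a minimal Python sketch of inverse-frequency weighted sampling and uniform label smoothing. It is only an illustration under stated assumptions: the abstract does not give the exact weighting formula or smoothing rule used in HIWL, so the function names, the `1 / class count` weights, and the `epsilon = 0.1` default are all hypothetical choices rather than the authors' implementation.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed scheme: each sample is weighted by 1 / (size of its class)."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def weighted_sample(labels, k, seed=0):
    """Draw k sample indices with replacement so that, in expectation,
    every class is drawn equally often regardless of its size."""
    rng = random.Random(seed)
    weights = inverse_frequency_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=k)


def smooth_label(class_index, num_classes, epsilon=0.1):
    """Uniform label smoothing (assumed form): spread epsilon evenly over
    all classes and give the remaining 1 - epsilon to the true class."""
    target = [epsilon / num_classes] * num_classes
    target[class_index] += 1.0 - epsilon
    return target


if __name__ == "__main__":
    # Toy imbalanced label set: class 0 has three samples, class 1 has one.
    labels = [0, 0, 0, 1]
    print(inverse_frequency_weights(labels))
    print(weighted_sample(labels, k=8))
    # Soft target for the third of five morphological classes.
    print(smooth_label(2, num_classes=5))
```

With these weights the total sampling mass of each class is equal, so minority classes are seen about as often as majority ones during training. The smoothed target keeps most of the probability on the labelled class while leaving a small amount for the others, which is one way to soften the hard boundary between neighbouring morphological classes.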
