Weakly Supervised Semantic Segmentation via Progressive Patch Learning

Paper Title

Weakly Supervised Semantic Segmentation via Progressive Patch Learning

Authors

Li, Jinlong; Jie, Zequn; Wang, Xu; Zhou, Yu; Wei, Xiaolin; Ma, Lin

Abstract

Most existing semantic segmentation approaches that use image-level class labels as supervision rely heavily on the initial class activation map (CAM) generated by a standard classification network. In this paper, a novel "Progressive Patch Learning" approach is proposed to improve the classifier's extraction of local details, producing CAMs that cover the whole object rather than only the most discriminative regions, as CAMs from conventional classification models do. "Patch Learning" splits the feature maps into patches and processes each local patch independently and in parallel before the final aggregation. This mechanism forces the network to find weak information in the scattered discriminative local parts, making it more sensitive to local details. "Progressive Patch Learning" further extends feature splitting and patch learning to multiple levels of granularity in a progressive manner. Combined with a multi-stage optimization strategy, the "Progressive Patch Learning" mechanism implicitly gives the model the ability to extract features across different locality granularities. As an alternative to this implicit multi-granularity progressive fusion, we additionally propose an explicit method that fuses features from different granularities simultaneously in a single model, further improving CAM quality in terms of full object coverage. Our proposed method achieves outstanding performance on the PASCAL VOC 2012 dataset (e.g., 69.6% mIoU on the test set), surpassing most existing weakly supervised semantic segmentation methods. Code will be made publicly available at https://github.com/TyroneLi/PPL_WSSS.
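
The split, process, and aggregate pattern that the abstract calls "Patch Learning" can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the per-patch operation or the aggregation step, so the function names (`split_into_patches`, `merge_patches`, `patch_process`), the non-overlapping grid layout, and the per-patch normalization used as the local operation are all assumptions made for illustration.

```python
import numpy as np


def split_into_patches(feat, grid):
    """Split a (C, H, W) feature map into grid*grid non-overlapping patches."""
    c, h, w = feat.shape
    assert h % grid == 0 and w % grid == 0, "H and W must be divisible by grid"
    ph, pw = h // grid, w // grid
    # (C, grid, ph, grid, pw) -> (grid, grid, C, ph, pw)
    patches = feat.reshape(c, grid, ph, grid, pw).transpose(1, 3, 0, 2, 4)
    return patches.reshape(grid * grid, c, ph, pw)


def merge_patches(patches, grid):
    """Inverse of split_into_patches: reassemble patches into (C, H, W)."""
    n, c, ph, pw = patches.shape
    assert n == grid * grid, "patch count must equal grid*grid"
    # (grid, grid, C, ph, pw) -> (C, grid, ph, grid, pw)
    feat = patches.reshape(grid, grid, c, ph, pw).transpose(2, 0, 3, 1, 4)
    return feat.reshape(c, grid * ph, grid * pw)


def local_normalize(patch):
    """Hypothetical per-patch operation (an assumption, not from the paper):
    normalize each channel using statistics of this patch only, so weak
    responses are not dominated by strong responses elsewhere in the map."""
    mean = patch.mean(axis=(1, 2), keepdims=True)
    std = patch.std(axis=(1, 2), keepdims=True)
    return (patch - mean) / (std + 1e-6)


def patch_process(feat, grid, fn):
    """Apply fn to every patch independently, then aggregate by reassembly."""
    patches = split_into_patches(feat, grid)
    processed = np.stack([fn(p) for p in patches])
    return merge_patches(processed, grid)
```

Under the same assumptions, the "progressive" variant would call `patch_process` with a sequence of grid sizes (for example 4, then 2, then 1) across training stages; the actual granularities and schedule are not given in the abstract.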
