Title

Feature Selective Transformer for Semantic Image Segmentation

Authors

Fangjian Lin, Tianyi Wu, Sitong Wu, Shengwei Tian, Guodong Guo

Abstract


Recently, fusing multi-scale features for semantic image segmentation has attracted increasing attention. Various works have been proposed that employ progressive local or global fusion, but such fusion is not rich enough to model multi-scale context features. In this work, we focus on fusing multi-scale features from Transformer-based backbones for semantic segmentation and propose a Feature Selective Transformer (FeSeFormer), which aggregates features from all scales (or levels) for each query feature. Specifically, we first propose a Scale-level Feature Selection (SFS) module, which chooses an informative subset from the whole multi-scale feature set for each scale: features that are important for the current scale (or level) are selected, and redundant ones are discarded. Furthermore, we propose a Full-scale Feature Fusion (FFF) module, which adaptively fuses features of all scales for queries. Building on the proposed SFS and FFF modules, we develop the Feature Selective Transformer (FeSeFormer) and evaluate it on four challenging semantic segmentation benchmarks, including PASCAL Context, ADE20K, COCO-Stuff 10K, and Cityscapes, outperforming the state of the art.
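The abstract does not give the internals of the SFS and FFF modules, but the overall idea — select an informative subset of features per scale, then fuse features from all scales for each query via learned weighting — can be illustrated with a minimal numpy sketch. The norm-based scoring and dot-product attention below are hypothetical stand-ins for the paper's learned selection and fusion; function names and shapes are my own, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def select_informative(features, k):
    """Hypothetical scale-level selection (SFS stand-in): score each
    feature vector by its L2 norm and keep the top-k, discarding the
    rest as redundant. The paper uses a learned criterion instead."""
    scores = np.linalg.norm(features, axis=1)
    top = np.argsort(scores)[::-1][:k]
    return features[top]

def fuse_full_scale(query, multi_scale_feats, k=8):
    """Hypothetical full-scale fusion (FFF stand-in): pool the selected
    features from every scale, then combine them with softmax weights
    derived from dot-product similarity to the query."""
    pool = np.concatenate(
        [select_informative(f, k) for f in multi_scale_feats], axis=0
    )
    logits = pool @ query                      # similarity per pooled feature
    weights = np.exp(logits - logits.max())    # numerically stable softmax
    weights /= weights.sum()
    return weights @ pool                      # query-adaptive fused feature

# Toy multi-scale features: three "scales" with decreasing token counts.
scales = [rng.standard_normal((n, 16)) for n in (64, 16, 4)]
query = rng.standard_normal(16)
fused = fuse_full_scale(query, scales)
print(fused.shape)  # (16,)
```

In a real model the selection scores and fusion weights would be produced by learnable layers and applied per query position; the sketch only shows the select-then-fuse data flow across scales.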
