Title

Feature Selective Transformer for Semantic Image Segmentation

Authors

Fangjian Lin, Tianyi Wu, Sitong Wu, Shengwei Tian, Guodong Guo

Abstract


Recently, fusing multi-scale features for semantic image segmentation has attracted increasing attention. Various works have been proposed that employ progressive local or global fusion, but such fusion is not rich enough to model multi-scale context features. In this work, we focus on fusing multi-scale features from Transformer-based backbones for semantic segmentation and propose a Feature Selective Transformer (FeSeFormer), which aggregates features from all scales (or levels) for each query feature. Specifically, we first propose a Scale-level Feature Selection (SFS) module, which chooses an informative subset from the whole multi-scale feature set for each scale: features that are important for the current scale (or level) are selected, and redundant ones are discarded. Furthermore, we propose a Full-scale Feature Fusion (FFF) module, which adaptively fuses features of all scales for queries. Building on the proposed SFS and FFF modules, we develop the Feature Selective Transformer (FeSeFormer) and evaluate it on four challenging semantic segmentation benchmarks, including PASCAL Context, ADE20K, COCO-Stuff 10K, and Cityscapes, outperforming the state of the art.
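The abstract does not give the internals of the SFS and FFF modules, but the overall idea — select an informative subset of features per scale, then fuse features from all scales for each query via learned weighting — can be illustrated with a minimal numpy sketch. The norm-based scoring and dot-product attention below are hypothetical stand-ins for the paper's learned selection and fusion; function names and shapes are my own, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def select_informative(features, k):
    """Hypothetical scale-level selection (SFS stand-in): score each
    feature vector by its L2 norm and keep the top-k, discarding the
    rest as redundant. The paper uses a learned criterion instead."""
    scores = np.linalg.norm(features, axis=1)
    top = np.argsort(scores)[::-1][:k]
    return features[top]

def fuse_full_scale(query, multi_scale_feats, k=8):
    """Hypothetical full-scale fusion (FFF stand-in): pool the selected
    features from every scale, then combine them with softmax weights
    derived from dot-product similarity to the query."""
    pool = np.concatenate(
        [select_informative(f, k) for f in multi_scale_feats], axis=0
    )
    logits = pool @ query                      # similarity per pooled feature
    weights = np.exp(logits - logits.max())    # numerically stable softmax
    weights /= weights.sum()
    return weights @ pool                      # query-adaptive fused feature

# Toy multi-scale features: three "scales" with decreasing token counts.
scales = [rng.standard_normal((n, 16)) for n in (64, 16, 4)]
query = rng.standard_normal(16)
fused = fuse_full_scale(query, scales)
print(fused.shape)  # (16,)
```

In a real model the selection scores and fusion weights would be produced by learnable layers and applied per query position; the sketch only shows the select-then-fuse data flow across scales.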
