Urban Scene Semantic Segmentation with Low-Cost Coarse Annotation

Paper Title

Urban Scene Semantic Segmentation with Low-Cost Coarse Annotation

Authors

Anurag Das, Yongqin Xian, Yang He, Zeynep Akata, Bernt Schiele

Abstract

For best performance, today's semantic segmentation methods use large and carefully labeled datasets, requiring expensive annotation budgets. In this work, we show that coarse annotation is a low-cost but highly effective alternative for training semantic segmentation models. Considering the urban scene segmentation scenario, we leverage cheap coarse annotations for real-world captured data, as well as synthetic data to train our model and show competitive performance compared with finely annotated real-world data. Specifically, we propose a coarse-to-fine self-training framework that generates pseudo labels for unlabeled regions of the coarsely annotated data, using synthetic data to improve predictions around the boundaries between semantic classes, and using cross-domain data augmentation to increase diversity. Our extensive experimental results on Cityscapes and BDD100k datasets demonstrate that our method achieves a significantly better performance vs annotation cost tradeoff, yielding a comparable performance to fully annotated data with only a small fraction of the annotation budget. Also, when used as pretraining, our framework performs better compared to the standard fully supervised setting.
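
The pseudo-labeling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the confidence threshold, and the use of 255 as the "unlabeled" marker are all assumptions made for illustration. It only shows the basic step of keeping coarse annotations where they exist and filling unlabeled pixels with confident model predictions; the synthetic-data and cross-domain augmentation components are not covered.

```python
import numpy as np

IGNORE_INDEX = 255  # assumed marker for pixels without a coarse annotation


def fill_unlabeled_with_pseudo_labels(coarse_labels, class_probs, threshold=0.9):
    """Keep coarse annotations; fill unlabeled pixels with confident predictions.

    coarse_labels: (H, W) integer array, IGNORE_INDEX where no annotation exists.
    class_probs:   (H, W, C) array of per-pixel class probabilities from a model.
    threshold:     hypothetical confidence cutoff for accepting a pseudo label.
    """
    coarse_labels = np.asarray(coarse_labels)
    class_probs = np.asarray(class_probs)

    predicted = class_probs.argmax(axis=-1)   # most likely class per pixel
    confidence = class_probs.max(axis=-1)     # probability of that class

    # Only pixels that are unlabeled AND confidently predicted get a pseudo label.
    unlabeled = coarse_labels == IGNORE_INDEX
    accept = unlabeled & (confidence >= threshold)

    merged = coarse_labels.copy()
    merged[accept] = predicted[accept]
    return merged


# Toy 2x2 image with two classes; two pixels carry coarse labels, two do not.
coarse = np.array([[0, 255], [255, 1]])
probs = np.array([
    [[0.2, 0.8], [0.05, 0.95]],
    [[0.6, 0.4], [0.9, 0.1]],
])
merged = fill_unlabeled_with_pseudo_labels(coarse, probs)
```

In this toy example the annotated pixels are never overwritten, the confidently predicted unlabeled pixel receives a pseudo label, and the low-confidence one stays unlabeled so that a training loss could skip it.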
