Instance As Identity: A Generic Online Paradigm for Video Instance Segmentation

Paper Title

Instance As Identity: A Generic Online Paradigm for Video Instance Segmentation

Authors

Feng Zhu, Zongxin Yang, Xin Yu, Yi Yang, Yunchao Wei

Abstract

Modeling temporal information for both detection and tracking in a unified framework has proven to be a promising solution to video instance segmentation (VIS). However, how to effectively incorporate temporal information into an online model remains an open problem. In this work, we propose a new online VIS paradigm named Instance As Identity (IAI), which models temporal information for both detection and tracking in an efficient way. In detail, IAI employs a novel identification module to explicitly predict identification numbers for tracked instances. To pass temporal information across frames, IAI utilizes an association module that combines current features with past embeddings. Notably, IAI can be integrated with different image models. We conduct extensive experiments on three VIS benchmarks. IAI outperforms all online competitors on YouTube-VIS-2019 (ResNet-101, 43.7 mAP) and YouTube-VIS-2021 (ResNet-50, 38.0 mAP). Surprisingly, on the more challenging OVIS benchmark, IAI achieves state-of-the-art performance (20.6 mAP). Code is available at https://github.com/zfonemore/IAI
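The abstract describes instances carrying identification numbers across frames, with current features combined with past embeddings. The following is only a toy sketch of that general idea, not the IAI identification or association module: the class name `OnlineIdentityAssigner`, the cosine-similarity matching, the `threshold` and `momentum` parameters, and the greedy assignment are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


class OnlineIdentityAssigner:
    """Toy online tracker that keeps one embedding per identity number.

    Hypothetical illustration only; this is not the method from the paper.
    """

    def __init__(self, threshold=0.8, momentum=0.5):
        self.threshold = threshold  # minimum similarity to reuse an identity
        self.momentum = momentum    # weight kept on the past embedding
        self.memory = {}            # identity number -> stored embedding
        self.next_id = 1

    def step(self, embeddings):
        """Assign an identity number to each instance embedding of one frame."""
        ids = []
        used = set()
        for emb in embeddings:
            # Greedily pick the most similar stored identity not yet used
            # in this frame, provided it clears the threshold.
            best_id, best_sim = None, self.threshold
            for ident, past in self.memory.items():
                sim = cosine(emb, past)
                if ident not in used and sim >= best_sim:
                    best_id, best_sim = ident, sim
            if best_id is None:
                # No match: open a new identity for this instance.
                best_id = self.next_id
                self.next_id += 1
                self.memory[best_id] = list(emb)
            else:
                # Match: blend the past embedding with the current one.
                m = self.momentum
                past = self.memory[best_id]
                self.memory[best_id] = [m * p + (1 - m) * c for p, c in zip(past, emb)]
            used.add(best_id)
            ids.append(best_id)
        return ids
```

In this sketch, `step` would be called once per frame in temporal order, so each call sees only the current frame and the stored past embeddings, which is what makes the procedure online.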
