Domain-General Crowd Counting in Unseen Scenarios

Paper Title

Domain-General Crowd Counting in Unseen Scenarios

Authors

Zhipeng Du, Jiankang Deng, Miaojing Shi

Abstract

Domain shift across crowd data severely hinders crowd counting models from generalizing to unseen scenarios. Although domain-adaptive crowd counting approaches close this gap to a certain extent, they still depend on target-domain data to adapt (e.g., fine-tune) their models to the specific domain. In this paper, we aim to train a model on a single source domain that generalizes well to any unseen domain. This falls into the realm of domain generalization, which remains unexplored in crowd counting. We first introduce a dynamic sub-domain division scheme that divides the source domain into multiple sub-domains so that we can initiate a meta-learning framework for domain generalization. The sub-domain division is dynamically refined during meta-learning. Next, to disentangle domain-invariant information from domain-specific information in image features, we design domain-invariant and domain-specific crowd memory modules to re-encode image features. Two types of losses, namely feature reconstruction and orthogonal losses, are devised to enable this disentanglement. Extensive experiments on several standard crowd counting benchmarks, i.e., SHA, SHB, QNRF, and NWPU, show the strong generalizability of our method.
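
The abstract names the two losses but does not define them, so the following is only an illustrative sketch of how memory-based re-encoding with a feature reconstruction loss and an orthogonal loss could look. Everything here is an assumption rather than the authors' formulation: the attention-style `memory_read`, the mean-squared-error reconstruction, the squared-cosine orthogonality penalty, and all function names are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def memory_read(feature, memory):
    """Re-encode a feature as an attention-weighted sum of memory slots.

    Assumption: similarity is a plain dot product; the paper may differ.
    """
    weights = softmax([dot(feature, slot) for slot in memory])
    return [
        sum(w * slot[i] for w, slot in zip(weights, memory))
        for i in range(len(feature))
    ]


def reconstruction_loss(feature, invariant, specific):
    """Mean squared error between the feature and the sum of both re-encodings."""
    errors = [(f - (a + b)) ** 2 for f, a, b in zip(feature, invariant, specific)]
    return sum(errors) / len(feature)


def orthogonal_loss(invariant, specific):
    """Squared cosine similarity; zero when the two re-encodings are orthogonal."""
    norm = math.sqrt(dot(invariant, invariant) * dot(specific, specific))
    if norm == 0.0:
        return 0.0
    return (dot(invariant, specific) / norm) ** 2


# Toy example: the two hypothetical memories span disjoint coordinates.
feature = [1.0, 0.0, 0.5, -0.5]
invariant_memory = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
specific_memory = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

invariant = memory_read(feature, invariant_memory)
specific = memory_read(feature, specific_memory)
total_loss = reconstruction_loss(feature, invariant, specific) + orthogonal_loss(invariant, specific)
```

In this toy setup the two memories occupy disjoint coordinates, so the orthogonal term vanishes and only the reconstruction term contributes. In a real model the memory slots would be learned parameters and both terms would be added to the counting loss.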
