Paper Title
A Reproducible and Realistic Evaluation of Partial Domain Adaptation Methods
Authors
Abstract
Unsupervised Domain Adaptation (UDA) aims to classify unlabeled target images by leveraging labeled source images. In this work, we consider the Partial Domain Adaptation (PDA) variant, in which the source domain contains extra classes that are not present in the target domain. Most successful algorithms use model selection strategies that rely on target labels to find the best hyper-parameters and/or models during training. However, these strategies violate the main assumption of PDA: only unlabeled target domain samples are available. Moreover, the experimental settings (architecture, hyper-parameter tuning, number of runs) are inconsistent across works, yielding unfair comparisons. The main goal of this work is to provide a realistic evaluation of PDA methods with different model selection strategies under a consistent evaluation protocol. We evaluate 7 representative PDA algorithms on 2 different real-world datasets using 7 different model selection strategies. Our two main findings are: (i) without target labels for model selection, the accuracy of the methods decreases by up to 30 percentage points; (ii) only one pair of method and model selection strategy performs well on both datasets. Experiments were performed with our PyTorch framework, BenchmarkPDA, which we open source.