Paper title
Slimmable Domain Adaptation
Paper authors
Paper abstract
Vanilla unsupervised domain adaptation methods tend to optimize a model with a fixed neural architecture, which is impractical in real-world scenarios because the target data is usually processed by different resource-limited devices. It is therefore necessary to facilitate architecture adaptation across various devices. In this paper, we introduce a simple framework, Slimmable Domain Adaptation, to improve cross-domain generalization with a weight-sharing model bank, from which models of different capacities can be sampled to accommodate different accuracy-efficiency trade-offs. The main challenge in this framework lies in simultaneously boosting the adaptation performance of the numerous models in the model bank. To tackle this problem, we develop a Stochastic EnsEmble Distillation method to fully exploit the complementary knowledge in the model bank for inter-model interaction. Nevertheless, considering the optimization conflict between inter-model interaction and intra-model adaptation, we augment the existing bi-classifier domain confusion architecture into an Optimization-Separated Tri-Classifier counterpart. After optimizing the model bank, architecture adaptation is performed via our proposed Unsupervised Performance Evaluation Metric. Under various resource constraints, our framework surpasses other competing approaches by a very large margin on multiple benchmarks. It is also worth emphasizing that our framework preserves the performance improvement over the source-only model even when the computing complexity is reduced to $1/64$. Code will be available at https://github.com/hikvision-research/SlimDA.
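The abstract's two central ideas, a weight-sharing model bank and distillation from an ensemble of sampled sub-models, can be illustrated with a small toy sketch. This is not the authors' implementation: the class `SharedBank`, the helpers `ensemble_target`, `distill_loss`, and `conv_macs`, the plain averaging of predictions, and the layer sizes are all assumptions made for illustration. The last function also shows one plausible reading of the $1/64$ figure: a convolution's cost scales with the product of input and output channels, so shrinking both by a factor of 8 cuts the cost by 64.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class SharedBank:
    """Hypothetical weight-sharing model bank with one hidden layer.

    Every sub-model reuses the first `width` hidden units of the same
    shared weight matrices, so no sub-model owns separate parameters.
    """

    def __init__(self, in_dim, max_hidden, num_classes, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
                   for _ in range(max_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(max_hidden)]
                   for _ in range(num_classes)]
        self.max_hidden = max_hidden

    def forward(self, x, width):
        # Slice the shared weights: only the first `width` hidden units run.
        hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)))
                  for row in self.w1[:width]]
        logits = [sum(w * h for w, h in zip(row[:width], hidden))
                  for row in self.w2]
        return softmax(logits)


def ensemble_target(bank, x, widths):
    """Average the predictions of several sampled sub-models.

    Assumption: plain averaging stands in for whatever aggregation
    the paper actually uses.
    """
    probs = [bank.forward(x, w) for w in widths]
    return [sum(p[c] for p in probs) / len(probs)
            for c in range(len(probs[0]))]


def distill_loss(student_probs, target):
    """Cross-entropy of a student's prediction against the ensemble target."""
    return -sum(t * math.log(s) for t, s in zip(target, student_probs))


def conv_macs(c_in, c_out, kernel, height, width):
    """Multiply-accumulate count of a dense 2D convolution (stride 1)."""
    return c_in * c_out * kernel * kernel * height * width


bank = SharedBank(in_dim=4, max_hidden=16, num_classes=3)
x = [0.2, -0.1, 0.7, 0.4]

# Stochastically sample a few sub-model widths from the bank.
sampler = random.Random(1)
widths = [sampler.randint(2, bank.max_hidden) for _ in range(4)]
target = ensemble_target(bank, x, widths)

# Distill the ensemble knowledge into the smallest sub-model.
loss = distill_loss(bank.forward(x, 2), target)

# Shrinking both channel counts by 8 reduces the cost by 8 * 8 = 64.
full = conv_macs(64, 128, 3, 32, 32)
slim = conv_macs(64 // 8, 128 // 8, 3, 32, 32)
```

In a real training loop the distillation loss would be backpropagated through the shared weights, so each update to a small sub-model also changes the larger ones; the sketch omits gradients to stay dependency-free.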