Paper Title
Towards All-around Knowledge Transferring: Learning From Task-irrelevant Labels
Paper Authors
Paper Abstract
Deep neural models have achieved significant performance on numerous classification tasks, but they require sufficient manually annotated data. Since it is extremely time-consuming and expensive to annotate adequate data for each classification task, learning a model that generalizes well on small datasets has received increasing attention. Existing efforts mainly focus on transferring task-relevant knowledge from other, similar data to tackle the issue. These approaches have yielded remarkable improvements, yet they neglect the fact that task-irrelevant features can introduce massive negative transfer effects. To date, no large-scale studies have been performed to investigate the impact of task-irrelevant features, let alone the utilization of such features. In this paper, we first propose Task-Irrelevant Transfer Learning (TIRTL) to exploit task-irrelevant features, which are mainly extracted from task-irrelevant labels. In particular, we suppress the expression of task-irrelevant information and thereby facilitate the learning process of classification. We also provide a theoretical explanation of our method. In addition, TIRTL does not conflict with methods that exploit task-relevant knowledge and can be readily combined with them, enabling the simultaneous utilization of task-relevant and task-irrelevant features for the first time. To verify the effectiveness of our theory and method, we conduct extensive experiments on facial expression recognition and digit recognition tasks. Our source code will also be made available for reproducibility.
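The abstract states that TIRTL "suppresses the expression of task-irrelevant information" learned from task-irrelevant labels, but does not specify the mechanism here. The sketch below is only an illustration of that general idea, not the paper's method: it uses a standard gradient-reversal auxiliary head (as in domain-adversarial training) so that a shared encoder is discouraged from encoding information that predicts the task-irrelevant label. All module names, dimensions, and the loss weighting are assumptions.

```python
# Hedged illustration, NOT the TIRTL method from the paper: a shared encoder with
# (1) a task head trained on task-relevant labels and
# (2) an auxiliary head trained on task-irrelevant labels whose gradient is reversed,
# so the encoder is pushed to suppress task-irrelevant information in its features.
import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negates and scales gradients in the backward pass."""

    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None


class IrrelevanceSuppressedNet(nn.Module):
    # All sizes are placeholders (e.g., 784-dim inputs as in digit recognition).
    def __init__(self, in_dim=784, feat_dim=128, n_task_classes=10, n_irrelevant_classes=5):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())
        self.task_head = nn.Linear(feat_dim, n_task_classes)              # task-relevant labels
        self.irrelevant_head = nn.Linear(feat_dim, n_irrelevant_classes)  # task-irrelevant labels

    def forward(self, x, lam=1.0):
        feat = self.encoder(x)
        task_logits = self.task_head(feat)
        # Reversed gradient: the encoder is trained to make the task-irrelevant
        # label *hard* to predict from the shared features.
        irrelevant_logits = self.irrelevant_head(GradReverse.apply(feat, lam))
        return task_logits, irrelevant_logits


# One hypothetical training step where both label types are available for a batch.
model = IrrelevanceSuppressedNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

x = torch.randn(32, 784)              # dummy inputs
y_task = torch.randint(0, 10, (32,))  # task-relevant labels (e.g., digit or expression class)
y_irr = torch.randint(0, 5, (32,))    # task-irrelevant labels (e.g., subject identity)

task_logits, irr_logits = model(x)
loss = criterion(task_logits, y_task) + criterion(irr_logits, y_irr)
loss.backward()
optimizer.step()
```

The gradient-reversal trick is one common way to remove unwanted information from a shared representation; the actual TIRTL formulation and its theoretical analysis are described in the paper itself.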