Precise High-Dimensional Asymptotics for Quantifying Heterogeneous Transfers

Paper Title

Precise High-Dimensional Asymptotics for Quantifying Heterogeneous Transfers

Authors

Fan Yang, Hongyang R. Zhang, Sen Wu, Christopher Ré, Weijie J. Su

Abstract

The problem of learning one task using samples from another task is central to transfer learning. In this paper, we focus on answering the following question: when does combining the samples from two related tasks perform better than learning with one target task alone? This question is motivated by an empirical phenomenon known as negative transfer, which has been observed in practice. While the transfer effect from one task to another depends on factors such as their sample sizes and the spectrum of their covariance matrices, precisely quantifying this dependence has remained a challenging problem. In order to compare a transfer learning estimator to single-task learning, one needs to compare the risks between the two estimators precisely. Further, the comparison depends on the distribution shifts between the two tasks. This paper applies recent developments of random matrix theory to tackle this challenge in a high-dimensional linear regression setting with two tasks. We show precise high-dimensional asymptotics for the bias and variance of a classical hard parameter sharing (HPS) estimator in the proportional limit, where the sample sizes of both tasks increase proportionally with dimension at fixed ratios. The precise asymptotics apply to various types of distribution shifts, including covariate shifts, model shifts, and combinations of both. We illustrate these results in a random-effects model to mathematically prove a phase transition from positive to negative transfer as the number of source task samples increases. One insight from the analysis is that a rebalanced HPS estimator, which downsizes the source task when the model shift is high, achieves the minimax optimal rate. The finding regarding phase transition also applies to multiple tasks when covariates are shared across tasks. Simulations validate the accuracy of the high-dimensional asymptotics for finite dimensions.
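The comparison described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's code: it assumes isotropic Gaussian covariates, a pooled least-squares fit as a simplified stand-in for the HPS estimator, and a source parameter equal to the target parameter plus an isotropic random perturbation as a stand-in for the random-effects model. The function name `simulate_risks` and all parameter names and values are hypothetical. It also assumes `n_target > p`, so that the single-task least-squares fit is well defined.

```python
import numpy as np


def simulate_risks(n_source, n_target, p, shift_scale, noise_std, seed=0):
    """Return (pooled_risk, single_task_risk) for one random draw.

    Assumptions (not taken from the paper): isotropic Gaussian covariates,
    and a source parameter that equals the target parameter plus an
    isotropic perturbation whose squared norm is roughly shift_scale**2.
    """
    rng = np.random.default_rng(seed)

    # Target parameter with squared norm close to 1.
    beta_target = rng.normal(size=p) / np.sqrt(p)
    # Model shift between the two tasks.
    beta_source = beta_target + shift_scale * rng.normal(size=p) / np.sqrt(p)

    # Source (task 1) and target (task 2) samples.
    X1 = rng.normal(size=(n_source, p))
    X2 = rng.normal(size=(n_target, p))
    y1 = X1 @ beta_source + noise_std * rng.normal(size=n_source)
    y2 = X2 @ beta_target + noise_std * rng.normal(size=n_target)

    # Pooled least squares: one shared parameter fitted on both tasks.
    X_pooled = np.vstack([X1, X2])
    y_pooled = np.concatenate([y1, y2])
    beta_pooled = np.linalg.lstsq(X_pooled, y_pooled, rcond=None)[0]

    # Single-task least squares on the target samples only.
    beta_single = np.linalg.lstsq(X2, y2, rcond=None)[0]

    # With identity covariance, excess risk on the target task is the
    # squared Euclidean distance to the true target parameter.
    pooled_risk = float(np.sum((beta_pooled - beta_target) ** 2))
    single_risk = float(np.sum((beta_single - beta_target) ** 2))
    return pooled_risk, single_risk


if __name__ == "__main__":
    # Sweep the source sample size at a fixed, moderate model shift.
    for n_source in (50, 100, 200, 400, 800, 1600):
        pooled, single = simulate_risks(
            n_source=n_source, n_target=100, p=50,
            shift_scale=0.5, noise_std=0.5,
        )
        print(f"n_source={n_source:5d}  pooled={pooled:.4f}  single={single:.4f}")
```

In this toy setting, pooling helps when the model shift is zero, because the extra samples only reduce variance. With a large shift and many source samples, the bias inherited from the source task dominates and pooling hurts. Where the crossover falls depends on the chosen shift, noise level, and sample sizes.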
