Title
Adaptive Sketches for Robust Regression with Importance Sampling
Authors
Abstract
We introduce data structures for solving robust regression through stochastic gradient descent (SGD) by sampling gradients with probability proportional to their norm, i.e., importance sampling. Although SGD is widely used for large-scale machine learning, it is known to converge slowly in some cases because of the high variance of uniform sampling. Importance sampling, on the other hand, can significantly decrease the variance, but it is usually difficult to implement: computing the sampling probabilities requires additional passes over the data, in which case standard gradient descent (GD) could be used instead. In this paper, we introduce an algorithm that approximately samples $T$ gradients of dimension $d$ from nearly the optimal importance sampling distribution for a robust regression problem over $n$ rows. Thus our algorithm effectively runs $T$ steps of SGD with importance sampling while using sublinear space and making just a single pass over the data. Our techniques also extend to performing importance sampling for second-order optimization.
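To make the sampling rule concrete, here is a minimal sketch of the exact, multi-pass baseline that the abstract contrasts against. It is not the paper's single-pass sketching algorithm. Several things are assumptions made for illustration: the Huber loss as the robust loss (the abstract does not name one), the function names `huber_grad_scale`, `row_gradient`, and `importance_sgd`, and the parameters `lr` and `delta`. At each step, row $i$ is drawn with probability $p_i$ proportional to its gradient norm, and the sampled gradient is reweighted by $1/(n p_i)$ so that the step remains an unbiased estimate of the mean gradient.

```python
import math
import random


def huber_grad_scale(residual, delta=1.0):
    """Derivative of the Huber loss with respect to the residual."""
    if abs(residual) <= delta:
        return residual
    return delta if residual > 0 else -delta


def row_gradient(a, b, x, delta=1.0):
    """Gradient at x of the Huber loss on one row: psi(<a, x> - b) * a."""
    r = sum(ai * xi for ai, xi in zip(a, x)) - b
    s = huber_grad_scale(r, delta)
    return [s * ai for ai in a]


def importance_sgd(A, b, steps, lr, delta=1.0, seed=0):
    """SGD where row i is sampled with probability proportional to its gradient norm.

    Assumption: this recomputes all n gradient norms at every step, which is
    exactly the extra cost the abstract says a single-pass method must avoid.
    """
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    x = [0.0] * d
    for _ in range(steps):
        grads = [row_gradient(A[i], b[i], x, delta) for i in range(n)]
        norms = [math.sqrt(sum(g * g for g in gi)) for gi in grads]
        total = sum(norms)
        if total == 0.0:
            break  # every row gradient vanishes, so x is already a minimizer
        i = rng.choices(range(n), weights=norms, k=1)[0]
        # p_i = norms[i] / total; reweighting by 1 / (n * p_i) keeps the
        # expected step equal to the mean gradient over all rows.
        scale = total / (n * norms[i])
        x = [xj - lr * scale * gj for xj, gj in zip(x, grads[i])]
    return x
```

Calling `importance_sgd(A, b, steps=T, lr=0.1)` on an $n \times d$ list of rows `A` and targets `b` runs $T$ such steps. Each step costs $O(nd)$ time and needs access to every row, which is the overhead that motivates a sublinear-space, single-pass sampler.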