Paper Title

Tensor-Based Sketching Method for the Low-Rank Approximation of Data Streams

Authors

Liu, Cuiyu; Xiao, Chuanfu; Ding, Mingshuo; Yang, Chao

Abstract

Low-rank approximation of data streams is a fundamental and important task in computing science, machine learning, and statistics. Many streaming algorithms have emerged over the years, most of them inspired by randomized algorithms and, more specifically, by sketching methods. However, many of these algorithms cannot exploit information in the data stream and consequently suffer from low accuracy. Existing data-driven methods improve accuracy, but their training cost is high in practice. In this paper, we take a subspace perspective and propose a tensor-based sketching method for the low-rank approximation of data streams. The proposed algorithm fully exploits the structure of the data stream and obtains quasi-optimal sketching matrices by performing tensor decomposition on training data. A series of experiments shows that the proposed tensor-based method can be more accurate and much faster than previous work.
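For readers unfamiliar with sketching, the following is a minimal sketch of a generic two-sided randomized sketch for a matrix that arrives as a sum of linear updates. It is not the tensor-based method proposed in the paper: the sketching matrices here are plain Gaussian random matrices (a data-oblivious baseline), whereas the paper obtains its sketching matrices from training data. The function names `sketch_stream` and `reconstruct`, the oversampling value, and the sketch sizes are illustrative assumptions.

```python
import numpy as np

def sketch_stream(updates, shape, rank, oversample=5, seed=0):
    """Accumulate two linear sketches of A = sum(updates) without storing A.

    Assumption: Gaussian sketching matrices, a data-oblivious baseline,
    not the learned sketching matrices described in the abstract.
    """
    m, n = shape
    k = rank + oversample      # width of the range sketch
    l = 2 * k + 1              # height of the co-range sketch
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal((n, k))
    psi = rng.standard_normal((l, m))
    Y = np.zeros((m, k))       # will equal A @ omega
    W = np.zeros((l, n))       # will equal psi @ A
    for H in updates:          # each update H has the same shape as A
        Y += H @ omega         # sketches are linear, so they can be
        W += psi @ H           # updated one piece of the stream at a time
    return Y, W, psi

def reconstruct(Y, W, psi):
    """Return factors Q (m x k) and X (k x n) such that A is roughly Q @ X."""
    Q, _ = np.linalg.qr(Y)     # orthonormal basis for the sketched range
    # Least-squares solve of (psi @ Q) X = W for the coefficient matrix X.
    X, *_ = np.linalg.lstsq(psi @ Q, W, rcond=None)
    return Q, X

# Usage: stream an exactly rank-5 matrix in four column blocks.
rng = np.random.default_rng(1)
A = rng.standard_normal((60, 5)) @ rng.standard_normal((5, 40))
updates = []
for j in range(0, 40, 10):
    H = np.zeros_like(A)
    H[:, j:j + 10] = A[:, j:j + 10]
    updates.append(H)

Y, W, psi = sketch_stream(updates, A.shape, rank=5)
Q, X = reconstruct(Y, W, psi)
rel_err = np.linalg.norm(A - Q @ X) / np.linalg.norm(A)
```

Because the test matrix has exact rank 5, `rel_err` should be close to floating-point precision. On real data the error depends on how well the sketching matrices capture the dominant subspaces, which is the gap that data-driven choices of sketching matrices aim to close.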
