Paper Title

Parameter Sensitivity Analysis of the SparTen High Performance Sparse Tensor Decomposition Software: Extended Analysis

Authors

Jeremy M. Myers, Daniel M. Dunlavy, Keita Teranishi, D. S. Hollman

Abstract

Tensor decomposition models play an increasingly important role in modern data science applications. One problem of particular interest is fitting a low-rank Canonical Polyadic (CP) tensor decomposition model when the tensor has sparse structure and the tensor elements are nonnegative count data. SparTen is a high-performance C++ library which computes a low-rank decomposition using different solvers: a first-order quasi-Newton or a second-order damped Newton method, along with the appropriate choice of runtime parameters. Since default parameters in SparTen are tuned to experimental results in prior published work on a single real-world dataset conducted using MATLAB implementations of these methods, it remains unclear if the parameter defaults in SparTen are appropriate for general tensor data. Furthermore, it is unknown how sensitive algorithm convergence is to changes in the input parameter values. This report addresses these unresolved issues with large-scale experimentation on three benchmark tensor data sets. Experiments were conducted on several different CPU architectures and replicated with many initial states to establish generalized profiles of algorithm convergence behavior.
