Paper title
Bayesian Low-Rank Interpolative Decomposition for Complex Datasets
Authors
Abstract
In this paper, we introduce a probabilistic model for learning the interpolative decomposition (ID), which is commonly used for feature selection, low-rank approximation, and identifying hidden patterns in data. In this model, the matrix factors are latent variables associated with each data dimension. Prior densities with support on the specified subspace are used to enforce the constraint on the magnitude of the factored component of the observed matrix. A Bayesian inference procedure based on Gibbs sampling is employed. We evaluate the model on a variety of real-world datasets of different sizes and dimensions, including the CCLE EC50, CCLE IC50, CTRP EC50, and MovieLens 100K datasets, and show that the proposed Bayesian ID GBT and GBTN models lead to smaller reconstruction errors than existing randomized approaches.
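For readers unfamiliar with the decomposition itself, the following is a minimal sketch of a rank-k column ID, in which a matrix A is approximated as C W, where C holds k actual columns of A and W is a weight matrix containing an identity block. It uses a deterministic column-pivoted QR rather than the Bayesian Gibbs sampler (GBT/GBTN) proposed in the paper; the function name `column_id` and the synthetic test matrix are assumptions made purely for illustration.

```python
import numpy as np
from scipy.linalg import qr, solve_triangular


def column_id(A, k):
    """Rank-k column ID via column-pivoted QR (illustrative only;
    this is not the Gibbs-sampling procedure from the paper)."""
    # A[:, piv] = Q @ R, with R upper triangular
    _, R, piv = qr(A, mode="economic", pivoting=True)
    # Solve R11 @ T = R12 so the remaining columns are expressed
    # in terms of the k selected columns
    T = solve_triangular(R[:k, :k], R[:k, k:])
    W = np.zeros((k, A.shape[1]))
    W[:, piv[:k]] = np.eye(k)  # selected columns reproduce themselves
    W[:, piv[k:]] = T          # interpolation weights for the rest
    return piv[:k], W


# Synthetic matrix of exact rank 5 (an assumption, not one of the paper's datasets)
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 5)) @ rng.standard_normal((5, 30))

cols, W = column_id(A, 5)
C = A[:, cols]  # C consists of actual columns of A
rel_err = np.linalg.norm(A - C @ W) / np.linalg.norm(A)
```

Because C is built from observed columns, the selected indices `cols` can be read directly as a feature selection, and `rel_err` is the kind of reconstruction error on which the abstract compares methods.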