Paper Title

Regularized Estimation of Information via High Dimensional Canonical Correlation Analysis

Authors

Jaume Riba, Ferran de Cabrera

Abstract

In recent years, interest has grown in estimating information from data in many areas beyond communications. This paper estimates the information between two random phenomena using well-established second-order statistical tools. The squared-loss mutual information is chosen for this purpose as a natural surrogate for Shannon mutual information. The rationale is developed for i.i.d. discrete sources (mapping data onto the simplex space) and for analog sources (mapping data onto the characteristic space), highlighting links with other well-known related concepts in the literature that are based on local approximations of information-theoretic measures. The proposed approach gains in interpretability and scalability for use on large datasets and gives a physical interpretation to the free regularization parameters. Moreover, the structure of the proposed mapping allows resorting to Szegö's theorem to reduce the complexity of high-dimensional mappings, exhibiting strong dualities with spectral analysis. The performance of the proposed estimators is analyzed using Gaussian mixtures.
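
To make the discrete-source case concrete, the following is a minimal sketch of a plug-in squared-loss mutual information (SMI) computation, not the regularized estimator proposed in the paper. It assumes the common convention SMI = 1/2 Σ p(x)p(y) (p(x,y)/(p(x)p(y)) − 1)², which may differ from the paper's normalization, and it relies on the standard fact that the singular values of the normalized joint probability matrix are the canonical correlations between one-hot (simplex-vertex) encodings of the two variables. The function name `smi_discrete` is hypothetical.

```python
import numpy as np

def smi_discrete(x, y):
    """Plug-in squared-loss mutual information of two discrete samples.

    Assumed convention: SMI = 0.5 * sum p(x)p(y) * (p(x,y)/(p(x)p(y)) - 1)^2.
    """
    # Index each observed symbol; the index plays the role of a one-hot vector.
    _, xi = np.unique(np.asarray(x).ravel(), return_inverse=True)
    _, yi = np.unique(np.asarray(y).ravel(), return_inverse=True)

    # Empirical joint pmf from the contingency table.
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1.0)
    joint /= joint.sum()

    # Marginals are strictly positive because only observed symbols are indexed.
    px = joint.sum(axis=1)
    py = joint.sum(axis=0)

    # Normalized joint matrix: its singular values are the canonical
    # correlations of the one-hot encoded data; the largest is the trivial 1.
    c = joint / np.sqrt(np.outer(px, py))
    s = np.linalg.svd(c, compute_uv=False)

    # Sum of squared canonical correlations, minus the trivial one.
    return 0.5 * (np.sum(s ** 2) - 1.0)
```

Under this convention the value is zero when the empirical joint distribution factorizes, and it grows with the number of nontrivial canonical correlations. The plug-in form has no regularization, so it can be badly biased when the alphabet is large relative to the sample size.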
