Paper title

Toroidal PCA via density ridges

Authors

Eduardo García-Portugués, Arturo Prieto-Tirado

Abstract

Principal Component Analysis (PCA) is a well-known linear dimension-reduction technique designed for Euclidean data. In a wide spectrum of applied fields, however, it is common to observe multivariate circular data (also known as toroidal data), rendering spurious the use of PCA on it due to the periodicity of its support. This paper introduces Toroidal Ridge PCA (TR-PCA), a novel construction of PCA for bivariate circular data that leverages the concept of density ridges as a flexible first principal component analog. Two reference bivariate circular distributions, the bivariate sine von Mises and the bivariate wrapped Cauchy, are employed as the parametric distributional basis of TR-PCA. Efficient algorithms are presented to compute density ridges for these two distribution models. A complete PCA methodology adapted to toroidal data (including scores, variance decomposition, and resolution of edge cases) is introduced and implemented in the companion R package ridgetorus. The usefulness of TR-PCA is showcased with a novel case study involving the analysis of ocean currents on the coast of Santa Barbara.
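The periodicity issue raised in the abstract already shows up for a single circular variable, before any PCA is attempted. The sketch below is an illustration only and is not taken from the paper or from ridgetorus: the sample values and the helper `circular_mean` are hypothetical. It contrasts the ordinary arithmetic mean, on which Euclidean PCA centers the data, with the standard circular mean direction for angles clustered around π.

```python
import math


def circular_mean(angles):
    """Mean direction of angles in radians, returned in [-pi, pi]."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c)


# Hypothetical sample: directions tightly clustered around pi, but recorded
# in [-pi, pi), so half of them appear near -pi and half near +pi.
angles = [math.pi - 0.1, -math.pi + 0.1, math.pi - 0.2, -math.pi + 0.2]

# The arithmetic mean treats -pi and +pi as far apart and lands near 0,
# the direction opposite to where the data actually concentrate.
linear_mean = sum(angles) / len(angles)

# The circular mean respects the periodicity and lands near +/- pi.
circ_mean = circular_mean(angles)

print(f"linear mean:   {linear_mean:.4f}")
print(f"circular mean: {circ_mean:.4f}")
```

Because Euclidean PCA centers on the arithmetic mean and measures spread with linear distances, the same wrap-around effect distorts its components on toroidal data, which is the motivation the abstract gives for a construction that works on the torus itself.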
