Paper Title

Accelerating Barnes-Hut t-SNE Algorithm by Efficient Parallelization on Multi-Core CPUs

Authors

Chaudhary, Narendra; Pivovar, Alexander; Yakovlev, Pavel; Gorshkov, Andrey; Misra, Sanchit

Abstract

t-SNE remains one of the most popular embedding techniques for visualizing high-dimensional data. Most standard packages of t-SNE, such as scikit-learn, use the Barnes-Hut t-SNE (BH t-SNE) algorithm for large datasets. However, existing CPU implementations of this algorithm are inefficient. In this work, we accelerate BH t-SNE on CPUs via cache optimizations, SIMD, parallelizing sequential steps, and improving the parallelization of multithreaded steps. On a 32-core Intel(R) Icelake cloud instance, our implementation (Acc-t-SNE) is up to 261x faster than scikit-learn and up to 4x faster than the state-of-the-art BH t-SNE implementation from daal4py.
