Hierarchical Spherical CNNs with Lifting-based Adaptive Wavelets for Pooling and Unpooling

Paper Title

Hierarchical Spherical CNNs with Lifting-based Adaptive Wavelets for Pooling and Unpooling

Authors

Mingxing Xu, Chenglin Li, Wenrui Dai, Siheng Chen, Junni Zou, Pascal Frossard, Hongkai Xiong

Abstract

Pooling and unpooling are two essential operations in constructing hierarchical spherical convolutional neural networks (HS-CNNs) for comprehensive feature learning in the spherical domain. Most existing models employ downsampling-based pooling, which will inevitably incur information loss and cannot adapt to different spherical signals and tasks. Besides, the preserved information after pooling cannot be well restored by the subsequent unpooling to characterize the desirable features for a task. In this paper, we propose a novel framework of HS-CNNs with a lifting structure to learn adaptive spherical wavelets for pooling and unpooling, dubbed LiftHS-CNN, which ensures a more efficient hierarchical feature learning for both image- and pixel-level tasks. Specifically, adaptive spherical wavelets are learned with a lifting structure that consists of trainable lifting operators (i.e., update and predict operators). With this learnable lifting structure, we can adaptively partition a signal into two sub-bands containing low- and high-frequency components, respectively, and thus generate a better down-scaled representation for pooling by preserving more information in the low-frequency sub-band. The update and predict operators are parameterized with graph-based attention to jointly consider the signal's characteristics and the underlying geometries. We further show that particular properties are promised by the learned wavelets, ensuring the spatial-frequency localization for better exploiting the signal's correlation in both spatial and frequency domains. We then propose an unpooling operation that is invertible to the lifting-based pooling, where an inverse wavelet transform is performed by using the learned lifting operators to restore an up-scaled representation. Extensive empirical evaluations on various spherical domain tasks validate the superiority of the proposed LiftHS-CNN.
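
The split / predict / update structure described in the abstract, and the way an inverse transform undoes it, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the paper's implementation: the signal is a 1D list rather than a spherical mesh, the split is a plain even/odd split, the operators `predict` and `update` are fixed scalings standing in for the learned graph-attention operators, and the predict-then-update order is the classical lifting order rather than one confirmed by the abstract. The function names `lift_pool` and `lift_unpool` are hypothetical.

```python
# Minimal lifting-scheme sketch: pooling = forward transform, unpooling = inverse.
# Assumption: a 1D even-length signal and fixed weights stand in for the
# spherical mesh and the trainable graph-attention operators.

def predict(even):
    # Hypothetical predict operator: estimate each odd sample from its even neighbour.
    return [1.0 * e for e in even]

def update(detail):
    # Hypothetical update operator: correct the even samples using the details.
    return [0.5 * d for d in detail]

def lift_pool(signal):
    """Forward lifting (split, predict, update). Returns (coarse, detail)."""
    even, odd = signal[0::2], signal[1::2]
    # High-frequency sub-band: what the predict operator fails to explain.
    detail = [o - p for o, p in zip(odd, predict(even))]
    # Low-frequency sub-band: the down-scaled representation used for pooling.
    coarse = [e + u for e, u in zip(even, update(detail))]
    return coarse, detail

def lift_unpool(coarse, detail):
    """Inverse lifting: undo update, undo predict, then merge the samples."""
    even = [c - u for c, u in zip(coarse, update(detail))]
    odd = [d + p for d, p in zip(detail, predict(even))]
    signal = [0.0] * (len(even) + len(odd))
    signal[0::2], signal[1::2] = even, odd
    return signal

if __name__ == "__main__":
    coarse, detail = lift_pool([2.0, 4.0, 6.0, 8.0])
    print(coarse, detail)
    print(lift_unpool(coarse, detail))
```

With these particular weights the coarse band is the pairwise mean of neighbouring samples. In this sketch the inverse is exact whatever `predict` and `update` compute, because each inverse step subtracts or adds back the same quantity the forward step used; it does need both sub-bands as input.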
