Paper Title
Points to Patches: Enabling the Use of Self-Attention for 3D Shape Recognition
Paper Authors
Paper Abstract
While the Transformer architecture has become ubiquitous in the machine learning field, its adaptation to 3D shape recognition is non-trivial. Due to its quadratic computational complexity, the self-attention operator quickly becomes inefficient as the set of input points grows larger. Furthermore, we find that the attention mechanism struggles to find useful connections between individual points on a global scale. To alleviate these problems, we propose a two-stage Point Transformer-in-Transformer (Point-TnT) approach which combines local and global attention mechanisms, enabling both individual points and patches of points to attend to each other effectively. Experiments on shape classification show that such an approach provides more useful features for downstream tasks than the baseline Transformer, while also being more computationally efficient. In addition, we extend our method to feature matching for scene reconstruction, showing that it can be used in conjunction with existing scene reconstruction pipelines.
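The abstract's core idea can be illustrated with a minimal sketch: attend locally among the points inside each patch, then globally among per-patch summary tokens, so that global attention runs over P patch tokens instead of N points (reducing the dominant cost from O(N²) toward O(P·k² + P²) for P patches of k points). The sketch below is an illustrative assumption, not the authors' Point-TnT implementation; in particular, the uniform reshaping into patches, the mean-pooling of patch tokens, and all layer sizes are placeholders (a real pipeline would group points by spatial neighborhoods, e.g. farthest-point sampling with kNN).

```python
# Hedged sketch of two-stage local/global attention over point patches.
# All shapes, the patch-grouping strategy, and layer sizes are assumptions.
import torch
import torch.nn as nn

class LocalGlobalAttention(nn.Module):
    def __init__(self, dim=64, num_heads=4):
        super().__init__()
        self.embed = nn.Linear(3, dim)  # per-point xyz -> feature
        self.local_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.global_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, points, patch_size=32):
        # points: (B, N, 3), with N divisible by patch_size -- a simplifying
        # assumption standing in for a proper spatial grouping of points.
        B, N, _ = points.shape
        x = self.embed(points)                    # (B, N, dim)
        P = N // patch_size
        x = x.reshape(B * P, patch_size, -1)      # split into P patches
        # Stage 1: local attention among the points within each patch.
        x, _ = self.local_attn(x, x, x)
        # Pool each patch into one token, so global attention sees P tokens
        # instead of N points, sidestepping the quadratic cost over N.
        tokens = x.mean(dim=1).reshape(B, P, -1)  # (B, P, dim)
        # Stage 2: global attention among patch tokens.
        tokens, _ = self.global_attn(tokens, tokens, tokens)
        return tokens                             # patch-level features

# Usage: 1024 points per cloud attend locally in 32-point patches,
# then the resulting 32 patch tokens attend globally.
feats = LocalGlobalAttention()(torch.randn(2, 1024, 3))
print(feats.shape)  # torch.Size([2, 32, 64])
```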