Learning with Capsules: A Survey

Paper title

Learning with Capsules: A Survey

Authors

Fabio De Sousa Ribeiro, Kevin Duarte, Miles Everett, Georgios Leontidis, Mubarak Shah

Abstract

Capsule networks were proposed as an alternative approach to Convolutional Neural Networks (CNNs) for learning object-centric representations, which can be leveraged for improved generalization and sample complexity. Unlike CNNs, capsule networks are designed to explicitly model part-whole hierarchical relationships by using groups of neurons to encode visual entities, and learn the relationships between those entities. Promising early results achieved by capsule networks have motivated the deep learning community to continue trying to improve their performance and scalability across several application areas. However, a major hurdle for capsule network research has been the lack of a reliable point of reference for understanding their foundational ideas and motivations. The aim of this survey is to provide a comprehensive overview of the capsule network research landscape, which will serve as a valuable resource for the community going forward. To that end, we start with an introduction to the fundamental concepts and motivations behind capsule networks, such as equivariant inference in computer vision. We then cover the technical advances in the capsule routing mechanisms and the various formulations of capsule networks, e.g. generative and geometric. Additionally, we provide a detailed explanation of how capsule networks relate to the popular attention mechanism in Transformers, and highlight non-trivial conceptual similarities between them in the context of representation learning. Afterwards, we explore the extensive applications of capsule networks in computer vision, video and motion, graph representation learning, natural language processing, medical imaging and many others. To conclude, we provide an in-depth discussion regarding the main hurdles in capsule network research, and highlight promising research directions for future work.
