Multi-layered tensor networks for image classification

Paper title

Multi-layered tensor networks for image classification

Authors

Raghavendra Selvan, Silas Ørting, Erik B. Dam

Abstract

The recently introduced locally orderless tensor network (LoTeNet) for supervised image classification uses matrix product state (MPS) operations on grids of transformed image patches. The resulting patch representations are combined back together into the image space and aggregated hierarchically using multiple MPS blocks per layer to obtain the final decision rules. In this work, we propose a non-patch based modification to LoTeNet that performs one MPS operation per layer, instead of several patch-level operations. The spatial information in the input images to MPS blocks at each layer is squeezed into the feature dimension, similar to LoTeNet, to maximise retained spatial correlation between pixels when images are flattened into 1D vectors. The proposed multi-layered tensor network (MLTN) is capable of learning linear decision boundaries in high dimensional spaces in a multi-layered setting, which results in a reduction in the computation cost compared to LoTeNet without any degradation in performance.
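
The squeeze step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `squeeze` and `flatten_sites`, the block size `k`, and the channels-last layout are assumptions made only for illustration. The idea shown is that each k x k block of neighbouring pixels is moved into the feature dimension, so that neighbours stay together when the image is later flattened into a 1D sequence of sites for an MPS.

```python
import numpy as np


def squeeze(image, k):
    """Move each k x k spatial block into the feature dimension.

    image: array of shape (H, W, C), channels last (an assumption).
    Returns an array of shape (H // k, W // k, k * k * C).
    """
    h, w, c = image.shape
    if h % k != 0 or w % k != 0:
        raise ValueError("H and W must be divisible by k")
    # Split both spatial axes into (block index, offset within block).
    x = image.reshape(h // k, k, w // k, k, c)
    # Bring the two block indices first, the within-block offsets after.
    x = x.transpose(0, 2, 1, 3, 4)
    # Merge the offsets and channels into a single feature axis.
    return x.reshape(h // k, w // k, k * k * c)


def flatten_sites(squeezed):
    """Flatten the spatial grid into a 1D sequence of feature vectors."""
    h, w, d = squeezed.shape
    return squeezed.reshape(h * w, d)


if __name__ == "__main__":
    img = np.arange(16).reshape(4, 4, 1)
    sq = squeeze(img, 2)
    sites = flatten_sites(sq)
    print(sq.shape, sites.shape)
```

In this sketch a 4 x 4 single-channel image becomes a sequence of 4 sites with 4 features each, and each site holds one 2 x 2 neighbourhood of the original image rather than 4 pixels from a single row.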
