Title
Invariance encoding in sliced-Wasserstein space for image classification with limited training data
Authors
Abstract
Deep convolutional neural networks (CNNs) are broadly considered to be state-of-the-art generic end-to-end image classification systems. However, they are known to underperform when training data are limited and thus require data augmentation strategies that render the method computationally expensive and not always effective. Rather than using a data augmentation strategy to encode invariances as typically done in machine learning, here we propose to mathematically augment a nearest subspace classification model in sliced-Wasserstein space by exploiting certain mathematical properties of the Radon Cumulative Distribution Transform (R-CDT), a recently introduced image transform. We demonstrate that for a particular type of learning problem, our mathematical solution has advantages over data augmentation with deep CNNs in terms of classification accuracy and computational complexity, and is particularly effective under a limited training data setting. The method is simple, effective, computationally efficient, non-iterative, and requires no parameters to be tuned. Python code implementing our method is available at https://github.com/rohdelab/mathematical_augmentation. Our method is integrated as a part of the software package PyTransKit, which is available at https://github.com/rohdelab/PyTransKit.
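The idea of enriching a nearest subspace classifier rather than augmenting the training data can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the PyTransKit API. It assumes the feature vectors have already been mapped to a transform space (for example by the R-CDT) in which the nuisance deformation acts along known directions. The names `fit_subspaces`, `predict`, and `extra_basis`, and the toy 4-dimensional data, are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_subspaces(features, labels, extra_basis=None, tol=1e-10):
    """Build one orthonormal basis per class from row-vector features.

    extra_basis (d x m, hypothetical) holds assumed deformation directions
    that are appended to every class subspace instead of augmenting data.
    """
    bases = {}
    for c in np.unique(labels):
        A = features[labels == c].T              # d x n_c, samples as columns
        if extra_basis is not None:
            A = np.hstack([A, extra_basis])      # enrich the class subspace
        U, s, _ = np.linalg.svd(A, full_matrices=False)
        rank = int(np.sum(s > tol * s[0]))       # drop numerically null directions
        bases[c.item()] = U[:, :rank]
    return bases

def predict(bases, x):
    """Return the class whose subspace leaves the smallest projection residual."""
    residuals = {c: np.linalg.norm(x - B @ (B.T @ x)) for c, B in bases.items()}
    return min(residuals, key=residuals.get)

# Toy data (assumption): class 0 lies along axis 0, class 1 along axis 1.
features = np.array([[1.0, 0.0, 0.0, 0.0],
                     [2.0, 0.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0, 0.0],
                     [0.0, 3.0, 0.0, 0.0]])
labels = np.array([0, 0, 1, 1])

# Assumed deformation direction: a constant offset added to every coordinate.
shift = np.ones((4, 1))

# A class-0 sample displaced along the assumed deformation direction.
x = np.array([1.0, 0.0, 0.0, 0.0]) - 2.0 * np.ones(4)

plain = fit_subspaces(features, labels)
enriched = fit_subspaces(features, labels, extra_basis=shift)
print(predict(plain, x), predict(enriched, x))
```

In this toy setting the plain classifier assigns the displaced sample to the wrong class, while the enriched one recovers the correct class without any extra training samples. The fit is a single SVD per class, which is consistent with the abstract's description of a non-iterative method.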