Classification of Chinese Handwritten Numbers with Labeled Projective Dictionary Pair Learning

Paper title

Classification of Chinese Handwritten Numbers with Labeled Projective Dictionary Pair Learning

Authors

Ameri, Rasool; Alameer, Ali; Ferdowsi, Saideh; Nazarpour, Kianoush; Abolghasemi, Vahid

Abstract

Dictionary learning is a cornerstone of image classification. We set out to address a longstanding challenge in using dictionary learning for classification: simultaneously maximising the discriminability and the sparse-representation power of the learned dictionaries. On this premise, we designed class-specific dictionaries that incorporate three factors: discriminability, sparsity and classification error. We integrated these metrics into a unified cost function and adopted a new feature space, the histogram of oriented gradients (HOG), to generate the dictionary atoms. The rationale for using HOG features to design the dictionaries is their strength in describing fine details of crowded images. Applied to the classification of Chinese handwritten numbers, the proposed method achieved higher classification performance $(\sim98\%)$ than state-of-the-art deep learning techniques (i.e., SqueezeNet, GoogLeNet and MobileNetV2), with only a fraction of the parameters. Furthermore, combining HOG features with dictionary learning improves accuracy by $11\%$ compared with using pixel-domain data alone. These results were corroborated when the proposed method was applied to Arabic and English handwritten number databases.
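
The general idea in the abstract (gradient-orientation features fed to class-specific dictionaries, with a sample assigned to the class that reconstructs it best) can be illustrated with a minimal sketch. This is not the paper's labeled projective dictionary pair learning: it assumes a simplified HOG-like descriptor without block normalisation, and it swaps in a truncated SVD per class as a stand-in for the learned dictionaries. The function names (`hog_like`, `fit_class_dictionaries`, `classify`, `make_image`) and the synthetic striped images are hypothetical, chosen only for illustration.

```python
import numpy as np

def hog_like(img, cell=4, bins=9):
    """Simplified HOG-style descriptor (assumption: no block normalisation).

    Builds one histogram of unsigned gradient orientations per cell,
    weighted by gradient magnitude, and L2-normalises the result.
    """
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)  # unsigned orientation
    idx = np.minimum((ang / np.pi * bins).astype(int), bins - 1)
    h, w = img.shape
    feats = []
    for r in range(0, h - cell + 1, cell):
        for c in range(0, w - cell + 1, cell):
            hist = np.bincount(
                idx[r:r + cell, c:c + cell].ravel(),
                weights=mag[r:r + cell, c:c + cell].ravel(),
                minlength=bins,
            )
            feats.append(hist)
    f = np.concatenate(feats)
    return f / (np.linalg.norm(f) + 1e-8)

def fit_class_dictionaries(X, y, n_atoms=3):
    """Stand-in for dictionary learning: one orthonormal basis per class.

    X has shape (n_samples, n_features). Each returned matrix holds
    n_atoms atoms as columns (top left singular vectors of the class data).
    """
    dicts = {}
    for label in np.unique(y):
        Xc = X[y == label].T  # features x samples
        U, _, _ = np.linalg.svd(Xc, full_matrices=False)
        dicts[int(label)] = U[:, :n_atoms]
    return dicts

def classify(x, dicts):
    """Assign x to the class whose dictionary reconstructs it best."""
    residuals = {k: np.linalg.norm(x - D @ (D.T @ x)) for k, D in dicts.items()}
    return min(residuals, key=residuals.get)

def make_image(label, rng, size=16):
    """Hypothetical toy data: vertical (0) or horizontal (1) stripes plus noise."""
    stripes = (np.arange(size) // 2) % 2
    if label == 0:
        base = np.tile(stripes, (size, 1))
    else:
        base = np.tile(stripes[:, None], (1, size))
    return base + 0.1 * rng.standard_normal((size, size))

rng = np.random.default_rng(0)
y_train = np.array([0, 1] * 20)
X_train = np.stack([hog_like(make_image(l, rng)) for l in y_train])
dicts = fit_class_dictionaries(X_train, y_train, n_atoms=3)

y_test = np.array([0, 1] * 10)
preds = np.array([classify(hog_like(make_image(l, rng)), dicts) for l in y_test])
accuracy = float(np.mean(preds == y_test))
print(f"toy accuracy: {accuracy:.2f}")
```

The toy classes differ only in edge orientation, which is the kind of structure an orientation histogram captures directly. The method described in the abstract additionally optimises discriminability, sparsity and classification error in one cost function, which this sketch does not attempt.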
