Approaches of large-scale images recognition with more than 50,000 categories

Paper title

Approaches of large-scale images recognition with more than 50,000 categories

Authors

Huang, Wanhong; Geng, Rui

Abstract

Although current computer vision (CV) models achieve high accuracy on small-scale image classification datasets with hundreds or thousands of categories, many of them become infeasible in computation or memory consumption on large-scale datasets with more than 50,000 categories. In this paper, we provide a viable solution for classifying large-scale species datasets using traditional CV techniques, such as feature extraction and processing and BOVW (Bag of Visual Words), together with statistical learning techniques such as Mini-Batch K-Means and SVM. These are then combined with a neural network model. When applying these techniques, we optimize time and memory consumption so that the approach is feasible for large-scale datasets. We also use several techniques to reduce the impact of mislabeled data. We use a dataset with more than 50,000 categories, and all operations are performed on an ordinary computer with 16 GB of RAM and a 3.0 GHz CPU. Our contributions are: 1) we analyze the problems that may arise during training and present several feasible ways to solve them; 2) we combine traditional CV models with neural network models to provide feasible schemes for training on large-scale classification datasets within the constraints of time and space resources.
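The abstract names BOVW and Mini-Batch K-Means but gives no implementation details, so the following is only a minimal sketch of how the two are commonly paired: mini-batch k-means learns a visual vocabulary from local descriptors while looking at a small sample per step, and each image is then encoded as a normalized histogram of its nearest visual words. All function names, parameters, and the representation of descriptors as plain lists of floats are assumptions made for illustration; they are not taken from the paper. The SVM and neural network stages are omitted, and in practice a library implementation would normally replace this pure-Python version.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centers, v):
    """Index of the center closest to descriptor v."""
    return min(range(len(centers)), key=lambda k: sq_dist(centers[k], v))


def mini_batch_kmeans(descriptors, k, batch_size=32, n_iter=50, seed=0):
    """Learn k visual words, touching only one small batch per iteration.

    Hypothetical helper: memory use per step is bounded by batch_size,
    which is the property that matters on a memory-limited machine.
    """
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(descriptors, k)]
    counts = [0] * k  # how many descriptors each center has absorbed
    for _ in range(n_iter):
        batch = rng.sample(descriptors, min(batch_size, len(descriptors)))
        # Assign the whole batch first, then update the centers.
        assigned = [nearest(centers, v) for v in batch]
        for v, j in zip(batch, assigned):
            counts[j] += 1
            lr = 1.0 / counts[j]  # per-center learning rate decays over time
            centers[j] = [(1 - lr) * c + lr * x for c, x in zip(centers[j], v)]
    return centers


def bovw_histogram(image_descriptors, centers):
    """Encode one image as a normalized histogram over visual words."""
    hist = [0.0] * len(centers)
    for v in image_descriptors:
        hist[nearest(centers, v)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


if __name__ == "__main__":
    # Synthetic 2-D "descriptors" standing in for real local features.
    rng = random.Random(1)
    pool = [[rng.gauss(0, 0.1), rng.gauss(0, 0.1)] for _ in range(100)]
    pool += [[rng.gauss(5, 0.1), rng.gauss(5, 0.1)] for _ in range(100)]
    vocab = mini_batch_kmeans(pool, k=2)
    print(bovw_histogram(pool[:10], vocab))
```

The resulting fixed-length histograms are the kind of feature vector that a linear classifier such as an SVM could then be trained on, which is presumably where the SVM mentioned in the abstract enters.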
