Patching Weak Convolutional Neural Network Models through Modularization and Composition

Paper Title

Patching Weak Convolutional Neural Network Models through Modularization and Composition

Authors

Binhang Qi, Hailong Sun, Xiang Gao, Hongyu Zhang

Abstract

Despite great success in many applications, deep neural networks are not always robust in practice. For instance, a convolutional neural network (CNN) model for classification tasks often performs unsatisfactorily on some particular classes of objects. In this work, we are concerned with patching the weak part of a CNN model instead of improving it through costly retraining of the entire model. Inspired by the fundamental concepts of modularization and composition in software engineering, we propose a compressed modularization approach, CNNSplitter, which decomposes a strong CNN model for $N$-class classification into $N$ smaller CNN modules. Each module is a sub-model containing a part of the convolution kernels of the strong model. To patch a weak CNN model that performs unsatisfactorily on a target class (TC), we compose the weak CNN model with the corresponding module obtained from a strong CNN model. The ability of the weak CNN model to recognize the TC can thus be improved through patching. Moreover, the ability to recognize non-TCs is also improved, because samples previously misclassified as the TC can be correctly classified as non-TCs. Experimental results with two representative CNNs on three widely used datasets show that the average improvements on the TC in precision and recall are 12.54% and 2.14%, respectively. Moreover, patching improves the accuracy on non-TCs by 1.18%. The results demonstrate that CNNSplitter can patch a weak CNN model through modularization and composition, thus providing a new solution for developing robust CNN models.
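The composition step and its effect on TC precision and recall can be illustrated with a toy sketch. The abstract does not say how the weak model's outputs and the module's output are combined, so the rule below (overwrite the weak model's TC score with the module's TC score, then take the argmax) is an assumption made purely for illustration. The function names `patch_predict` and `precision_recall` and all the scores are hypothetical and are not taken from CNNSplitter.

```python
from typing import List, Sequence, Tuple


def patch_predict(weak_scores: Sequence[float], module_tc_score: float, tc: int) -> int:
    """Predict a class after swapping in the module's score for the target class.

    Assumption: the module's TC score simply replaces the weak model's TC score.
    The patched scores are not renormalized; only their ranking is used.
    """
    patched = list(weak_scores)
    patched[tc] = module_tc_score
    return max(range(len(patched)), key=patched.__getitem__)


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int], tc: int) -> Tuple[float, float]:
    """Precision and recall for one class, treating it as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == tc and p == tc)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != tc and p == tc)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == tc and p != tc)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


TC = 0  # hypothetical target class in a 3-class problem
y_true = [0, 1, 0, 2]

# Made-up class scores from a weak model, one row per sample.
weak_scores = [
    [0.6, 0.3, 0.1],  # correct TC prediction
    [0.5, 0.4, 0.1],  # class-1 sample wrongly predicted as the TC
    [0.3, 0.5, 0.2],  # TC sample missed by the weak model
    [0.2, 0.2, 0.6],  # correct non-TC prediction
]
# Made-up TC scores from the module taken out of a strong model.
module_tc_scores = [0.9, 0.1, 0.8, 0.05]

weak_preds: List[int] = [max(range(3), key=row.__getitem__) for row in weak_scores]
patched_preds: List[int] = [
    patch_predict(row, m, TC) for row, m in zip(weak_scores, module_tc_scores)
]

print("weak   :", weak_preds, precision_recall(y_true, weak_preds, TC))
print("patched:", patched_preds, precision_recall(y_true, patched_preds, TC))
```

In this toy data the second sample shows the non-TC effect described in the abstract: once the module assigns it a low TC score, it is no longer predicted as the TC and is classified correctly as class 1, which also raises TC precision.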
