Are there any 'object detectors' in the hidden layers of CNNs trained to identify objects or scenes?

Paper title

Are there any 'object detectors' in the hidden layers of CNNs trained to identify objects or scenes?

Authors

Gale, Ella M.; Martin, Nicholas; Blything, Ryan; Nguyen, Anh; Bowers, Jeffrey S.

Abstract

Various methods of measuring unit selectivity have been developed with the aim of better understanding how neural networks work. But the different measures provide divergent estimates of selectivity, and this has led to different conclusions regarding the conditions in which selective object representations are learned and the functional relevance of these representations. In an attempt to better characterize object selectivity, we undertake a comparison of various selectivity measures on a large set of units in AlexNet, including localist selectivity, precision, class-conditional mean activity selectivity (CCMAS), network dissection, the human interpretation of activation maximization (AM) images, and standard signal-detection measures. We find that the different measures provide different estimates of object selectivity, with precision and CCMAS measures providing misleadingly high estimates. Indeed, the most selective units had a poor hit-rate or a high false-alarm rate (or both) in object classification, making them poor object detectors. We fail to find any units that are even remotely as selective as the 'grandmother cell' units reported in recurrent neural networks. In order to generalize these results, we compared selectivity measures on units in VGG-16 and GoogLeNet trained on the ImageNet or Places-365 datasets that have been described as 'object detectors'. Again, we find poor hit-rates and high false-alarm rates for object classification. We conclude that signal-detection measures provide a better assessment of single-unit selectivity compared to common alternative approaches, and that deep convolutional networks of image classification do not learn object detectors in their hidden layers.
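
To make the contrast between the measures concrete, the following is a minimal sketch on invented toy data, not the authors' code. The abstract does not define the measures, so the definitions used here are assumptions: CCMAS is taken as the normalized difference between the unit's mean activation on the target class and on all other images, precision as the share of target-class images among the top_n most strongly activating images, and hit rate and false-alarm rate as the fraction of target and non-target images whose activation exceeds a chosen threshold. The function name selectivity_summary, the labels, the threshold, and all activation values are hypothetical.

```python
from statistics import mean


def selectivity_summary(acts, labels, target, threshold, top_n):
    """Illustrative selectivity measures for one unit and one target class.

    All definitions below are assumptions made for this sketch.
    """
    in_class = [a for a, y in zip(acts, labels) if y == target]
    out_class = [a for a, y in zip(acts, labels) if y != target]

    # CCMAS (assumed form): contrast between the two class-conditional means.
    mu_in, mu_out = mean(in_class), mean(out_class)
    ccmas = (mu_in - mu_out) / (mu_in + mu_out)

    # Precision (assumed form): share of target images among the top_n
    # most strongly activating images.
    ranked = sorted(zip(acts, labels), key=lambda pair: pair[0], reverse=True)
    precision = mean(1.0 if y == target else 0.0 for _, y in ranked[:top_n])

    # Signal-detection view: the unit "fires" when activation > threshold.
    hit_rate = mean(1.0 if a > threshold else 0.0 for a in in_class)
    false_alarm_rate = mean(1.0 if a > threshold else 0.0 for a in out_class)

    return {
        "ccmas": ccmas,
        "precision": precision,
        "hit_rate": hit_rate,
        "false_alarm_rate": false_alarm_rate,
    }


# Hypothetical unit: responds strongly to only 10 of 100 "dog" images
# and weakly to everything else (900 non-dog images).
acts = [10.0] * 10 + [0.1] * 90 + [0.1] * 900
labels = ["dog"] * 100 + ["other"] * 900

result = selectivity_summary(acts, labels, target="dog", threshold=1.0, top_n=10)
print(result)
```

On this toy unit, precision is 1.0 and CCMAS is above 0.8, yet the hit rate is only 0.1: the unit misses 90% of the target images. Under these assumed definitions, this is the kind of gap between precision or CCMAS and the signal-detection measures that the abstract describes.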
