Paper Title
Measuring Human Perception to Improve Open Set Recognition
Authors
Abstract
The human ability to recognize when an object belongs or does not belong to a particular vision task outperforms all open set recognition algorithms. Human perception, as measured by the methods and procedures of visual psychophysics from psychology, provides an additional data stream for algorithms that need to manage novelty. For instance, measured reaction time from human subjects can offer insight into whether a class sample is prone to be confused with a different class, whether known or novel. In this work, we designed and performed a large-scale behavioral experiment that collected over 200,000 human reaction time measurements associated with object recognition. The collected data indicate that reaction time varies meaningfully across objects at the sample level. We therefore designed a new psychophysical loss function that enforces consistency with human behavior in deep networks that exhibit variable reaction time for different images. As in biological vision, this approach allows us to achieve good open set recognition performance in regimes with limited labeled training data. In experiments using data from ImageNet, training Multi-Scale DenseNets with this new formulation significantly improved top-1 validation accuracy by 6.02%, top-1 test accuracy on known samples by 9.81%, and top-1 test accuracy on unknown samples by 33.18%. We compared our method to 10 open set recognition methods from the literature, all of which were outperformed on multiple metrics.
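The abstract does not give the form of the psychophysical loss, so the following is only a minimal sketch of one way per-sample human reaction times could modulate a classification loss. It is not the authors' formulation. The names `rt_weight`, `rt_weighted_cross_entropy`, `max_rt`, and `scale` are hypothetical, and the weighting direction (a larger penalty for samples that humans answered quickly) is an assumption chosen purely for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rt_weight(rt, max_rt, scale=1.0):
    """Hypothetical per-sample weight derived from human reaction time.

    Assumption: a fast human response marks an easy sample, so an error on
    it is penalized more. The weight ranges from 1 (rt == max_rt) to
    1 + scale (rt == 0).
    """
    rt = min(max(rt, 0.0), max_rt)  # clamp to [0, max_rt]
    return 1.0 + scale * (max_rt - rt) / max_rt


def rt_weighted_cross_entropy(batch_logits, labels, reaction_times,
                              max_rt, scale=1.0):
    """Mean cross-entropy with each sample scaled by its reaction-time weight."""
    total = 0.0
    for logits, label, rt in zip(batch_logits, labels, reaction_times):
        p_true = softmax(logits)[label]
        total += rt_weight(rt, max_rt, scale) * -math.log(p_true)
    return total / len(labels)
```

With `scale=0.0` the function reduces to ordinary mean cross-entropy, which gives a convenient baseline when comparing against the reaction-time-weighted variant.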