Target-absent Human Attention

Paper title

Target-absent Human Attention

Authors

Zhibo Yang, Sounak Mondal, Seoyoung Ahn, Gregory Zelinsky, Minh Hoai, Dimitris Samaras

Abstract

The prediction of human gaze behavior is important for building human-computer interactive systems that can anticipate a user's attention. Computer vision models have been developed to predict the fixations made by people as they search for target objects. But what about when the image has no target? Equally important is to know how people search when they cannot find a target, and when they would stop searching. In this paper, we propose the first data-driven computational model that addresses the search-termination problem and predicts the scanpath of search fixations made by people searching for targets that do not appear in images. We model visual search as an imitation learning problem and represent the internal knowledge that the viewer acquires through fixations using a novel state representation that we call Foveated Feature Maps (FFMs). FFMs integrate a simulated foveated retina into a pretrained ConvNet that produces an in-network feature pyramid, all with minimal computational overhead. Our method integrates FFMs as the state representation in inverse reinforcement learning. Experimentally, we improve the state of the art in predicting human target-absent search behavior on the COCO-Search18 dataset.
