Paper title
Global Instance Tracking: Locating Target More Like Humans
Paper authors
Paper abstract
Target tracking, an essential ability of the human visual system, has been simulated by computer vision tasks. However, existing trackers perform well in austere experimental environments but fail under challenges such as occlusion and fast motion. This large gap indicates that current research measures only tracking performance rather than intelligence. How can the intelligence level of trackers be judged scientifically? Unlike decision-making problems, tracking lacks three requirements (a challenging task, a fair environment, and a scientific evaluation procedure), which makes the question difficult to answer. In this article, we first propose the global instance tracking (GIT) task, which searches for an arbitrary user-specified instance in a video without any assumptions about camera or motion consistency, to model the human visual tracking ability. We then construct VideoCube, a high-quality, large-scale benchmark, to create a challenging environment. Finally, we design a scientific evaluation procedure that uses human capabilities as the baseline for judging tracking intelligence. Additionally, we provide an online platform with a toolkit and an updated leaderboard. Although the experimental results indicate a definite gap between trackers and humans, we expect to take a step toward authentic human-like trackers. The database, toolkit, evaluation server, and baseline results are available at http://videocube.aitestunion.com.