WildQA: In-the-Wild Video Question Answering

Paper Title

WildQA: In-the-Wild Video Question Answering

Authors

Santiago Castro, Naihao Deng, Pingxuan Huang, Mihai Burzo, Rada Mihalcea

Abstract

Existing video understanding datasets mostly focus on human interactions, with little attention being paid to the "in the wild" settings, where the videos are recorded outdoors. We propose WILDQA, a video understanding dataset of videos recorded in outside settings. In addition to video question answering (Video QA), we also introduce the new task of identifying visual support for a given question and answer (Video Evidence Selection). Through evaluations using a wide range of baseline models, we show that WILDQA poses new challenges to the vision and language research communities. The dataset is available at https://lit.eecs.umich.edu/wildqa/.
