Exploiting Context Information for Generic Event Boundary Captioning

Paper title

Exploiting Context Information for Generic Event Boundary Captioning

Authors

Jinrui Zhang, Teng Wang, Feng Zheng, Ran Cheng, Ping Luo

Abstract

Generic Event Boundary Captioning (GEBC) aims to generate three sentences describing the status change at a given time boundary. Previous methods process only one boundary at a time and therefore make no use of the surrounding video context. To address this, we design a model that takes the whole video as input and generates captions for all boundaries in parallel. By modeling boundary-boundary interactions, the model can learn the context information for each time boundary. Experiments demonstrate the effectiveness of context information. The proposed method achieved a score of 72.84 on the test set, placing 2nd in this challenge. Our code is available at: https://github.com/zjr2000/Context-GEBC
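The abstract does not say how boundary-boundary interactions are modeled. As a rough illustration only, one common way to let every boundary see every other boundary is scaled dot-product self-attention over per-boundary feature vectors. The sketch below assumes that mechanism; the function names, the toy features, and the absence of learned projections are all illustrative assumptions and are not taken from the authors' code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_over_boundaries(features):
    """Mix each boundary's feature vector with those of all boundaries.

    features: list of equal-length float vectors, one per boundary
              (hypothetical inputs; a real model would use learned
              query/key/value projections, omitted here).
    Returns (contextual_features, attention_weights).
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    weights, contextual = [], []
    for query in features:
        # Similarity of this boundary to every boundary in the video.
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in features]
        row = softmax(scores)
        weights.append(row)
        # Weighted average of all boundary features.
        contextual.append([
            sum(w * value[d] for w, value in zip(row, features))
            for d in range(dim)
        ])
    return contextual, weights


# Toy example: three boundaries with 2-dimensional features.
boundary_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contextual, weights = attend_over_boundaries(boundary_features)
```

All boundaries are processed in one pass, and each output row is a weighted average of every boundary's features, which is the sense in which each boundary gains context from the rest of the video in this sketch.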
