The Anatomy of Video Editing: A Dataset and Benchmark Suite for AI-Assisted Video Editing

Paper Title

The Anatomy of Video Editing: A Dataset and Benchmark Suite for AI-Assisted Video Editing

Authors

Dawit Mureja Argaw, Fabian Caba Heilbron, Joon-Young Lee, Markus Woodson, In So Kweon

Abstract

Machine learning is transforming the video editing industry. Recent advances in computer vision have leveled up video editing tasks such as intelligent reframing, rotoscoping, color grading, and applying digital makeup. However, most solutions have focused on video manipulation and VFX. This work introduces the Anatomy of Video Editing, a dataset and benchmark to foster research in AI-assisted video editing. Our benchmark suite focuses on video editing tasks beyond visual effects, such as automatic footage organization and assisted video assembling. To enable research on these fronts, we annotate more than 1.5M tags, covering concepts relevant to cinematography, on 196,176 shots sampled from movie scenes. We establish competitive baseline methods and detailed analyses for each of the tasks. We hope our work sparks innovative research in underexplored areas of AI-assisted video editing.
