Paper Title

Leveraging Audio Gestalt to Predict Media Memorability

Authors

Lorin Sweeney, Graham Healy, Alan F. Smeaton

Abstract

Memorability determines what evanesces into emptiness, and what worms its way into the deepest furrows of our minds. It is the key to curating more meaningful media content as we wade through daily digital torrents. The Predicting Media Memorability task in MediaEval 2020 aims to address the question of media memorability by setting the task of automatically predicting video memorability. Our approach is a multimodal deep learning-based late fusion that combines visual, semantic, and auditory features. We used audio gestalt to estimate the influence of the audio modality on overall video memorability, and accordingly inform which combination of features would best predict a given video's memorability scores.
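
To make the idea of late fusion conditioned on audio gestalt concrete, here is a minimal sketch. It is not the authors' implementation: the function names, the 0.5 threshold, the equal weights, and the per-modality scores are all hypothetical, chosen only for illustration. It assumes each modality already has its own model producing a memorability score in [0, 1], and that an audio gestalt score decides whether the audio prediction takes part in the fusion.

```python
from typing import Mapping


def late_fusion(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted average of per-modality memorability scores."""
    total = sum(weights[m] for m in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[m] * weights[m] for m in scores) / total


def predict_memorability(
    scores: Mapping[str, float],
    audio_gestalt: float,
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> float:
    """Fuse modality scores, including audio only when audio gestalt is high."""
    # Assumption: equal weights; audio is dropped when its gestalt score is low.
    audio_weight = 1.0 if audio_gestalt >= threshold else 0.0
    weights = {"visual": 1.0, "semantic": 1.0, "audio": audio_weight}
    return late_fusion(scores, weights)


# Made-up per-modality predictions for a single video.
example_scores = {"visual": 0.8, "semantic": 0.6, "audio": 0.4}
with_audio = predict_memorability(example_scores, audio_gestalt=0.9)
without_audio = predict_memorability(example_scores, audio_gestalt=0.1)
```

With a high audio gestalt score all three predictions are averaged; with a low one the result falls back to the visual and semantic predictions alone. A real system would more likely learn the weights and the threshold from validation data.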
