Rethinking movie genre classification with fine-grained semantic clustering

Paper Title

Rethinking movie genre classification with fine-grained semantic clustering

Authors

Edward Fish, Jon Weinbren, Andrew Gilbert

Abstract

Movie genre classification is an active research area in machine learning. However, due to the limited labels available, there can be large semantic variations between movies within a single genre definition. We expand these 'coarse' genre labels by identifying 'fine-grained' semantic information within the multi-modal content of movies. By leveraging pre-trained 'expert' networks, we learn the influence of different combinations of modes for multi-label genre classification. Using a contrastive loss, we continue to fine-tune this 'coarse' genre classification network to identify high-level intertextual similarities between the movies across all genre labels. This leads to a more 'fine-grained' and detailed clustering, based on semantic similarities while still retaining some genre information. Our approach is demonstrated on a newly introduced multi-modal 37,866,450 frame, 8,800 movie trailer dataset, MMX-Trailer-20, which includes pre-computed audio, location, motion, and image embeddings.
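The abstract mentions fine-tuning with a contrastive loss but does not give its exact form. As a rough illustration only, the sketch below uses the classic margin-based pairwise formulation, which is an assumption and may differ from what the authors implemented. The function names, the `margin` parameter, and the toy embeddings are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, similar, margin=1.0):
    """Margin-based pairwise contrastive loss (assumed form, not the paper's).

    Similar pairs are penalised by their squared distance, which pulls them
    together. Dissimilar pairs are penalised only when they sit closer than
    `margin`, which pushes them apart.
    """
    d = euclidean(emb_a, emb_b)
    if similar:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


# Toy 2-D trailer embeddings (hypothetical values).
trailer_a = [0.0, 0.0]
trailer_b = [3.0, 4.0]
loss_if_similar = contrastive_loss(trailer_a, trailer_b, similar=True)
loss_if_dissimilar = contrastive_loss(trailer_a, trailer_b, similar=False)
```

Under this formulation, a pair that is already farther apart than the margin contributes no loss when labelled dissimilar, so training effort concentrates on pairs that are still confusable.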
