Paper Title
Multimodal and Explainable Internet Meme Classification
Paper Authors
Paper Abstract
In the current context where online platforms have been effectively weaponized in a variety of geo-political events and social issues, Internet memes make fair content moderation at scale even more difficult. Existing work on meme classification and tracking has focused on black-box methods that do not explicitly consider the semantics of the memes or the context of their creation. In this paper, we pursue a modular and explainable architecture for Internet meme understanding. We design and implement multimodal classification methods that perform example- and prototype-based reasoning over training cases, while leveraging both textual and visual SOTA models to represent the individual cases. We study the relevance of our modular and explainable models in detecting harmful memes on two existing tasks: Hate Speech Detection and Misogyny Classification. We compare the performance between example- and prototype-based methods, and between text, vision, and multimodal models, across different categories of harmfulness (e.g., stereotype and objectification). We devise a user-friendly interface that facilitates the comparative analysis of examples retrieved by all of our models for any given meme, informing the community about the strengths and limitations of these explainable methods.
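The example-based reasoning described in the abstract can be pictured as nearest-neighbour retrieval over a shared embedding space: a new meme is classified by a vote among its most similar training cases, and the retrieved cases themselves serve as the explanation. The following is a minimal sketch under that reading, assuming precomputed multimodal embeddings (e.g., from a CLIP-style encoder); the function name, the value of k, and the majority-vote scheme are illustrative assumptions, not the authors' exact method.

```python
# Minimal sketch of example-based (k-nearest-neighbour) meme classification.
# Assumes precomputed multimodal embeddings, e.g., from a CLIP-style encoder;
# names, k, and the voting rule are illustrative, not the paper's settings.
import numpy as np

def knn_classify(query_emb, train_embs, train_labels, k=5):
    """Predict a label for one meme and return the indices of the k
    retrieved training cases, which double as the explanation."""
    # Cosine similarity between the query and every training case.
    q = query_emb / np.linalg.norm(query_emb)
    t = train_embs / np.linalg.norm(train_embs, axis=1, keepdims=True)
    sims = t @ q
    top_k = np.argsort(-sims)[:k]             # k most similar training memes
    votes = train_labels[top_k]
    prediction = np.bincount(votes).argmax()  # majority vote over neighbours
    return prediction, top_k

# Usage: 512-d embeddings for 1000 training memes with binary labels.
rng = np.random.default_rng(0)
train_embs = rng.normal(size=(1000, 512))
train_labels = rng.integers(0, 2, size=1000)
query_emb = rng.normal(size=512)
label, evidence = knn_classify(query_emb, train_embs, train_labels)
print(label, evidence)  # predicted class and retrieved example indices
```

A prototype-based variant would compare the query against a small set of learned class prototypes instead of all training cases; the interpretability in both settings comes from being able to show the user the specific cases or prototypes that drove the prediction.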