Multilingual Multimodality: A Taxonomical Survey of Datasets, Techniques, Challenges and Opportunities

Paper Title

Multilingual Multimodality: A Taxonomical Survey of Datasets, Techniques, Challenges and Opportunities

Authors

Khyathi Raghavi Chandu, Alborz Geramifard

Abstract

Contextualizing language technologies beyond a single language has kindled the embrace of multiple modalities and languages. Individually, each of these directions has undoubtedly proliferated into several NLP tasks. Despite this momentum, most multimodal research is centered primarily around English, and most multilingual research is centered primarily around contexts from the text modality. Challenging this conventional setup, researchers have studied the unification of multilingual and multimodal (MultiX) streams. The main goal of this work is to catalogue and characterize these works by charting out the categories of tasks, datasets and methods that address MultiX scenarios. To this end, we review the languages studied and the gold or silver data with parallel annotations, and we examine how these modalities and languages interact in modeling. We present an account of the modeling approaches along with their strengths and weaknesses to better understand in which scenarios they can be used reliably. Following this, we present the high-level trends in the overall paradigm of the field. Finally, we conclude by presenting a road map of challenges and promising research directions.
