Title
EffMulti: Efficiently Modeling Complex Multimodal Interactions for Emotion Analysis
Authors
Abstract
Humans are skilled at reading an interlocutor's emotion from multimodal signals, including spoken words, simultaneous speech, and facial expressions. Effectively decoding emotions from the complex interactions of these multimodal signals remains a challenge. In this paper, we design three kinds of multimodal latent representations to refine the emotion analysis process and capture complex multimodal interactions from different views: an intact three-modal integrating representation, a modality-shared representation, and three modality-individual representations. We then propose a modality-semantic hierarchical fusion to incorporate these representations into a comprehensive interaction representation. Experimental results demonstrate that our EffMulti outperforms state-of-the-art methods. Its compelling performance stems from a well-designed framework that offers ease of implementation, lower computational complexity, and fewer trainable parameters.
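To make the described design concrete, below is a minimal PyTorch sketch of the three latent representations and the hierarchical fusion named in the abstract. It is an illustration under assumptions, not the paper's implementation: the linear encoders, the weight-tied shared projection, the concatenation-based fusion, and all dimensions (dim_text, dim_audio, dim_video, d, num_classes) are hypothetical choices.

```python
import torch
import torch.nn as nn

class EffMultiSketch(nn.Module):
    """Hypothetical sketch of the EffMulti representation design from the
    abstract. Module choices and dimensions are assumptions; the paper's
    actual architecture may differ."""

    def __init__(self, dim_text=768, dim_audio=74, dim_video=35,
                 d=128, num_classes=7):
        super().__init__()
        # Per-modality encoders producing the three modality-individual
        # representations (text, audio, video).
        self.enc_t = nn.Linear(dim_text, d)
        self.enc_a = nn.Linear(dim_audio, d)
        self.enc_v = nn.Linear(dim_video, d)
        # A weight-tied projection applied to every modality, averaged to
        # form the modality-shared representation (an assumed mechanism).
        self.shared = nn.Linear(d, d)
        # Encoder over the concatenated modalities for the intact
        # three-modal integrating representation.
        self.integrate = nn.Linear(3 * d, d)
        # Two-level "semantic hierarchical" fusion: first merge the
        # modality-individual representations, then fuse the result with
        # the shared and integrating representations.
        self.fuse_individual = nn.Linear(3 * d, d)
        self.fuse_all = nn.Linear(3 * d, d)
        self.classifier = nn.Linear(d, num_classes)

    def forward(self, x_t, x_a, x_v):
        # Modality-individual representations.
        h_t, h_a, h_v = self.enc_t(x_t), self.enc_a(x_a), self.enc_v(x_v)
        # Modality-shared representation.
        h_shared = (self.shared(h_t) + self.shared(h_a) + self.shared(h_v)) / 3
        # Intact three-modal integrating representation.
        h_int = self.integrate(torch.cat([h_t, h_a, h_v], dim=-1))
        # Hierarchical fusion into one comprehensive interaction
        # representation, then emotion classification.
        h_ind = self.fuse_individual(torch.cat([h_t, h_a, h_v], dim=-1))
        h = self.fuse_all(torch.cat([h_ind, h_shared, h_int], dim=-1))
        return self.classifier(h)
```

Keeping every component a single linear layer mirrors the abstract's claims of low computational complexity and few trainable parameters, though the real model likely uses richer encoders.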