Paper Title

Improving Health Mentioning Classification of Tweets using Contrastive Adversarial Training

Paper Authors

Pervaiz Iqbal Khan, Shoaib Ahmed Siddiqui, Imran Razzak, Andreas Dengel, Sheraz Ahmed

Paper Abstract

Health mentioning classification (HMC) classifies an input text as a health mention or not. Figurative and non-health mentions of disease words make the classification task challenging. Learning the context of the input text is key to this problem. The idea is to learn a word's representation from its surrounding words and to utilize emojis in the text to help improve the classification results. In this paper, we improve the word representations of the input text using adversarial training, which acts as a regularizer during fine-tuning of the model. We generate adversarial examples by perturbing the embeddings of the model and then train the model on pairs of clean and adversarial examples. Additionally, we utilize a contrastive loss that pushes each pair of clean and perturbed examples close together and other examples apart in the representation space. We train and evaluate the method on an extended version of the publicly available PHM2017 dataset. Experiments show an improvement of 1.0% over the BERT-Large baseline, 0.6% over the RoBERTa-Large baseline, and 5.8% over the state of the art in terms of F1 score. Furthermore, we provide a brief analysis of the results using explainable AI.
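The abstract compresses a concrete training recipe: perturb the model's input embeddings to build adversarial examples, fine-tune on clean/adversarial pairs, and add a contrastive term that pulls each pair together in representation space. Below is a minimal PyTorch sketch of that recipe, assuming a HuggingFace-style sequence-classification model; the helper names, the FGM-style perturbation, the use of the final-layer [CLS] vector as the sentence representation, and hyperparameters such as `epsilon` and `temperature` are illustrative assumptions, not the authors' released implementation.

```python
# Sketch of the training loop described in the abstract:
# (1) perturb input embeddings along the task-loss gradient (FGM-style),
# (2) train on the clean/adversarial pair, (3) add a contrastive term that
# pulls each clean example toward its perturbed view and away from the rest
# of the batch. Epsilon and temperature are illustrative, not the paper's settings.
import torch
import torch.nn.functional as F

def adversarial_embeddings(model, input_ids, attention_mask, labels, epsilon=1e-2):
    """Return input embeddings perturbed in the direction that increases the loss."""
    embeds = model.get_input_embeddings()(input_ids).detach()
    embeds.requires_grad_(True)
    loss = model(inputs_embeds=embeds, attention_mask=attention_mask,
                 labels=labels).loss
    (grad,) = torch.autograd.grad(loss, embeds)
    # Small normalized step along the gradient: the adversarial perturbation.
    step = epsilon * grad / (grad.norm(dim=-1, keepdim=True) + 1e-12)
    return (embeds + step).detach()

def contrastive_loss(z_clean, z_adv, temperature=0.1):
    """NT-Xent-style loss: each example's positive is its own perturbed view;
    every other example in the batch serves as a negative."""
    z = F.normalize(torch.cat([z_clean, z_adv], dim=0), dim=-1)
    sim = z @ z.t() / temperature
    sim.fill_diagonal_(float("-inf"))          # exclude self-similarity
    n = z_clean.size(0)
    targets = torch.cat([torch.arange(n) + n,  # clean i -> its adversarial view
                         torch.arange(n)]).to(z.device)
    return F.cross_entropy(sim, targets)

def training_step(model, batch, optimizer):
    clean = model(input_ids=batch["input_ids"],
                  attention_mask=batch["attention_mask"],
                  labels=batch["labels"], output_hidden_states=True)
    adv_embeds = adversarial_embeddings(model, batch["input_ids"],
                                        batch["attention_mask"], batch["labels"])
    adv = model(inputs_embeds=adv_embeds,
                attention_mask=batch["attention_mask"],
                labels=batch["labels"], output_hidden_states=True)
    # Final-layer [CLS] vector as the sentence representation (an assumption).
    z_clean = clean.hidden_states[-1][:, 0]
    z_adv = adv.hidden_states[-1][:, 0]
    loss = clean.loss + adv.loss + contrastive_loss(z_clean, z_adv)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Summing the clean and adversarial cross-entropy losses makes the perturbed pass act as the regularizer the abstract describes, while the NT-Xent-style term implements the "pull the pair together, push other examples away" objective.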
