Increasing Adverse Drug Events extraction robustness on social media: case study on negation and speculation

Paper title

Increasing Adverse Drug Events extraction robustness on social media: case study on negation and speculation

Authors

Simone Scaboro, Beatrice Portelli, Emmanuele Chersoni, Enrico Santus, Giuseppe Serra

Abstract

In the last decade, an increasing number of users have started reporting Adverse Drug Events (ADE) on social media platforms, blogs, and health forums. Given the large volume of reports, pharmacovigilance has focused on ways to use Natural Language Processing (NLP) techniques to rapidly examine these large collections of text, detecting mentions of drug-related adverse reactions to trigger medical investigations. However, despite the growing interest in the task and the advances in NLP, the robustness of these models in the face of linguistic phenomena such as negation and speculation is an open research question. Negation and speculation are pervasive phenomena in natural language, and can severely hamper the ability of an automated system to discriminate between factual and nonfactual statements in text. In this paper, we consider four state-of-the-art systems for ADE detection on social media texts. We introduce SNAX, a benchmark to test their performance against samples containing negated and speculated ADEs, showing their fragility against these phenomena. We then introduce two possible strategies to increase the robustness of these models, showing that both of them bring significant increases in performance, lowering the number of spurious entities predicted by the models by 60% for negation and 80% for speculation.
