Paper Title

CLaCLab at SocialDisNER: Using Medical Gazetteers for Named-Entity Recognition of Disease Mentions in Spanish Tweets

Authors

Harsh Verma, Parsa Bagherzadeh, Sabine Bergler

Abstract

This paper summarizes the CLaC submission for SMM4H 2022 Task 10, which concerns the recognition of diseases mentioned in Spanish tweets. Before classification, each token is encoded with a transformer encoder using features from Multilingual RoBERTa Large, a UMLS gazetteer, and a DISTEMIST gazetteer, among others. We obtain a strict F1 score of 0.869, against a competition mean of 0.675 (standard deviation 0.245) and a median of 0.761.
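
The abstract does not say how the gazetteer features are computed. As a rough illustration only, the sketch below tags each token with a B/I/O flag by greedy longest-match lookup against a term list. The function names (`build_gazetteer`, `gazetteer_features`), the whitespace tokenization, and the toy Spanish terms are all assumptions made for this example; they are not taken from the CLaC system, UMLS, or DISTEMIST.

```python
from typing import Iterable, List, Set, Tuple


def build_gazetteer(terms: Iterable[str]) -> Set[Tuple[str, ...]]:
    """Lower-case each term and split it on whitespace into a token tuple."""
    return {tuple(term.lower().split()) for term in terms if term.strip()}


def gazetteer_features(tokens: List[str], gazetteer: Set[Tuple[str, ...]]) -> List[str]:
    """Return one B/I/O flag per token using greedy longest-match lookup."""
    max_len = max((len(entry) for entry in gazetteer), default=0)
    lowered = [t.lower() for t in tokens]
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest candidate span starting at position i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if tuple(lowered[i:i + n]) in gazetteer:
                matched = n
                break
        if matched:
            tags[i] = "B"
            for j in range(i + 1, i + matched):
                tags[j] = "I"
            i += matched
        else:
            i += 1
    return tags


# Hypothetical toy term list; a real system would load UMLS or DISTEMIST terms.
toy_gazetteer = build_gazetteer(["diabetes", "cáncer de mama"])
tokens = "Mi madre tiene cáncer de mama y diabetes".split()
flags = gazetteer_features(tokens, toy_gazetteer)
print(list(zip(tokens, flags)))
```

In a setup like the one the abstract describes, flags of this kind would typically be embedded and combined with the contextual token representations before the classification layer; how the authors actually combine them is not stated here.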
