Paper Title

CrisisBERT: a Robust Transformer for Crisis Classification and Contextual Crisis Embedding

Paper Authors

Liu, Junhua; Singhal, Trisha; Blessing, Lucienne T. M.; Wood, Kristin L.; Lim, Kwan Hui

Abstract

Classification of crisis events, such as natural disasters, terrorist attacks and pandemics, is a crucial task for generating early signals and informing relevant parties so that they can act promptly to reduce overall damage. Although crises such as natural disasters can be predicted by professional institutions, certain events are first signaled by civilians, such as the recent COVID-19 pandemic. Social media platforms such as Twitter often expose firsthand signals of such crises through high-volume information exchange, with over half a billion tweets posted daily. Prior works proposed various crisis embedding and classification approaches using conventional Machine Learning and Neural Network models. However, none of these works performs crisis embedding and classification using state-of-the-art attention-based deep neural network models, such as Transformers and document-level contextual embeddings. This work proposes CrisisBERT, an end-to-end transformer-based model for two crisis classification tasks, namely crisis detection and crisis recognition, which shows promising results in both accuracy and F1 scores. The proposed model also demonstrates superior robustness over the benchmark, as it shows only a marginal drop in performance when extending from 6 to 36 events with only 51.4% additional data points. We also propose Crisis2Vec, an attention-based, document-level contextual embedding architecture for crisis embedding, which achieves better performance than conventional crisis embedding methods such as Word2Vec and GloVe. To the best of our knowledge, our work is the first in the literature to propose transformer-based crisis classification and document-level contextual crisis embedding.
