Paper Title

Predicting Hate Intensity of Twitter Conversation Threads

Authors

Qing Meng, Tharun Suresh, Roy Ka-Wei Lee, Tanmoy Chakraborty

Abstract

Tweets are the most concise form of communication in online social media, and a single tweet has the potential to make or break the discourse of a conversation. Online hate speech is more accessible than ever, and stifling its propagation is of utmost importance for social media companies and for users who want congenial communication. Most research, barring a few recent studies, has focused on classifying an individual tweet regardless of the tweet thread or context leading up to it. One of the classical approaches to curbing hate speech is a reactive strategy applied after the hate speech has been posted. This ex post facto strategy neglects subtle posts that do not show the potential to instigate hate speech on their own but may foreshadow it in the discussion that ensues in their replies. In this paper, we propose DRAGNET++, which aims to predict the intensity of hatred that a tweet can bring in through its reply chain in the future. It uses the semantic and propagation structure of the tweet threads to maximize the contextual information leading up to, and the fall of, hate intensity at each subsequent tweet. We explore three publicly available Twitter datasets: Anti-Racism contains the reply tweets from a collection of social media discourse on racist remarks made against the backdrop of US politics and COVID-19; Anti-Social presents a dataset of 40 million tweets on anti-social behaviours amidst the COVID-19 pandemic; and Anti-Asian presents Twitter datasets collated based on anti-Asian behaviours during the COVID-19 pandemic. All the curated datasets include structural graph information for the tweet threads. We show that DRAGNET++ significantly outperforms all the state-of-the-art baselines. It beats the best baseline by an 11% margin on the Pearson correlation coefficient and reduces RMSE by 25% on the Anti-Racism dataset, with similar performance on the other two datasets.
