Paper title
Health, Psychosocial, and Social issues emanating from COVID-19 pandemic based on Social Media Comments using Natural Language Processing
Paper authors
Paper abstract
The COVID-19 pandemic has caused a global health crisis that affects many aspects of human lives. In the absence of vaccines and antivirals, several behavioural change and policy initiatives, such as physical distancing, have been implemented to control the spread of the coronavirus. Social media data can reveal public perceptions of how governments and health agencies across the globe are handling the pandemic, the impact of the disease on people regardless of their geographic location, and the factors that hinder or facilitate efforts to control the spread of the pandemic globally. This paper aims to investigate the impact of the COVID-19 pandemic on people globally using social media data. We apply natural language processing (NLP) and thematic analysis to understand public opinions, experiences, and issues with respect to the COVID-19 pandemic. First, we collect over 47 million COVID-19-related comments from Twitter, Facebook, YouTube, and three online discussion forums. Second, we perform data preprocessing, which involves applying NLP techniques to clean and prepare the data for automated theme extraction. Third, we apply a context-aware NLP approach to extract meaningful keyphrases or themes from over 1 million randomly selected comments, compute a sentiment score for each theme, and assign sentiment polarity based on the scores using a lexicon-based technique. Fourth, we categorize related themes into broader themes. A total of 34 negative themes emerged, out of which 15 are health-related issues, psychosocial issues, and social issues related to the COVID-19 pandemic from the public perspective. In addition, 20 positive themes emerged from our results. Finally, we recommend interventions that can help address the negative issues based on the positive themes and other remedial ideas rooted in research.
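To make the preprocessing and lexicon-based polarity steps in the abstract concrete, here is a minimal Python sketch. It is illustrative only: the abstract does not name the lexicon, the cleaning rules, or the score cut-offs, so `TOY_LEXICON`, `clean`, `sentiment_score`, `polarity`, and the 0.05 threshold are all hypothetical stand-ins rather than the authors' method.

```python
import re

# Hypothetical toy lexicon (word -> valence); a real study would use a
# published sentiment lexicon, which this abstract does not name.
TOY_LEXICON = {
    "good": 1.0, "hope": 1.0, "support": 1.0, "safe": 1.0,
    "fear": -1.0, "death": -1.0, "lonely": -1.0, "crisis": -1.0,
}


def clean(comment):
    """Lowercase the comment, drop URLs and @mentions, return alphabetic tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", comment.lower())
    return re.findall(r"[a-z]+", text)


def sentiment_score(tokens, lexicon=TOY_LEXICON):
    """Mean valence of the tokens found in the lexicon; 0.0 if none match."""
    hits = [lexicon[t] for t in tokens if t in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


def polarity(score, threshold=0.05):
    """Map a score to a label; the threshold is an assumed cut-off."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for comment in [
        "Fear of death everywhere http://example.com @someone",
        "Good support from neighbours",
        "Schools reopen on Monday",
    ]:
        score = sentiment_score(clean(comment))
        print(comment, "->", score, polarity(score))
```

Averaging over matched words only keeps long comments from being diluted toward neutral, but it also ignores negation and intensity, which fuller lexicon-based tools typically handle.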