论文标题

以数据为中心的流行预测:调查

Data-Centric Epidemic Forecasting: A Survey

论文作者

Rodríguez, Alexander, Kamarthi, Harshavardhan, Agarwal, Pulak, Ho, Javen, Patel, Mira, Sapre, Suchet, Prakash, B. Aditya

论文摘要

Covid-19的大流行提出了对多个领域的决策者的流行预测的重要性,从公共卫生到整个经济。虽然预测流行进展经常被概念化为类似于天气预报,但是它具有一些关键的差异,并且仍然是一项非平凡的任务。疾病的传播受到跨越人类行为,病原体动态,天气和环境状况的多种混杂因素的影响。由于政府公共卫生和资助机构的倡议,捕获以前无法观察到的方面的丰富数据来源的可用性增加了研究的兴趣。这尤其是在“以数据为中心”解决方案上的一系列工作中导致的,这些解决方案通过利用非传统数据源以及AI和机器学习的最新创新来增强我们的预测能力的潜力。这项调查研究了各种数据驱动的方法论和实践进步,并介绍了一个概念框架以导航它们。首先,我们列举了与流行病预测相关的大量流行病学数据集和新的数据流,捕获了各种因素,例如有症状的在线调查,零售和商业,流动性,基因组学数据等。接下来,我们将讨论针对最新数据驱动的统计和深度学习方法的方法和建模范式,以及将机械模型的领域知识与统计方法的有效性和灵活性相结合的新型混合模型类别。我们还讨论了这些预测系统的实际部署时会出现的经验和挑战,包括预测的决策。最后,我们重点介绍了整个预测管道中发现的一些挑战和开放问题。

The COVID-19 pandemic has brought forth the importance of epidemic forecasting for decision makers in multiple domains, ranging from public health to the economy as a whole. While forecasting epidemic progression is frequently conceptualized as being analogous to weather forecasting, however it has some key differences and remains a non-trivial task. The spread of diseases is subject to multiple confounding factors spanning human behavior, pathogen dynamics, weather and environmental conditions. Research interest has been fueled by the increased availability of rich data sources capturing previously unobservable facets and also due to initiatives from government public health and funding agencies. This has resulted, in particular, in a spate of work on 'data-centered' solutions which have shown potential in enhancing our forecasting capabilities by leveraging non-traditional data sources as well as recent innovations in AI and machine learning. This survey delves into various data-driven methodological and practical advancements and introduces a conceptual framework to navigate through them. First, we enumerate the large number of epidemiological datasets and novel data streams that are relevant to epidemic forecasting, capturing various factors like symptomatic online surveys, retail and commerce, mobility, genomics data and more. Next, we discuss methods and modeling paradigms focusing on the recent data-driven statistical and deep-learning based methods as well as on the novel class of hybrid models that combine domain knowledge of mechanistic models with the effectiveness and flexibility of statistical approaches. We also discuss experiences and challenges that arise in real-world deployment of these forecasting systems including decision-making informed by forecasts. Finally, we highlight some challenges and open problems found across the forecasting pipeline.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源