Paper title
Statistical Design and Analysis for Robust Machine Learning: A Case Study from COVID-19
Authors
Abstract
Since early in the coronavirus disease 2019 (COVID-19) pandemic, there has been interest in using artificial intelligence methods to predict COVID-19 infection status from vocal audio signals, for example cough recordings. However, existing studies have limitations both in how their data were collected and in how the performance of the proposed predictive models was assessed. This paper rigorously assesses state-of-the-art machine learning techniques used to predict COVID-19 infection status from vocal audio signals, using a dataset collected by the UK Health Security Agency. This dataset includes acoustic recordings and extensive study participant metadata. We provide guidelines on testing the performance of methods that classify COVID-19 infection status based on acoustic features, and we discuss how these guidelines can be extended more generally to the development and assessment of predictive methods based on public health datasets.