Paper Title
Turning Silver into Gold: Domain Adaptation with Noisy Labels for Wearable Cardio-Respiratory Fitness Prediction
Paper Authors
Paper Abstract
Deep learning models have shown great promise in various healthcare applications. However, most models are developed and validated on small-scale datasets, because collecting high-quality (gold-standard) labels for health applications is often costly and time-consuming. As a result, these models may overfit and generalize poorly to unseen data. At the same time, large amounts of data with imprecise (silver-standard) labels are becoming widely available, collected from inexpensive wearables such as accelerometers and electrocardiography sensors. These currently underutilized datasets and labels can be leveraged to produce more accurate clinical models. In this work, we propose UDAMA, a novel model with two key components, Unsupervised Domain Adaptation and Multi-discriminator Adversarial training, which leverages noisy data from the source domain (the silver-standard dataset) to improve gold-standard modeling. We validate our framework on the challenging task of predicting lab-measured maximal oxygen consumption (VO$_{2}$max), the benchmark metric of cardio-respiratory fitness, using free-living wearable sensor data from two cohort studies as inputs. Our experiments show that the proposed framework achieves the best performance of corr = 0.665 $\pm$ 0.04, paving the way for accurate fitness estimation at scale.
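To make the reported metric concrete, the following is a minimal sketch of how a correlation score with a $\pm$ spread could be computed. The abstract does not say which correlation coefficient is used or how the $\pm$ value is obtained; this sketch assumes Pearson correlation and a sample standard deviation over repeated evaluation splits. The function names `pearson_corr` and `summarize` are hypothetical, and all numbers are synthetic, not results from the paper.

```python
import numpy as np


def pearson_corr(y_true, y_pred):
    """Pearson correlation between measured and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Center both vectors, then take the normalized dot product.
    yt = y_true - y_true.mean()
    yp = y_pred - y_pred.mean()
    return float((yt @ yp) / np.sqrt((yt @ yt) * (yp @ yp)))


def summarize(per_split_scores):
    """Mean and sample standard deviation of per-split scores (assumed protocol)."""
    scores = np.asarray(per_split_scores, dtype=float)
    return float(scores.mean()), float(scores.std(ddof=1))


if __name__ == "__main__":
    # Synthetic VO2max-like values, for illustration only.
    measured = [30.0, 35.0, 40.0, 45.0, 50.0]
    predicted = [32.0, 34.0, 41.0, 43.0, 52.0]
    print("corr on one split:", pearson_corr(measured, predicted))

    # Hypothetical per-split correlations, summarized as mean and spread.
    mean, spread = summarize([0.62, 0.70, 0.66])
    print(f"corr = {mean:.3f} +/- {spread:.3f}")
```

Under these assumptions, a figure reported as corr = 0.665 $\pm$ 0.04 would be read as the average agreement between predicted and lab-measured VO$_{2}$max across splits, with the spread showing how much that agreement varies from split to split.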