Federated Semi-Supervised Learning for COVID Region Segmentation in Chest CT using Multi-National Data from China, Italy, Japan

Paper title

Federated Semi-Supervised Learning for COVID Region Segmentation in Chest CT using Multi-National Data from China, Italy, Japan

Authors

Dong Yang, Ziyue Xu, Wenqi Li, Andriy Myronenko, Holger R. Roth, Stephanie Harmon, Sheng Xu, Baris Turkbey, Evrim Turkbey, Xiaosong Wang, Wentao Zhu, Gianpaolo Carrafiello, Francesca Patella, Maurizio Cariati, Hirofumi Obinata, Hitoshi Mori, Kaku Tamura, Peng An, Bradford J. Wood, Daguang Xu

Abstract

The recent outbreak of COVID-19 has created an urgent need for reliable diagnosis and management of SARS-CoV-2 infection. As a complementary tool, chest CT has been shown to reveal visual patterns characteristic of COVID-19, which has definite value at several stages of the disease course. To facilitate CT analysis, recent efforts have focused on computer-aided characterization and diagnosis, with promising results. However, domain shift across clinical data centers poses a serious challenge when deploying learning-based models. In this work, we attempt to address this challenge via federated and semi-supervised learning. A multi-national database of 1704 scans from three countries is used to study the performance gap that arises when a model is trained on one dataset and applied to another. Expert radiologists manually delineated 945 scans for COVID-19 findings. To handle the variability in both the data and the annotations, a novel federated semi-supervised learning technique is proposed to fully utilize all available data (with or without annotations). Federated learning avoids the need to share sensitive data, which makes it favorable for institutions and nations with strict regulatory policies on data privacy. Moreover, semi-supervision potentially reduces the annotation burden in a distributed setting. The proposed framework is shown to be effective compared to fully supervised scenarios with conventional data sharing instead of model weight sharing.
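
The abstract contrasts sharing model weights with sharing data. As a rough illustration of that idea only, the sketch below averages per-site model parameters in the style of federated averaging (FedAvg). It is not the paper's implementation: the function name `federated_average`, the parameter names, the toy weight values, and the per-site scan counts are all hypothetical, and the semi-supervised component is not modeled here.

```python
def federated_average(client_weights, client_sizes):
    """Average per-client parameters, weighted by local dataset size.

    client_weights: list of dicts mapping a parameter name to a list of floats.
    client_sizes:   list of local sample counts, one per client.
    Only parameters travel to the server; the scans never leave their site.
    """
    total = float(sum(client_sizes))
    averaged = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n / total
                for w, n in zip(client_weights, client_sizes))
            for i in range(length)
        ]
    return averaged


# Hypothetical parameters from three sites (values and sizes are made up).
site_weights = [
    {"layer.weight": [1.0, 2.0]},
    {"layer.weight": [2.0, 4.0]},
    {"layer.weight": [4.0, 8.0]},
]
site_sizes = [100, 100, 200]

global_weights = federated_average(site_weights, site_sizes)
print(global_weights)
```

In a full training loop, each site would train locally starting from `global_weights`, send its updated parameters back, and the server would repeat the averaging step each round.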

Paper title