Paper Title

Practical Challenges in Differentially-Private Federated Survival Analysis of Medical Data

Authors

Shadi Rahimian, Raouf Kerkouche, Ina Kurth, Mario Fritz

Abstract

Survival analysis, or time-to-event analysis, aims to model and predict the time it takes for an event of interest to occur in a population or an individual. In the medical context, this event might be death, metastasis, or recurrence of cancer. Recently, neural networks designed specifically for survival analysis have become more popular and offer an attractive alternative to more traditional methods. In this paper, we take advantage of the inherent properties of neural networks to federate the training of these models. This is crucial in the medical domain, where data is scarce and collaboration among multiple health centers is essential for reaching conclusive decisions about the properties of a treatment or a disease. To ensure the privacy of the datasets, it is common to apply differential privacy on top of federated learning. Differential privacy works by introducing random noise at different stages of training, which makes it harder for an adversary to extract details about the data. However, in the realistic setting of small medical datasets and only a few data centers, this noise makes it harder for the models to converge. To address this problem, we propose DPFed-post, which adds a post-processing stage to the private federated learning scheme. This extra step helps regulate the magnitude of the noisy average parameter update and eases convergence of the model. For our experiments, we use three real-world datasets in a realistic setting in which each health center has only a few hundred records, and we show that DPFed-post increases the performance of the models by an average of up to $17\%$ compared to the standard differentially private federated learning scheme.
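The mechanism described in the abstract can be illustrated with a minimal sketch of one differentially private federated averaging round followed by a post-processing step. The abstract does not say what the post-processing in DPFed-post actually is, so the choice here (clipping the L2 norm of the noisy average) is an assumption made purely for illustration, as are the function names `clip_by_norm` and `dp_fed_round` and the parameters `clip_norm`, `noise_multiplier`, and `post_norm`. What is not an assumption is that any transformation applied to an already-noised output, without touching the raw data again, does not weaken the differential privacy guarantee.

```python
import numpy as np


def clip_by_norm(v, max_norm):
    """Scale v down so that its L2 norm is at most max_norm."""
    norm = np.linalg.norm(v)
    if norm > max_norm:
        return v * (max_norm / norm)
    return v


def dp_fed_round(client_updates, clip_norm, noise_multiplier, rng, post_norm=None):
    """One illustrative DP federated averaging round (hypothetical helper).

    client_updates: list of 1-D arrays, one parameter update per health center.
    clip_norm: per-client L2 bound, which limits each client's sensitivity.
    noise_multiplier: Gaussian noise std is noise_multiplier * clip_norm.
    post_norm: if given, the noisy average is clipped to this L2 norm.
               ASSUMPTION: stands in for the post-processing stage; the
               abstract does not specify the actual operation.
    """
    # Bound each client's contribution before aggregation.
    clipped = [clip_by_norm(u, clip_norm) for u in client_updates]
    total = np.sum(clipped, axis=0)

    # Add Gaussian noise calibrated to the clipping bound, then average.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    noisy_avg = (total + noise) / len(clipped)

    # Post-processing: regulate the magnitude of the noisy average update.
    if post_norm is not None:
        noisy_avg = clip_by_norm(noisy_avg, post_norm)
    return noisy_avg


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three centers, as an example of the "only a few data centers" setting.
    updates = [rng.normal(size=5) for _ in range(3)]
    plain = dp_fed_round(updates, clip_norm=1.0, noise_multiplier=5.0, rng=rng)
    post = dp_fed_round(updates, clip_norm=1.0, noise_multiplier=5.0, rng=rng,
                        post_norm=1.0)
    print("norm without post-processing:", np.linalg.norm(plain))
    print("norm with post-processing:   ", np.linalg.norm(post))
```

With few clients, the noise is averaged over a small denominator and can dominate the true update, which is why bounding the magnitude of the aggregated step is a plausible way to stabilize training in this setting.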
