Scalable Online Disease Diagnosis via Multi-Model-Fused Actor-Critic Reinforcement Learning

Paper title

Scalable Online Disease Diagnosis via Multi-Model-Fused Actor-Critic Reinforcement Learning

Authors

He, Weijie; Chen, Ting

Abstract

For those seeking healthcare advice online, AI-based dialogue agents that interact with patients to perform automatic disease diagnosis are a viable option. This application requires efficient inquiry about relevant disease symptoms in order to make accurate diagnosis recommendations. It can be formulated as a problem of sequential feature (symptom) selection and classification, for which reinforcement learning (RL) approaches have been proposed as a natural solution. These approaches perform well when the feature space is small, that is, when the number of symptoms and diagnosable disease categories is limited, but they frequently fail on tasks with a large number of features. To address this challenge, we propose a Multi-Model-Fused Actor-Critic (MMF-AC) RL framework that consists of a generative actor network and a diagnostic critic network. The actor incorporates a Variational AutoEncoder (VAE) to model the uncertainty induced by partial observations of features, which helps it make appropriate inquiries. The critic network incorporates a supervised diagnosis model for disease prediction in order to estimate the state-value function precisely. Furthermore, inspired by the medical concept of differential diagnosis, we combine the generative and diagnosis models to create a novel reward shaping mechanism that addresses the sparse reward problem in large search spaces. We conduct extensive experiments on both synthetic and real-world datasets for empirical evaluation. The results demonstrate that our approach outperforms state-of-the-art methods in diagnostic accuracy and interaction efficiency while also scaling more effectively to large search spaces. In addition, our method is adaptable to both categorical and continuous features, making it well suited to online applications.
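
The formulation of diagnosis as sequential symptom inquiry followed by classification can be illustrated with a small sketch. This is not the MMF-AC method: the knowledge base `DISEASES`, the helper names, and the greedy inquiry rule are all hypothetical, and the rule is only a hand-written stand-in for the learned actor, critic, and VAE described in the abstract. It shows just the interaction loop: ask about one symptom at a time, narrow the candidate diseases, and stop once a diagnosis is possible or the turn budget runs out.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy knowledge base (an assumption for illustration only):
# each disease maps to the set of symptoms it presents with.
DISEASES: Dict[str, Set[str]] = {
    "flu": {"fever", "cough", "fatigue"},
    "cold": {"cough", "sneezing"},
    "allergy": {"sneezing", "itchy_eyes"},
    "migraine": {"headache", "nausea"},
}


def candidates(observed: Dict[str, bool]) -> List[str]:
    """Return the diseases consistent with every answer observed so far."""
    return [
        disease
        for disease, symptoms in DISEASES.items()
        if all((sym in symptoms) == present for sym, present in observed.items())
    ]


def choose_inquiry(observed: Dict[str, bool]) -> Optional[str]:
    """Heuristic stand-in for a learned actor.

    Picks the unasked symptom that splits the remaining candidates most
    evenly, or returns None if no symptom separates them.
    """
    cands = candidates(observed)
    best, best_score = None, 0
    for sym in sorted(set().union(*DISEASES.values())):
        if sym in observed:
            continue
        positives = sum(sym in DISEASES[d] for d in cands)
        score = min(positives, len(cands) - positives)
        if score > best_score:
            best, best_score = sym, score
    return best


def run_episode(true_disease: str, max_turns: int = 5) -> Tuple[Optional[str], int]:
    """Simulate one dialogue; return (diagnosis or None, number of inquiries)."""
    observed: Dict[str, bool] = {}
    turns = 0
    while turns < max_turns and len(candidates(observed)) > 1:
        sym = choose_inquiry(observed)
        if sym is None:
            break
        # The simulated patient answers truthfully from the knowledge base.
        observed[sym] = sym in DISEASES[true_disease]
        turns += 1
    remaining = candidates(observed)
    diagnosis = remaining[0] if len(remaining) == 1 else None
    return diagnosis, turns


if __name__ == "__main__":
    for disease in DISEASES:
        print(disease, run_episode(disease))
```

With only four diseases and seven symptoms, a fixed rule like this is enough. The abstract's point is that such small feature spaces are exactly where existing RL approaches already work, and that the difficulty appears when the number of symptoms and diseases grows.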
