Offline Reinforcement Learning for Safer Blood Glucose Control in People with Type 1 Diabetes

Paper title

Offline Reinforcement Learning for Safer Blood Glucose Control in People with Type 1 Diabetes

Authors

Harry Emerson, Matthew Guy, Ryan McConville

Abstract

The widespread adoption of effective hybrid closed loop systems would represent an important milestone of care for people living with type 1 diabetes (T1D). These devices typically utilise simple control algorithms to select the optimal insulin dose for maintaining blood glucose levels within a healthy range. Online reinforcement learning (RL) has been utilised as a method for further enhancing glucose control in these devices. Previous approaches have been shown to reduce patient risk and improve time spent in the target range when compared to classical control algorithms, but they are prone to instability in the learning process, often resulting in the selection of unsafe actions. This work presents an evaluation of offline RL for developing effective dosing policies without the need for potentially dangerous patient interaction during training. This paper examines the utility of BCQ, CQL and TD3-BC in managing the blood glucose of the 30 virtual patients available within the FDA-approved UVA/Padova glucose dynamics simulator. When trained on less than a tenth of the total training samples required by online RL to achieve stable performance, offline RL is shown to significantly increase time in the healthy blood glucose range from 61.6 ± 0.3% to 65.3 ± 0.5% when compared to the strongest state-of-the-art baseline (p < 0.001). This is achieved without any associated increase in low blood glucose events. Offline RL is also shown to be able to correct for common and challenging control scenarios such as incorrect bolus dosing, irregular meal timings and compression errors.
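The headline result is stated as time in the healthy blood glucose range, reported alongside low blood glucose events. As a rough illustration of how such metrics could be computed from a glucose trace, here is a minimal sketch. The function names and the 70 to 180 mg/dL bounds are assumptions made for this example (a commonly used target range); the abstract does not state the thresholds or the evaluation code the authors used.

```python
from typing import Sequence


def time_in_range(glucose_mg_dl: Sequence[float],
                  low: float = 70.0,
                  high: float = 180.0) -> float:
    """Percentage of readings with low <= value <= high.

    The default bounds are an assumption for illustration,
    not values taken from the abstract.
    """
    if not glucose_mg_dl:
        raise ValueError("need at least one glucose reading")
    in_range = sum(1 for g in glucose_mg_dl if low <= g <= high)
    return 100.0 * in_range / len(glucose_mg_dl)


def time_below_range(glucose_mg_dl: Sequence[float],
                     low: float = 70.0) -> float:
    """Percentage of readings strictly below the low threshold."""
    if not glucose_mg_dl:
        raise ValueError("need at least one glucose reading")
    below = sum(1 for g in glucose_mg_dl if g < low)
    return 100.0 * below / len(glucose_mg_dl)


# Hypothetical readings in mg/dL, evenly spaced in time.
readings = [60.0, 100.0, 150.0, 200.0]
tir = time_in_range(readings)
tbr = time_below_range(readings)
```

With evenly spaced readings, the fraction of samples in range equals the fraction of time in range, so a policy comparison like the one in the abstract would look at whether time in range rises while time below range does not.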
