Paper Title
Performance of a deep learning system for detection of referable diabetic retinopathy in real clinical settings
Paper Authors
Abstract
Background: To determine the ability of a commercially available deep learning system, RetCAD v.1.3.1 (Thirona, Nijmegen, The Netherlands), to automatically detect referable diabetic retinopathy (DR) on a dataset of colour fundus images acquired during routine clinical practice in a tertiary hospital screening program, and to analyze the workload reduction that could be achieved by incorporating this artificial intelligence-based technology. Methods: The software was evaluated on a dataset of 7195 nonmydriatic fundus images from 6325 eyes of 3189 diabetic patients who attended our screening program between February and December 2019. The software generated a DR severity score for each colour fundus image, and these image scores were combined into an eye-level score. This score was then compared, using receiver operating characteristic (ROC) curve analysis, with a reference standard set by a human expert. Results: The artificial intelligence (AI) software achieved an area under the ROC curve (AUC) of 0.988 [0.981:0.993] for the detection of referable DR. At the proposed operating point, the sensitivity of the RetCAD software for DR was 90.53% and the specificity was 97.13%. A workload reduction of 96% could be achieved at the cost of only 6 false negatives. Conclusions: The AI software correctly identified the vast majority of referable DR cases while reducing by 96% the number of cases that would need to be checked and missing almost no true cases, so it may be used as an instrument for triage.
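The evaluation described in the abstract (image scores combined per eye, then compared with a reference standard at an operating point) can be illustrated with a minimal Python sketch. Several details are assumptions, not taken from the paper: the abstract does not say how image scores are combined, so the maximum score per eye is used here; the threshold of 0.5, the eye identifiers, and all scores are made-up toy values; and workload reduction is taken to mean the fraction of eyes the software does not flag for human review. The function names `eye_level_scores` and `triage_metrics` are hypothetical and are not part of RetCAD.

```python
def eye_level_scores(image_scores):
    """Combine per-image scores into one score per eye.

    Assumption: the eye-level score is the maximum image score for that eye.
    The abstract does not specify the actual combination rule.
    """
    eyes = {}
    for eye_id, score in image_scores:
        eyes[eye_id] = max(score, eyes.get(eye_id, score))
    return eyes


def triage_metrics(eye_scores, reference, threshold):
    """Compute sensitivity, specificity and workload reduction at a threshold.

    reference maps eye_id -> True if the human expert graded it referable DR.
    An eye is flagged for review when its score is >= threshold.
    """
    tp = fp = tn = fn = 0
    for eye_id, score in eye_scores.items():
        flagged = score >= threshold
        referable = reference[eye_id]
        if flagged and referable:
            tp += 1
        elif flagged and not referable:
            fp += 1
        elif not flagged and referable:
            fn += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        # Assumed definition: share of eyes that need no human check.
        "workload_reduction": (tn + fn) / total,
        "false_negatives": fn,
    }


# Toy data only; these are not values from the study.
images = [("e1", 0.1), ("e1", 0.8), ("e2", 0.2),
          ("e3", 0.05), ("e4", 0.9), ("e5", 0.3)]
reference = {"e1": True, "e2": False, "e3": False, "e4": True, "e5": True}

scores = eye_level_scores(images)
metrics = triage_metrics(scores, reference, threshold=0.5)
print(metrics)
```

Raising the threshold in such a scheme leaves more eyes unflagged, which increases the workload reduction but tends to increase false negatives; this is the trade-off that the choice of operating point balances.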