Using Hand Pose Estimation To Automate Open Surgery Training Feedback

Paper title

Using Hand Pose Estimation To Automate Open Surgery Training Feedback

Authors

Eddie Bkheet, Anne-Lise D'Angelo, Adam Goldbraikh, Shlomi Laufer

Abstract

Purpose: This research aims to facilitate the use of state-of-the-art computer vision algorithms for the automated training of surgeons and the analysis of surgical footage. By estimating 2D hand poses, we model the movement of the practitioner's hands, and their interaction with surgical instruments, to study their potential benefit for surgical training.

Methods: We leverage pre-trained models on a publicly available hands dataset to create our own in-house dataset of 100 open surgery simulation videos with 2D hand poses. We also assess the ability of pose estimations to segment surgical videos into gestures and tool-usage segments and compare them to kinematic sensors and I3D features. Furthermore, we introduce 6 novel surgical dexterity proxies stemming from domain experts' training advice, all of which our framework can automatically detect given raw video footage.

Results: State-of-the-art gesture segmentation accuracy of 88.35% on the Open Surgery Simulation dataset is achieved with the fusion of 2D poses and I3D features from multiple angles. The introduced surgical skill proxies presented significant differences for novices compared to experts and produced actionable feedback for improvement.

Conclusion: This research demonstrates the benefit of pose estimations for open surgery by analyzing their effectiveness in gesture segmentation and skill assessment. Gesture segmentation using pose estimations achieved comparable results to physical sensors while being remote and markerless. Surgical dexterity proxies that rely on pose estimation proved they can be used to work towards automated training feedback. We hope our findings encourage additional collaboration on novel skill proxies to make surgical training more efficient.
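The abstract reports gesture segmentation accuracy but does not say how it is computed. A common choice for this kind of task is frame-wise accuracy: the percentage of video frames whose predicted gesture label matches the annotated one. The sketch below illustrates that metric under this assumption; the function name `framewise_accuracy` and the gesture labels `G0` to `G2` are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def framewise_accuracy(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Return the percentage of frames whose predicted label matches the reference.

    Assumption: one label per video frame, with both sequences aligned
    frame by frame. This is a generic metric, not the paper's own code.
    """
    if len(predicted) != len(reference):
        raise ValueError("label sequences must cover the same frames")
    if not reference:
        raise ValueError("label sequences must not be empty")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return 100.0 * matches / len(reference)


# Hypothetical per-frame labels for an 8-frame clip (made-up gesture names).
reference = ["G0", "G0", "G1", "G1", "G1", "G2", "G2", "G2"]
predicted = ["G0", "G1", "G1", "G1", "G1", "G2", "G2", "G0"]

# Frames 1 and 7 (zero-based) are mislabeled, so 6 of 8 frames match.
print(framewise_accuracy(predicted, reference))
```

Frame-wise accuracy rewards correct labels on long gestures more than on short ones, so segmentation work often reports it alongside segment-level metrics; the abstract itself mentions only accuracy.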
