Paper Title
Pose Estimation for Human Wearing Loose-Fitting Clothes: Obtaining Ground Truth Posture Using HFR Camera and Blinking LEDs
Authors
Abstract
Human pose estimation, particularly for athletes, can help improve performance. However, existing methods, including human annotation, struggle when subjects wear loose-fitting clothes such as ski or snowboard wear. This study develops a method for obtaining ground truth data on the two-dimensional (2D) poses of a person wearing loose-fitting clothes. The method uses fast-blinking light-emitting diodes (LEDs). Subjects wear loose-fitting clothes with LEDs placed on the target joints; because the clothes are chosen to be thin and filmy, the camera can observe the LEDs directly. The proposed method captures the scene at 240 fps with a high-frame-rate (HFR) camera and renders two 30 fps image sequences by extracting the LED-on and LED-off frames. Given the speed of human motion, the temporal difference between the two sequences is negligible. The LED-on video is used to manually annotate the joints and thus obtain the ground truth data. The LED-off video, equivalent to a standard 30 fps video, is used to assess the accuracy of existing machine learning-based methods and of manual annotation. Experiments demonstrated that the proposed method can obtain ground truth data for standard RGB videos. They also revealed that neither manual annotation nor a state-of-the-art pose estimator obtains the correct positions of the target joints.
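The frame-splitting step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the blink pattern or how LED-on frames are detected, so the per-frame `led_on` flags, the function names `split_on_off` and `max_gap_ms`, and the rule of pairing each LED-on frame with its nearest LED-off frame are all assumptions made here.

```python
from typing import List, Tuple

CAPTURE_FPS = 240  # capture rate stated in the abstract
OUTPUT_FPS = 30    # frame rate of each rendered sequence
WINDOW = CAPTURE_FPS // OUTPUT_FPS  # 8 captured frames per output frame


def split_on_off(led_on: List[bool]) -> Tuple[List[int], List[int]]:
    """Pick one LED-on and one LED-off frame index per 8-frame window.

    led_on[i] is True when the LEDs are lit in captured frame i. How this
    flag is obtained (e.g. brightness thresholding) is outside this sketch.
    """
    on_idx: List[int] = []
    off_idx: List[int] = []
    for start in range(0, len(led_on) - WINDOW + 1, WINDOW):
        window = range(start, start + WINDOW)
        on = next((i for i in window if led_on[i]), None)
        offs = [i for i in window if not led_on[i]]
        if on is None or not offs:
            continue  # window lacks an on/off pair; skip it
        # Assumed pairing rule: use the LED-off frame closest in time.
        off = min(offs, key=lambda i: abs(i - on))
        on_idx.append(on)
        off_idx.append(off)
    return on_idx, off_idx


def max_gap_ms(on_idx: List[int], off_idx: List[int]) -> float:
    """Largest time offset, in milliseconds, between paired frames."""
    frames = max(abs(a - b) for a, b in zip(on_idx, off_idx))
    return frames * 1000.0 / CAPTURE_FPS


# Hypothetical blink pattern: LEDs toggle on every captured frame.
flags = [i % 2 == 0 for i in range(24)]
on_frames, off_frames = split_on_off(flags)
gap = max_gap_ms(on_frames, off_frames)
```

Under the hypothetical alternating pattern, each LED-on frame is paired with the LED-off frame captured one frame later, so the two 30 fps sequences are offset by 1/240 s (about 4.2 ms). An offset of that size is consistent with the abstract's statement that the temporal difference is negligible relative to human motion.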