Paper Title

Learning Delicate Local Representations for Multi-Person Pose Estimation

Authors

Yuanhao Cai, Zhicheng Wang, Zhengxiong Luo, Binyi Yin, Angang Du, Haoqian Wang, Xiangyu Zhang, Xinyu Zhou, Erjin Zhou, Jian Sun

Abstract

In this paper, we propose a novel method called Residual Steps Network (RSN). RSN aggregates features with the same spatial size (intra-level features) efficiently to obtain delicate local representations, which retain rich low-level spatial information and result in precise keypoint localization. Additionally, we observe that the output features contribute differently to the final performance. To address this, we propose an efficient attention mechanism, the Pose Refine Machine (PRM), to make a trade-off between local and global representations in the output features and further refine the keypoint locations. Our approach won 1st place in the COCO Keypoint Challenge 2019 and achieves state-of-the-art results on both the COCO and MPII benchmarks, without using extra training data or pretrained models. Our single model achieves 78.6 AP on COCO test-dev and 93.0 on the MPII test set. Ensembled models achieve 79.2 on COCO test-dev and 77.1 on the COCO test-challenge set. The source code is publicly available for further research at https://github.com/caiyuanhao1998/RSN/.
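
To make the attention idea in the abstract concrete, below is a minimal PyTorch-style sketch of a PRM-like refine module: a channel-attention branch captures a global (per-channel) signal, a spatial-attention branch captures a local (per-pixel) signal, and their product re-weights the output feature map in a residual fashion. The class name `PoseRefineSketch` and the specific branch design (global average pooling for the channel branch, a 9x9 depthwise convolution for the spatial branch) are assumptions made for illustration; consult the official repository linked above for the exact architecture.

```python
# Hypothetical sketch of a PRM-style attention module (not the official implementation).
import torch
import torch.nn as nn


class PoseRefineSketch(nn.Module):
    """Re-weights an output feature map with a global (channel) and a local (spatial) attention branch."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        # Global branch: squeeze spatial dims, produce per-channel weights in [0, 1].
        self.channel_att = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Local branch: per-pixel weights from a cheap depthwise convolution.
        self.spatial_att = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Conv2d(channels, channels, kernel_size=9, padding=4, groups=channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.conv(x)
        # Residual re-weighting: keep the original feature and add an attended copy.
        return x * (1 + self.channel_att(x) * self.spatial_att(x))


if __name__ == "__main__":
    feat = torch.randn(1, 256, 64, 48)  # a heatmap-resolution feature map
    print(PoseRefineSketch(256)(feat).shape)  # torch.Size([1, 256, 64, 48])
```

The residual form `x * (1 + attention)` lets the module fall back to the identity mapping when the attention weights are small, so refinement can only add information on top of the backbone features rather than suppress them outright.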
