LwPosr: Lightweight Efficient Fine-Grained Head Pose Estimation

Paper Title

LwPosr: Lightweight Efficient Fine-Grained Head Pose Estimation

Author

Dhingra, Naina

Abstract

This paper presents a lightweight network for the head pose estimation (HPE) task. While previous approaches rely on convolutional neural networks, the proposed network, LwPosr, uses a mixture of depthwise separable convolutional (DSC) and transformer encoder layers, structured in two streams and three stages, to provide fine-grained regression for predicting head poses. Quantitative and qualitative results are provided to show that the proposed network learns head poses efficiently while using fewer parameters. Extensive ablations are conducted on three open-source datasets: 300W-LP, AFLW2000, and BIWI. To our knowledge, (1) LwPosr is the lightest network proposed for estimating head poses, compared with both keypoint-based and keypoint-free approaches; (2) it sets a benchmark by both outperforming the previous lightweight network on mean absolute error and reducing the number of parameters; (3) it is the first of its kind to use a mixture of DSCs and transformer encoders for HPE. This approach is suitable for mobile devices, which require lightweight networks.
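
To illustrate why depthwise separable convolutions help keep a network small, the following minimal sketch compares the weight count of a standard convolution with that of a depthwise separable one. The channel sizes and kernel size are illustrative assumptions and are not taken from the paper; the function names are hypothetical, and bias terms are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise separable convolution (bias ignored).

    A depthwise k x k filter is applied per input channel, followed by
    a 1 x 1 pointwise convolution that mixes channels.
    """
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution across channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Assumed layer shape, chosen only for illustration.
    c_in, c_out, k = 64, 128, 3
    std = standard_conv_params(c_in, c_out, k)
    dsc = depthwise_separable_params(c_in, c_out, k)
    print(f"standard: {std}, depthwise separable: {dsc}")
```

For this assumed layer shape, the standard convolution needs 73,728 weights and the depthwise separable version 8,768, which is the kind of saving a lightweight design can build on.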
