Paper Title
Unsupervised Facial Action Unit Intensity Estimation via Differentiable Optimization
Paper Authors
Paper Abstract
The automatic intensity estimation of facial action units (AUs) from a single image plays a vital role in facial analysis systems. A major challenge for data-driven AU intensity estimation is the lack of sufficient labeled AU data. Because AU annotation requires strong domain expertise, constructing an extensive database for learning deep models is expensive. The limited number of labeled AUs, as well as identity differences and pose variations, further increases the estimation difficulty. Considering all these difficulties, we propose GE-Net, an unsupervised framework for facial AU intensity estimation from a single image that requires no annotated AU data. Our framework performs differentiable optimization, iteratively updating the facial parameters (i.e., head pose, AU parameters, and identity parameters) to match the input image. GE-Net consists of two modules: a generator and a feature extractor. The generator learns to "render" a face image from a set of facial parameters in a differentiable way, and the feature extractor extracts deep features for measuring the similarity between the rendered image and the real input image. After the two modules are trained and fixed, the framework searches for the optimal facial parameters by minimizing the difference between the features extracted from the rendered image and those extracted from the input image. Experimental results demonstrate that our method achieves state-of-the-art results compared with existing methods.
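The fitting stage described in the abstract — freezing the generator and feature extractor, then iteratively updating the facial parameters to minimize a feature-space difference — can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: the "generator" here is a hypothetical linear renderer and the "feature extractor" is the identity map, standing in for the trained deep networks, and all names (`render`, `features`, `fit_parameters`) are invented for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 4))  # frozen toy "generator": parameters -> image pixels

def render(params):
    # Stand-in for the trained, differentiable neural renderer.
    return W @ params

def features(image):
    # Stand-in for the trained deep feature extractor (identity map here).
    return image

def fit_parameters(target_image, steps=500, lr=0.01):
    # Differentiable optimization: both modules stay fixed; only the
    # facial parameters (head pose, AU, identity -- flattened to one
    # vector here) are updated by gradient descent.
    params = np.zeros(4)
    for _ in range(steps):
        residual = features(render(params)) - features(target_image)
        grad = W.T @ residual      # analytic gradient of 0.5 * ||residual||^2
        params -= lr * grad        # iterative parameter update
    return params

# Recover known parameters from their rendered image.
true_params = np.array([0.5, -1.0, 0.3, 2.0])
target = render(true_params)
est = fit_parameters(target)
```

In the actual method the gradient flows through two neural networks via backpropagation rather than this closed-form expression, but the structure of the loop — render, compare features, update parameters — is the same.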