
Face frontalization for uncontrolled scenes (面向非受控场景的人脸图像正面化重建)
Abstract

Objective: Face frontalization reconstruction is a hot topic in computer vision. Face recognition in uncontrolled scenes is hindered by a series of uncontrollable factors such as viewpoint changes and face pose variation. Frontalized reconstruction bridges uncontrolled capture conditions and mature recognition techniques: it synthesizes a standardized frontal face from an image taken under arbitrary illumination and pose, and the reconstructed image can be fed to commonly used face recognition methods without introducing any additional inference. Beyond serving as a pre-processing step for face-related tasks (e.g., recognition, semantic parsing, and animation generation), it also has potential in virtual and augmented reality applications such as face clipping, decoration, and reconstruction. The task is challenging because the generated views must be consistent with a 3D rotation of the head while preserving identity. Classical approaches fall into model-driven methods, data-driven methods, and combinations of the two, and recent generative adversarial networks (GANs) have shown good multi-view generation results. However, existing methods place high demands on the training data, such as accurate alignment between input and output images and rich facial priors. Such data are expensive to collect, the available datasets are small, and directly applying these methods to real uncontrolled scenes rarely yields satisfactory performance. We therefore propose an arbitrary-view face frontalization method that requires neither image alignment nor facial prior information.

Method: The proposed two-level representation integration inference network consists of a face encoding network with dual input paths and a multi-representation fusion decoding network. The two encoding paths learn the low-level visual representation and the high-level semantic representation of the input face, respectively, and together they build a more complete face representation model. The semantic encoder inherits the convolution weights of a face recognition model pre-trained on a large-scale dataset, whose facial prior knowledge helps the encoder handle complex face variations. The decoder then fuses the two representations, taking the visual representation as the basis and the semantic representation as guidance, and decodes the fused features into the final frontalized face image.

Result: The method is first evaluated against eight state-of-the-art methods on the Multi-PIE (multi-pose, illumination and expression) dataset. Quantitative and qualitative results show that it outperforms the compared methods in both objective metrics and visual quality. Compared with the high-performing flow-based feature warping model (FFWM), it also saves 79% of the parameters and 42% of the computational operations. A further evaluation on the CASIA-WebFace (Institute of Automation, Chinese Academy of Sciences, WebFace) dataset shows that, in real uncontrolled scenes, its recognition accuracy exceeds that of existing methods by more than 10%.

Conclusion: The proposed two-level representation integration inference network mines and combines the low-level visual features and high-level semantic features of the face image, making full use of the information contained in the image itself. It achieves better visual quality and identity recognition accuracy at lower computational complexity and also generalizes well to uncontrolled scenes.
Authors: Xin Jingwei, Wei Zikai, Wang Nannan, Li Jie, Gao Xinbo (School of Telecommunications Engineering, Xidian University, Xi'an 710071, China; School of Electronic Engineering, Xidian University, Xi'an 710071, China; Chongqing Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing 400065, China)
Source: Journal of Image and Graphics (《中国图象图形学报》, CSCD, Peking University Core Journals), 2022, No. 9, pp. 2788-2800 (13 pages)
Funding: National Natural Science Foundation of China (62176195, 62036007, 61922066, 61876142).
Keywords: face frontalization; arbitrary pose; dual encoding path; visual representation; semantic representation; fusion algorithm
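The Method paragraph of the abstract describes a two-level design: a visual encoding path, a semantic encoding path that reuses convolution weights from a pre-trained face recognition model, and a decoder that fuses the two representations with the visual one as the basis and the semantic one as guidance. Below is a minimal PyTorch-style sketch of that encode-fuse-decode structure; all layer sizes, module names, and the stand-in semantic encoder are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of the dual-path encode / fuse / decode idea from the abstract.
# All shapes and names are assumptions for illustration only.
import torch
import torch.nn as nn

class DualPathFrontalizer(nn.Module):
    def __init__(self, feat_dim=256):
        super().__init__()
        # Visual path: learns low-level appearance features of the input face.
        self.visual_encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, feat_dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Semantic path: placeholder for a pre-trained face-recognition backbone
        # that would supply identity/semantic features (frozen weights in the paper).
        self.semantic_encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, feat_dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Fusion: visual features as the base, semantic features as guidance.
        self.fuse = nn.Conv2d(feat_dim * 2, feat_dim, 1)
        # Decoder: upsamples the fused features back to a frontal face image.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(feat_dim, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        v = self.visual_encoder(x)           # low-level visual representation
        s = self.semantic_encoder(x)         # high-level semantic representation
        f = self.fuse(torch.cat([v, s], 1))  # guided fusion of the two paths
        return self.decoder(f)               # frontalized face image

if __name__ == "__main__":
    out = DualPathFrontalizer()(torch.randn(1, 3, 128, 128))
    print(out.shape)  # torch.Size([1, 3, 128, 128])
```

In the paper's setting the semantic path would load convolution weights from a face recognition model trained on a large-scale dataset rather than being trained from scratch, which is how the method injects facial prior knowledge without requiring aligned training pairs.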