Abstract
This paper proposes a fully automatic, speech-driven virtual image synthesis scheme: given only speech as input, the system automatically generates matching mouth shapes and facial expressions. Subjective scoring and objective evaluation metrics show that the synthesized results reach a level of realism that is indistinguishable to the human eye.
Author
贺晓光
He Xiaoguang (iFLYTEK, Hefei 230000, China)
Source
《安徽电子信息职业技术学院学报》
2021, Issue 1, pp. 25-28 (4 pages)
Journal of Anhui Vocational College of Electronics & Information Technology
Keywords
Virtual image
Multimodality
Video synthesis
Speech synthesis