Abstract
Voice conversion based on the Gaussian mixture model (GMM) suffers from over-smoothing. Because the mean vectors of the GMM parameters represent the spectral envelope shape of the converted features, this paper proposes a voice conversion method based on a hybrid model of a GMM and an artificial neural network (ANN). The method uses the ANN to transform the mean vectors of the GMM parameters, which alleviates the over-smoothing problem. To obtain a continuous converted spectrum, static and dynamic spectral features are combined to approximate the converted spectral sequence. Because pitch is important to voice conversion, the fundamental frequency is also analyzed and converted on top of the spectral conversion. The proposed method is evaluated with subjective and objective tests. The results show that it yields better converted speech than the conventional GMM-based voice conversion method.
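The conventional GMM mapping that the abstract takes as its baseline can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: it uses the standard joint-density conversion function, y = sum_i p(i|x) [mu_y_i + cov_yx_i inv(cov_xx_i) (x - mu_x_i)], and the `map_mean` argument is a hypothetical hook standing in for the ANN that transforms the component means. The abstract does not say how the network is wired in, so feeding it the source mean is an assumption. All function and parameter names are invented for illustration.

```python
import numpy as np

def gmm_convert(x, weights, mu_x, mu_y, cov_xx, cov_yx, map_mean=None):
    """Convert one source frame x with a joint-density GMM mapping.

    map_mean is a hypothetical hook (an assumption, not from the paper):
    a callable that maps a source component mean to a target mean,
    standing in for the ANN of the hybrid model.
    """
    n_comp, dim = mu_x.shape

    # Log of the weighted Gaussian likelihood of x under each component.
    log_p = np.empty(n_comp)
    for i in range(n_comp):
        diff = x - mu_x[i]
        _, logdet = np.linalg.slogdet(cov_xx[i])
        maha = diff @ np.linalg.solve(cov_xx[i], diff)
        log_p[i] = np.log(weights[i]) - 0.5 * (dim * np.log(2 * np.pi) + logdet + maha)

    # Posterior probability p(i | x), normalized in a numerically stable way.
    post = np.exp(log_p - log_p.max())
    post /= post.sum()

    # Posterior-weighted sum of the per-component linear regressions.
    y = np.zeros(mu_y.shape[1])
    for i in range(n_comp):
        target_mean = mu_y[i] if map_mean is None else map_mean(mu_x[i])
        y += post[i] * (target_mean + cov_yx[i] @ np.linalg.solve(cov_xx[i], x - mu_x[i]))
    return y
```

Because the output is a posterior-weighted average over components, it tends to be smoother than natural spectra. This is the over-smoothing effect that the hybrid model targets by changing how the mean term is produced.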
Source
Journal of Data Acquisition and Processing (《数据采集与处理》)
CSCD
PKU Core Journal
2014, No. 2, pp. 227-231 (5 pages)
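Funding
Major Project of the Natural Science Research of Jiangsu Higher Education Institutions (13KJA510003)
Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD)
Jiangsu Graduate Research and Innovation Program for Regular Higher Education Institutions (CXLX12_0478, CXZZ13_0488)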
Keywords
spectral conversion
Gaussian mixture model
radial basis function
neural network