Research on vocal tract spectrum conversion based on GMM model and LPC-MFCC

Cited by: 8

Abstract: Vocal tract spectrum conversion is a key technique in voice conversion. At present, most voice conversion methods handle the vocal tract spectrum by first extracting a single type of vocal tract feature parameter from the speech, then training a conversion model on it, and finally synthesizing the converted speech. Because different vocal tract feature parameters carry different physical and acoustic meanings, these methods usually overlook the complementarity that may exist between them. To address this problem, this paper studies the joint modeling of different vocal tract feature parameters, introduces an LPC-MFCC feature parameter composed of Linear Prediction Coefficients (LPC) and Mel-Frequency Cepstral Coefficients (MFCC), and proposes a voice conversion method based on a Gaussian Mixture Model (GMM) and the joint LPC-MFCC feature parameters. To verify the effectiveness of the proposed method, a voice conversion method based on GMM and LPC is used for comparison in simulation experiments. Subjective and objective tests on multiple sets of experimental data show that the proposed method yields converted speech with higher similarity.
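The abstract does not spell out how the LPC and MFCC parameters are combined, so the following is only a minimal sketch under one plausible assumption: the two vectors are concatenated frame by frame into a single joint feature. The function names (`autocorrelation`, `lpc`, `joint_lpc_mfcc`) are hypothetical, the LPC part uses the standard autocorrelation method with the Levinson-Durbin recursion, and the MFCC vector is taken as an input from some external extractor rather than computed here.

```python
def autocorrelation(frame, max_lag):
    """Unnormalized autocorrelation r[0..max_lag] of one speech frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t + lag] for t in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc(frame, order):
    """Predictor coefficients a[1..order] via Levinson-Durbin,
    so that x[t] is approximated by sum_k a[k] * x[t - k]."""
    r = autocorrelation(frame, order)
    if r[0] == 0.0:
        # Silent frame: no energy to model, return an all-zero predictor.
        return [0.0] * order
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]               # prediction error energy
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err        # reflection coefficient for this order
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= (1.0 - k * k)
        if err <= 0.0:
            # Frame is perfectly predictable at this order; stop early.
            break
    return a[1:]


def joint_lpc_mfcc(frame, mfcc_vector, lpc_order=2):
    """Assumed joint feature: LPC coefficients followed by the MFCC vector."""
    return lpc(frame, lpc_order) + list(mfcc_vector)


# Toy frame: a decaying exponential, i.e. a first-order autoregressive shape.
frame = [0.9 ** t for t in range(200)]
# Placeholder MFCC values; a real system would compute these from the frame.
mfcc_placeholder = [0.0, 0.0, 0.0]
feature = joint_lpc_mfcc(frame, mfcc_placeholder, lpc_order=2)
```

For this toy frame the first LPC coefficient comes out close to 0.9 and the second close to 0, as expected for a first-order decay. In a GMM-based conversion system, joint vectors of this kind from time-aligned source and target frames would then be used to train the mixture model.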

Authors: ZENG Xin (曾歆); ZHANG Xiongwei (张雄伟); SUN Meng (孙蒙); MIAO Xiaokong (苗晓孔); YAO Kun (姚琨) (Army Engineering University, Nanjing 210007, Jiangsu, China)

Affiliation: Army Engineering University

Source: Technical Acoustics (《声学技术》), indexed in CSCD and the Peking University Core Journals list, 2020, No. 4, pp. 451-455 (5 pages)

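Funding: Supported by the National Natural Science Foundation of China (61471394) and the Jiangsu Province Outstanding Youth Foundation (BK20180080).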

Keywords: voice conversion; vocal tract spectrum conversion; Gaussian Mixture Model (GMM); joint modeling; Linear Prediction Coefficient-Mel-Frequency Cepstral Coefficient (LPC-MFCC)

Classification number: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]

