Voice Conversion with the Combination of HMM and GMM (cited by: 5)
Abstract: Motivated by the strong capability of the hidden Markov model (HMM) to represent speech signals and the good conversion performance of the Gaussian mixture model (GMM), an approach that combines HMM and GMM to transform line spectral frequencies (LSF) is presented, together with its theoretical derivation and algorithm flow. Gaussian modeling is then used to convert prosodic features. The algorithm is evaluated in simulation experiments on two recorded speech segments. The converted speech shows good naturalness and intelligibility, and the ABX test indicates that it is perceived as closer to the target speaker's speech 90.2% of the time.
Source: Journal of Data Acquisition and Processing (《数据采集与处理》), indexed in CSCD and the Peking University Core Journals list, 2009, No. 3, pp. 285-289 (5 pages)
Keywords: voice conversion; line spectral frequency (LSF); hidden Markov model (HMM); Gaussian mixture model (GMM); subjective evaluation
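
The abstract states that prosodic features are converted with a Gaussian model but gives no formula. A common choice in voice conversion, assumed here and not confirmed by this record, is to fit a single Gaussian to the log F0 of each speaker and map the source pitch to the target speaker's mean and standard deviation. The function names and the convention that unvoiced frames carry F0 = 0 are illustrative assumptions.

```python
import numpy as np


def fit_log_f0_gaussian(f0):
    """Return (mean, std) of log F0 over voiced frames (F0 > 0)."""
    f0 = np.asarray(f0, dtype=float)
    log_f0 = np.log(f0[f0 > 0])
    return log_f0.mean(), log_f0.std()


def convert_f0(f0_src, src_stats, tgt_stats):
    """Shift and scale source log F0 to match the target Gaussian.

    Assumed mapping (not taken from the paper):
        log f0' = mu_t + (sigma_t / sigma_s) * (log f0 - mu_s)
    Unvoiced frames (F0 == 0) are left at 0.
    """
    mu_s, sigma_s = src_stats
    mu_t, sigma_t = tgt_stats
    f0_src = np.asarray(f0_src, dtype=float)
    converted = np.zeros_like(f0_src)
    voiced = f0_src > 0
    converted[voiced] = np.exp(
        mu_t + (sigma_t / sigma_s) * (np.log(f0_src[voiced]) - mu_s)
    )
    return converted


# Toy pitch contours in Hz; the values are made up for illustration.
source_f0 = [100.0, 0.0, 120.0, 110.0]
target_f0 = [200.0, 220.0, 0.0, 240.0]

src_stats = fit_log_f0_gaussian(source_f0)
tgt_stats = fit_log_f0_gaussian(target_f0)
converted_f0 = convert_f0(source_f0, src_stats, tgt_stats)
```

Because the mapping is linear in the log domain, the converted contour has the target speaker's log-F0 mean and standard deviation by construction, while the shape of the source contour is preserved.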