
AN IMPROVED ALGORITHM OF GMM VOICE CONVERSION SYSTEM BASED ON CHANGING THE TIME-SCALE

Abstract: This paper presents an improved voice conversion method based on Gaussian Mixture Models (GMM) that changes the time-scale of speech. The Speech Transformation and Representation using Adaptive Interpolation of weiGHTed spectrum (STRAIGHT) model is adopted to extract the spectrum features, and the GMMs are trained to generate the conversion function. The spectrum features of the source speech are converted by the conversion function. The time-scale of speech is changed by extracting the converted features and adding them to the spectrum. The converted voice was evaluated by subjective and objective measurements. The results confirm that the transformed speech not only approximates the characteristics of the target speaker, but is also more natural and more intelligible.
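The abstract does not give the form of the conversion function. As a rough illustration only, the sketch below assumes the commonly used joint-density GMM mapping, in which each converted feature is a posterior-weighted sum of per-component linear regressions, and it is restricted to scalar features for brevity. The function names, dictionary keys, and parameter values are hypothetical and are not taken from the paper.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def convert_feature(x, components):
    """Map one scalar source feature to the target space.

    Each component is a dict with hypothetical keys:
    weight, mu_x, mu_y, var_x (source variance), cov_yx (cross-covariance).
    """
    # Posterior probability of each mixture component given x.
    likelihoods = [c["weight"] * gaussian_pdf(x, c["mu_x"], c["var_x"])
                   for c in components]
    total = sum(likelihoods)
    posteriors = [l / total for l in likelihoods]
    # Posterior-weighted sum of per-component linear regressions.
    return sum(p * (c["mu_y"] + c["cov_yx"] / c["var_x"] * (x - c["mu_x"]))
               for p, c in zip(posteriors, components))

# Illustrative parameters (assumed, not from the paper).
components = [
    {"weight": 0.5, "mu_x": -1.0, "mu_y": -2.0, "var_x": 1.0, "cov_yx": 0.5},
    {"weight": 0.5, "mu_x": 1.0, "mu_y": 2.0, "var_x": 1.0, "cov_yx": 0.5},
]
converted = [convert_feature(x, components) for x in (-1.0, 0.0, 1.0)]
```

In a real system the features would be vectors (for example, spectral parameters derived from STRAIGHT analysis), the variances would become covariance matrices, and the component parameters would be estimated from time-aligned source and target training frames.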
Source: Journal of Electronics (China) (电子科学学刊(英文版)), 2011, No. 4, pp. 518-523 (6 pages)
Funding: Supported by the National Natural Science Foundation of China (No. 60872105), the Program for Science & Technology Innovative Research Team of Qing Lan Project in Higher Educational Institutions of Jiangsu, and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD).
Keywords: Gaussian Mixture Models (GMM); Speech Transformation and Representation using Adaptive Interpolation of weiGHTed spectrum (STRAIGHT); Time-scale; Voice conversion