
Application of Speech Conversion Technology to Telephone Speech Recognition (in English)

Speech Conversion for Telephone Speech Recognition
Abstract: This paper proposes a speech conversion method for improving telephone speech recognition. The method converts clean speech into telephone speech by simulating the factors that affect speech quality over real telephone channels. Recognition experiments on the generated telephone speech evaluate HMM recognizers, trained on clean speech and on generated data respectively, before and after MLLR adaptation. Without adaptation, the models trained on generated data achieve a recognition rate 18.9% higher than the models trained on clean speech. With MLLR adaptation, the models trained on generated data gain a further 5.8% in recognition rate. The experiments indicate that the conversion method reduces the acoustic mismatch between training and test speech caused by the shortage of real telephone speech data, and thereby effectively improves the performance of telephone speech recognition systems.
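The abstract does not list which channel factors the authors simulate, so the following is only a minimal sketch of the general idea of turning clean speech into telephone-like speech. It assumes three commonly cited telephone effects: a reduced sampling rate (16 kHz to 8 kHz), additive noise at a chosen signal-to-noise ratio, and 8-bit mu-law companding. All function names and parameter values here are hypothetical and are not taken from the paper.

```python
import math
import random

MU = 255.0  # mu-law constant; an assumption, the paper does not name a codec


def mu_law_roundtrip(x, bits=8):
    """Compress one sample in [-1, 1] with mu-law, quantize it, and expand it back."""
    x = max(-1.0, min(1.0, x))  # clip to the valid input range
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    levels = 2 ** (bits - 1) - 1  # 127 quantization steps per polarity for 8 bits
    yq = round(y * levels) / levels
    return math.copysign(math.expm1(abs(yq) * math.log1p(MU)) / MU, yq)


def downsample_by_two(samples):
    """Halve the sample rate; averaging each pair acts as a crude anti-alias filter."""
    return [(samples[i] + samples[i + 1]) / 2.0
            for i in range(0, len(samples) - 1, 2)]


def add_noise(samples, snr_db, rng):
    """Add white Gaussian noise scaled to the requested signal-to-noise ratio."""
    if not samples:
        return []
    power = sum(s * s for s in samples) / len(samples)
    noise_std = math.sqrt(power / (10 ** (snr_db / 10.0)))
    return [s + rng.gauss(0.0, noise_std) for s in samples]


def simulate_telephone(clean_16k, snr_db=20.0, seed=0):
    """Hypothetical clean-to-telephone conversion: band-limit, add noise, compand."""
    rng = random.Random(seed)
    narrow = downsample_by_two(clean_16k)
    noisy = add_noise(narrow, snr_db, rng)
    return [mu_law_roundtrip(s) for s in noisy]


if __name__ == "__main__":
    # A 0.1 s, 440 Hz test tone sampled at 16 kHz stands in for clean speech.
    tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(1600)]
    converted = simulate_telephone(tone)
    print(len(tone), len(converted))
```

In a setup like the one the abstract describes, data converted in this way would be used to train the acoustic models in place of the clean recordings, so that training conditions move closer to the telephone test conditions.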
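Source: Journal of System Simulation (《系统仿真学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2005, No. 2, pp. 448-452, 456 (6 pages).
Funding: National Natural Science Foundation of China (60172055, 60121302); Beijing Natural Science Foundation (4042025).
Keywords: speech conversion; generated telephone speech; speech recognition; MLLR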
