Maximum Likelihood Sub-band Linear Regression for Robust Speech Recognition
Abstract: In real environments, mismatch between training and testing conditions can severely degrade the performance of a speech recognition system. Model adaptation is an effective way to reduce the impact of this mismatch: it uses a small amount of adaptation data to transform the model parameters to the recognition environment. Maximum likelihood linear regression (MLLR) is a widely used transformation-based model adaptation algorithm, but its parameter estimates become inaccurate when little adaptation data is available. To address this shortcoming, this paper proposes a model adaptation algorithm based on maximum likelihood sub-band linear regression (MLSLR). The algorithm divides the channels of the Mel filter bank into several sub-bands and assumes that the model mean components of all channels within a sub-band share one linear environment transformation, which increases the amount of data available for estimating each transformation. Experimental results show that the proposed algorithm copes well with the data sparsity problem and achieves good adaptation performance with very little data, making it particularly suitable for rapid model adaptation when adaptation data is scarce.
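Authors: Lü Yong (吕勇), Wu Zhenyang (吴镇扬)
Source: Journal of Signal Processing (《信号处理》), indexed in CSCD and the Peking University Core Journals list, 2010, No. 1, pp. 74-79 (6 pages)
Funding: National Basic Research Program of China (973 Program), grant 2002CB312102
Keywords: speech recognition; model adaptation; maximum likelihood sub-band linear regression; hidden Markov model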
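
The sub-band sharing idea described in the abstract can be illustrated with a small sketch. This is not the authors' formulation, which the abstract does not give; it is a simplified version under stated assumptions: features are log Mel filter-bank energies, the Gaussians have diagonal covariances, each sub-band uses a scalar scale `a` and offset `b` shared by all of its channels, and the posterior occupation probabilities `gamma` are already available (for example from a forward-backward pass). The function names and the equal-width sub-band split are hypothetical.

```python
import numpy as np


def split_into_subbands(num_channels, num_bands):
    """Partition channel indices 0..num_channels-1 into contiguous sub-bands."""
    return np.array_split(np.arange(num_channels), num_bands)


def estimate_subband_transforms(obs, gamma, means, variances, bands):
    """Estimate one shared (a, b) pair per sub-band by maximum likelihood.

    obs:       (T, D) adaptation observations (assumed log Mel energies)
    gamma:     (T, M) posterior probability of Gaussian m at frame t
    means:     (M, D) Gaussian means from the training environment
    variances: (M, D) diagonal Gaussian variances
    bands:     list of index arrays, one per sub-band

    With diagonal covariances, maximizing the likelihood of the transformed
    means a * mu + b is a weighted least-squares problem, so each sub-band
    reduces to a 2x2 linear system.
    """
    occupancy = gamma.sum(axis=0)      # (M,)   sum_t gamma_m(t)
    weighted_obs = gamma.T @ obs       # (M, D) sum_t gamma_m(t) * o_t
    transforms = []
    for idx in bands:
        mu = means[:, idx]
        inv_var = 1.0 / variances[:, idx]
        w = occupancy[:, None] * inv_var
        go = weighted_obs[:, idx]
        # Statistics pooled over every channel in the sub-band: this pooling
        # is what gives each transform more data than a per-channel estimate.
        lhs = np.array([[np.sum(w * mu * mu), np.sum(w * mu)],
                        [np.sum(w * mu), np.sum(w)]])
        rhs = np.array([np.sum(inv_var * mu * go), np.sum(inv_var * go)])
        a, b = np.linalg.solve(lhs, rhs)
        transforms.append((float(a), float(b)))
    return transforms


def adapt_means(means, transforms, bands):
    """Apply each sub-band's shared linear transform to its mean components."""
    adapted = means.copy()
    for idx, (a, b) in zip(bands, transforms):
        adapted[:, idx] = a * means[:, idx] + b
    return adapted
```

Because every channel in a sub-band contributes to the same two parameters, the estimate stays well conditioned with far fewer frames than a separate transform per channel would need; the trade-off is that the transform can only vary from sub-band to sub-band, not from channel to channel.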