Maximum Likelihood Sub-band Linear Regression for Robust Speech Recognition (基于最大似然子带线性回归的鲁棒语音识别)

Abstract: In real environments, mismatch between training and testing conditions can severely degrade the performance of a speech recognition system. Model adaptation is an effective way to reduce this mismatch: it transforms the model parameters toward the recognition environment using a small amount of adaptation data. Maximum likelihood linear regression (MLLR) is a widely used transformation-based model adaptation algorithm, but its parameter estimates become inaccurate when little adaptation data is available. To address this, the paper proposes a model adaptation algorithm based on maximum likelihood sub-band linear regression (MLSLR). The algorithm divides the channels of the Mel filter bank into several sub-bands and assumes that the model mean components of the channels within each sub-band share a single linear environment transformation, which increases the data available for estimating each transformation. Experimental results show that the proposed algorithm largely overcomes the data sparsity problem and achieves good adaptation with very little data, making it particularly suitable for rapid model adaptation when adaptation data is scarce.
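The sub-band sharing idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' estimation procedure, since the abstract does not give the update formulas. The sketch assumes per-Gaussian target means that are already aligned to the adaptation data, equal variances (so that maximum likelihood reduces to weighted least squares), contiguous sub-bands of near-equal width, and one scalar slope and bias per sub-band. All function and variable names are hypothetical.

```python
import numpy as np


def split_subbands(num_channels, num_subbands):
    """Partition channel indices into contiguous sub-bands (an assumed layout)."""
    return np.array_split(np.arange(num_channels), num_subbands)


def estimate_subband_transforms(train_means, target_means, weights, subbands):
    """Fit one (slope, bias) pair per sub-band by weighted least squares.

    train_means, target_means: arrays of shape (num_gaussians, num_channels)
        holding log Mel filter bank mean vectors (assumed domain).
    weights: array of shape (num_gaussians,), e.g. occupancy counts.
    Every channel in a sub-band contributes to the same two parameters,
    which is how the sharing increases the data behind each estimate.
    """
    params = []
    for idx in subbands:
        x = train_means[:, idx].ravel()
        y = target_means[:, idx].ravel()
        # Repeat each Gaussian's weight once per channel in this sub-band.
        w = np.repeat(weights, len(idx))
        design = np.stack([x, np.ones_like(x)], axis=1)
        sw = np.sqrt(w)
        coef, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
        params.append((coef[0], coef[1]))
    return params


def adapt_means(train_means, subbands, params):
    """Apply each sub-band's linear transform to the training means."""
    adapted = np.asarray(train_means, dtype=float).copy()
    for idx, (slope, bias) in zip(subbands, params):
        adapted[:, idx] = slope * adapted[:, idx] + bias
    return adapted
```

With, say, 24 channels and 4 sub-bands, each slope and bias is fitted from six channels' worth of mean components instead of one, which is the data-pooling effect the abstract describes.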

Authors: Lü Yong (吕勇), Wu Zhenyang (吴镇扬)

Affiliation: School of Information Science and Engineering, Southeast University

Source: Journal of Signal Processing (《信号处理》), 2010, No. 1, pp. 74-79 (6 pages). Indexed in CSCD and the Peking University Core Journals list.

Funding: National 973 Program of China (2002CB312102)

Keywords: speech recognition; model adaptation; maximum likelihood sub-band linear regression; hidden Markov model

Classification number: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]
