

Sparse group LASSO constraint eigenphone speaker adaptation method for speech recognition
Abstract The eigenphone speaker adaptation method performs well when enough adaptation data are available, but it overfits severely when the data are insufficient. To address this, a sparse group LASSO (SGL) constrained eigenphone speaker adaptation method is proposed. First, the principle of eigenphone speaker adaptation is introduced for speech recognition systems based on the hidden Markov model-Gaussian mixture model (HMM-GMM). Then, SGL regularization is applied to the estimation of the eigenphone matrix: the weights of the SGL norm are adjusted to control the complexity of the adaptation model, and an accelerated proximal gradient method is used to solve the resulting optimization problem. Finally, the SGL-constrained method is compared with several current regularization-based adaptation methods. Speaker adaptation experiments on a Mandarin Chinese continuous speech recognition task show that the SGL constraint markedly improves the performance of the eigenphone method over the original, and that it outperforms the l1-norm, l2-norm, and elastic net constraints.
Authors QU Dan (屈丹), ZHANG Wenlin (张文林)
Source Journal on Communications (《通信学报》), 2015, No. 9, pp. 47-54 (8 pages); indexed in EI, CSCD, and the Peking University Core Journals list
Funding National Natural Science Foundation of China (61175017, 61302107, 61403415)
Keywords speaker adaptation; eigenphone; group sparse constraint; sparse group LASSO constraint; proximal gradient method
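
The abstract names a sparse group LASSO penalty solved with a proximal gradient method. As a minimal sketch of the core step such a solver repeats, the block below computes the proximal operator of a generic SGL penalty, lam1 * (sum of absolute values) + lam2 * (sum of per-group l2 norms), by applying elementwise soft-thresholding followed by group-wise shrinkage. The function names, the choice of treating each row as one group, and the parameters `step`, `lam1`, and `lam2` are illustrative assumptions; the record does not state how the paper groups the eigenphone matrix or sets its weights.

```python
import math


def soft_threshold(v, t):
    """Elementwise l1 proximal step: shrink each entry toward zero by t."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]


def group_shrink(v, t):
    """Group (l2) proximal step: scale the whole vector toward zero by t."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm <= t:
        # The entire group is zeroed out, which gives group-level sparsity.
        return [0.0] * len(v)
    scale = 1.0 - t / norm
    return [scale * x for x in v]


def prox_sgl(rows, step, lam1, lam2):
    """Proximal operator of step * (lam1 * l1 + lam2 * sum of row l2 norms).

    Assumption: each inner list in `rows` is one group (hypothetical grouping).
    """
    return [
        group_shrink(soft_threshold(row, step * lam1), step * lam2)
        for row in rows
    ]


if __name__ == "__main__":
    # Toy matrix: one strong row and one weak row.
    matrix = [[3.0, -0.5], [0.2, 0.1]]
    print(prox_sgl(matrix, step=1.0, lam1=0.5, lam2=1.0))
```

In a proximal gradient iteration, this operator would be applied to the result of a gradient step on the smooth data-fit term; the accelerated variant adds a momentum extrapolation between iterations. Setting `lam2 = 0` reduces the operator to plain l1 soft-thresholding, and setting `lam1 = 0` gives the group LASSO case.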









