
Gaussian PLDA for Speaker Verification and Joint Estimation (cited by: 3)
Abstract In recent years, approaches based on the i-vector (total variability factor) have become the mainstream in speaker recognition. Among them, probabilistic linear discriminant analysis (PLDA) has attracted wide attention for its strong performance. However, when the PLDA model is estimated, the traditional factor analysis method updates only the model space, so the model mean is not well coupled with the updated model space. This paper proposes a joint estimation approach that estimates the model mean and the model space simultaneously, which yields a more rigorous expectation-maximization (EM) update formula. The equal error rate improves to some degree on the NIST SRE 2010 extended test corpus and the NIST SRE 2012 core test corpus.
Source Acta Automatica Sinica (《自动化学报》; indexed in EI, CSCD, and the Peking University Core Journals list), 2014, No. 6, pp. 1068-1074 (7 pages)
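Funding Supported by the National High Technology Research and Development Program of China (863 Program) (2012AA012503); the National Natural Science Foundation of China (10925419, 90920302, 61072124, 11074275, 11161140319, 91120001, 61271426); the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06030100, XDA06030500); and the Key Deployment Project of the Chinese Academy of Sciences (KGZDEW-103-2).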
Keywords factor analysis, i-vector, probabilistic linear discriminant analysis (PLDA), joint estimation, expectation maximization (EM)
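
The joint estimation idea in the abstract can be illustrated with a minimal sketch. It is not the paper's update formula, which this record does not reproduce. It assumes a simplified Gaussian PLDA model x = m + V y + e with y ~ N(0, I) and e ~ N(0, Sigma), and uses a standard factor analysis device: appending a constant 1 to the speaker factor so that the model space V and the model mean m are solved for in a single M-step. The function name `plda_em_joint` and all variable names are hypothetical.

```python
import numpy as np

def plda_em_joint(X, spk, Q, n_iter=20, seed=0):
    """Fit x = m + V y + e by EM, updating V and m jointly (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    speakers, idx = np.unique(spk, return_inverse=True)
    S = len(speakers)
    n = np.bincount(idx, minlength=S).astype(float)  # sessions per speaker
    F = np.zeros((S, D))
    np.add.at(F, idx, X)                             # per-speaker sums of the raw vectors
    Sxx = X.T @ X                                    # second-order statistics
    m = X.mean(axis=0)                               # initial mean: global average
    V = rng.standard_normal((D, Q))                  # random initial model space
    Sigma = np.cov(X, rowvar=False)                  # initial residual covariance
    for _ in range(n_iter):
        # E-step: posterior of each speaker factor y_i given all of its sessions
        P = np.linalg.solve(Sigma, V)                # Sigma^{-1} V
        VtPV = V.T @ P
        C = np.zeros((D, Q + 1))                     # sum_i f_i E[z_i]^T, with z_i = [y_i; 1]
        A = np.zeros((Q + 1, Q + 1))                 # sum_i n_i E[z_i z_i^T]
        for i in range(S):
            cov = np.linalg.inv(np.eye(Q) + n[i] * VtPV)
            Ey = cov @ (P.T @ (F[i] - n[i] * m))
            Ez = np.append(Ey, 1.0)
            Ezz = np.outer(Ez, Ez)
            Ezz[:Q, :Q] += cov
            C += np.outer(F[i], Ez)
            A += n[i] * Ezz
        # M-step: solve for W = [V, m] in one step, then update Sigma
        W = np.linalg.solve(A, C.T).T                # W = C A^{-1} (A is symmetric)
        V, m = W[:, :Q], W[:, Q]
        Sigma = (Sxx - W @ C.T) / N
        Sigma = 0.5 * (Sigma + Sigma.T)              # remove numerical asymmetry
    return m, V, Sigma
```

Here `X` would hold one i-vector per row and `spk` the matching speaker labels. A mean-fixed variant would keep `m` at its initial value and update only `V`, which is the decoupling the abstract describes as a weakness of the traditional method.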