Deep Belief Network Based Speaker Information Extraction Method
Abstract In i-vector (total variability factor) based speaker verification systems, speaker-discriminative information must be further extracted from the i-vector representation of each speech segment to improve system performance. Drawing on the idea of the anchor model, this paper proposes a deep belief network based modeling method. The method analyzes and models the complex variabilities contained in i-vectors layer by layer and extracts the speaker-related information through a non-linear transformation. On the telephone-train, telephone-test condition of the NIST SRE 2008 core test, the equal error rates (EER) for male and female trials are 4.96% and 6.18%, respectively. Fusing the proposed method with a linear discriminant analysis based system further reduces the EER to 4.74% and 5.35%.
Source Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》; indexed in EI, CSCD, and the Peking University Core Journals list), 2013, No. 12, pp. 1089-1095 (7 pages)
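Funding Supported by the National Natural Science Foundation of China (No.61273264) and the National 973 Program preliminary research special project (No.2012CB326405)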
Keywords i-Vector; Speaker Verification; Deep Belief Network; Anchor Model
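
The abstract reports results as equal error rates and mentions fusing two systems. As a rough illustration of what those terms mean, the following is a minimal sketch, not the authors' evaluation code. The function names `equal_error_rate` and `fuse_scores`, the toy scores, and the equal-weight linear fusion are all assumptions made for illustration; the abstract does not state how the systems were fused.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER (as a fraction) from verification scores.

    Hypothetical helper: sweeps every observed score as a threshold and
    returns the error rate where false acceptance and false rejection
    are closest to equal.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, nontarget]))
    # False rejection: target trials scoring below the threshold.
    frr = np.array([(target < t).mean() for t in thresholds])
    # False acceptance: non-target trials scoring at or above the threshold.
    far = np.array([(nontarget >= t).mean() for t in thresholds])
    # Pick the threshold where the two error curves are closest.
    i = np.argmin(np.abs(far - frr))
    return (far[i] + frr[i]) / 2.0


def fuse_scores(scores_a, scores_b, weight=0.5):
    """Linear score-level fusion of two systems (assumed scheme)."""
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    return weight * a + (1.0 - weight) * b


# Toy scores, invented purely to exercise the functions.
toy_target = [0.9, 0.8, 0.3, 0.7]
toy_nontarget = [0.1, 0.2, 0.75, 0.4]
print(f"{equal_error_rate(toy_target, toy_nontarget):.2%}")  # prints 25.00%
```

In practice the fusion weight would be tuned on held-out data, and scores from different systems are usually normalized to a comparable range before being combined.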

