
Speaker verification method based on deep information divergence maximization Cited by: 2
Abstract To address the problem that the nonlinear relationships between speaker representations (features) cannot be accurately captured in speaker verification, an objective function based on deep information divergence maximization is proposed. The method implicitly represents the nonlinear relationships between representations by computing the similarity between the distributions they are drawn from. Guided by the optimization goal of maximizing this statistical correlation, the deep neural network is trained so that within-class data become more compact and between-class data become more dispersed, which improves the discriminability of the deep speaker representation space. Experimental results show that, compared with other deep learning methods, the proposed method reduces the equal error rate (EER) by up to 15.80% in relative terms, a significant improvement in system performance.
Authors CHEN Chen; RONG Yafeng; JI Chaoqun; CHEN Deyun; HE Yongjun (School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China; Postdoctoral Research Station of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China)
Source Journal on Communications (《通信学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2021, No. 7, pp. 231-237 (7 pages)
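Funding National Natural Science Foundation of China (No.61673142); Natural Science Foundation of Heilongjiang Province (No.JJ2019JQ0013); Heilongjiang Postdoctoral Special Fund (No.LBH-Z20020); Fundamental Research Funds for the Provincial Universities of Heilongjiang Province (No.2020-KYYWF-0341).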
Keywords speaker verification; objective function; deep information divergence; representation learning
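
The abstract reports its result as a relative reduction in equal error rate (EER). The sketch below is only an illustration of how such a figure is typically computed. It is not the authors' evaluation code, the function names `equal_error_rate` and `relative_reduction` are hypothetical, and all scores and EER values in it are made up rather than taken from the paper.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Assumption: a higher score means the system is more confident that
    the two utterances come from the same speaker.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection rate: target trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance rate: non-target trials scored at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


def relative_reduction(baseline_eer, proposed_eer):
    """Relative EER reduction in percent, with respect to the baseline."""
    return (baseline_eer - proposed_eer) / baseline_eer * 100.0


# Toy trial scores (hypothetical, not from the paper).
toy_eer = equal_error_rate([0.9, 0.8, 0.7, 0.3], [0.6, 0.2, 0.1, 0.4])

# Hypothetical baseline and proposed EERs, in percent, chosen only to show
# how a relative figure such as 15.80% could arise.
toy_gain = relative_reduction(5.00, 4.21)
```

Under this definition, a relative reduction of 15.80% means the proposed system's EER is 84.20% of the baseline's, regardless of the absolute EER values involved.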