
Voiceprint recognition based on knowledge distillation and ResNet (cited by: 2)
Abstract: Voiceprint recognition suffers from channel mismatch and from incomplete extraction of voiceprint features under short-utterance or noisy conditions. To address these problems, a method that combines a traditional approach with deep learning is proposed: an I-Vector model serves as the teacher model, and its knowledge is distilled into a ResNet student model. A ResNet network based on metric learning is constructed, and an attentive statistics pooling layer is introduced to capture and emphasize the important information in the voiceprint features, improving their distinguishability. A joint training loss function is designed that combines the mean square error (MSE) with a metric-learning loss, reducing computational complexity and strengthening the model's learning ability. Finally, the trained model is tested on voiceprint recognition and compared with voiceprint recognition models built on several other deep learning methods. The equal error rate (EER) is reduced by at least 8% and reaches 3.229%, indicating that the model performs voiceprint recognition more effectively.
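The joint training loss described in the abstract can be illustrated with a minimal sketch. The abstract does not specify which metric-learning loss is used or how the two terms are weighted, so the cosine-similarity triplet loss, the `weight` and `margin` parameters, and all function names below are assumptions made for illustration, not the authors' implementation.

```python
import math


def mse(student, teacher):
    """Mean square error between two equal-length vectors."""
    if len(student) != len(teacher):
        raise ValueError("embeddings must have the same dimension")
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(x * y for x, y in zip(a, b)) / norm


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Stand-in metric-learning loss (assumption): the same-speaker pair
    should be more similar than the different-speaker pair by `margin`."""
    return max(0.0, margin - cosine(anchor, positive) + cosine(anchor, negative))


def joint_loss(student_emb, teacher_ivector, positive, negative,
               weight=0.5, margin=0.2):
    """Metric-learning term plus a weighted distillation term.

    The distillation term pulls the student (ResNet) embedding toward the
    teacher (I-Vector) representation; `weight` is a hypothetical
    hyperparameter balancing the two terms.
    """
    distill = mse(student_emb, teacher_ivector)
    metric = triplet_loss(student_emb, positive, negative, margin)
    return metric + weight * distill


if __name__ == "__main__":
    # Toy 2-dimensional embeddings; real speaker embeddings are much larger.
    student = [0.9, 0.1]
    teacher = [1.0, 0.0]
    same_speaker = [1.0, 0.2]
    other_speaker = [0.1, 1.0]
    print(joint_loss(student, teacher, same_speaker, other_speaker))
```

In this sketch the student embedding and the teacher i-vector are assumed to have the same dimension; in practice a projection layer would be needed when they differ.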
Authors: RONG Yujun; FANG Yifan; TIAN Peng; CHENG Jiawei (China Mobile Hangzhou Information Technology Co., Ltd., Hangzhou 310000, P.R. China; College of Automation, Chongqing University of Posts and Telecommunications, Chongqing 400065, P.R. China)
Source: Journal of Chongqing University, 2023, No. 1, pp. 113-124 (12 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
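Funding: Supported by the Ministry of Education-China Mobile Research Fund (MCM20180404) and the National Natural Science Foundation of China (52272388).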
Keywords: deep learning; knowledge distillation; voiceprint recognition; speaker verification