
Training the Acoustic Model of a DNN Keyword Spotting System Based on the Non-uniform MCE Criterion (cited by: 1)

Non-uniform MCE Based Acoustic Model for Keyword Spotting based on Deep Neural Network
Abstract: Keyword spotting, also called spoken term detection (STD), detects a predefined set of words in a continuous speech stream and is an important application of speech recognition. The mainstream approach is a two-stage method built on a continuous speech recognizer: recognize first, then detect. The accuracy of the recognizer therefore has a large effect on keyword detection. This paper first introduces deep learning into the recognition stage to improve keyword spotting performance. Then, to address the loose coupling between the recognition and detection stages, it studies keyword-focused deep neural network (DNN) acoustic modeling: a non-uniform minimum classification error (MCE) criterion is used to adjust the parameters of the DNN acoustic model, and the AdaBoost (adaptive boosting) algorithm dynamically adjusts the keyword weights during acoustic modeling. The results show that an acoustic model whose DNN parameters are optimized with the non-uniform MCE criterion improves keyword spotting performance.
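As a rough illustration of the idea summarized in the abstract, the sketch below shows a smoothed MCE loss for a single frame, a non-uniform version that weights each frame by an error cost, and an AdaBoost-style reweighting of keyword frames. It is a minimal sketch under stated assumptions, not the paper's formulation: the function names, the parameters `eta`, `gamma` and `alpha`, and the exponential update rule are all hypothetical choices made for illustration.

```python
import math

def mce_loss(scores, label, eta=1.0, gamma=1.0):
    """Smoothed MCE loss for one frame (illustrative, hypothetical helper).

    scores: per-class discriminant scores, e.g. DNN log-posteriors.
    label:  index of the correct class.
    """
    correct = scores[label]
    competitors = [s for i, s in enumerate(scores) if i != label]
    # Soft maximum over competing classes, computed stably:
    # (1/eta) * log(mean(exp(eta * s)))
    m = max(competitors)
    mean_exp = sum(math.exp(eta * (s - m)) for s in competitors) / len(competitors)
    anti = m + math.log(mean_exp) / eta
    # Misclassification measure: positive when a competitor beats the label.
    d = anti - correct
    # A sigmoid turns the measure into a differentiable 0..1 error count.
    return 1.0 / (1.0 + math.exp(-gamma * d))

def non_uniform_mce(frames, weights):
    """Weighted sum of per-frame MCE losses.

    frames:  list of (scores, label) pairs.
    weights: per-frame error costs; keyword frames would get larger values.
    """
    return sum(w * mce_loss(s, y) for (s, y), w in zip(frames, weights))

def boost_weights(weights, losses, is_keyword, alpha=1.0):
    """AdaBoost-style update (assumed form): keyword frames that still
    incur a high loss get a larger weight in the next training pass;
    non-keyword frames keep their weight."""
    return [
        w * math.exp(alpha * l) if k else w
        for w, l, k in zip(weights, losses, is_keyword)
    ]
```

With uniform weights this reduces to ordinary MCE; raising the weights only on keyword frames is what shifts the training effort toward the errors that matter for detection.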
Source: 《智能计算机与应用》 (Intelligent Computer and Applications), 2015, No. 5, pp. 15-17, 21 (4 pages)
Funding: National Natural Science Foundation of China (91120303)
Keywords: deep learning; keyword spotting (spoken term detection); AdaBoost; minimum classification error

