Abstract
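Keyword detection is the technique of detecting predefined words in a continuous speech stream, and it is an important application of speech recognition. In current keyword detection research, the mainstream method is a two-stage, recognize-then-detect approach built on a continuous speech recognizer, so the accuracy of the recognizer strongly affects keyword detection. This paper first introduces deep learning techniques in the recognition stage to improve the performance of the keyword detection algorithm. Then, to address the lack of tight coupling between the recognition stage and the detection stage, it studies keyword-focused deep neural network acoustic modeling: a non-uniform minimum classification error criterion is used to adjust the parameters of the deep neural network acoustic model, and the AdaBoost algorithm is used to dynamically adjust the keyword weights in acoustic modeling. The results show that an acoustic model whose deep neural network parameters are optimized with the non-uniform minimum classification error criterion can improve keyword detection performance.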
Spoken term detection (STD) is the task of automatically detecting a set of keywords in continuous speech, and it is an important application of speech recognition. Current work mainly follows a two-stage approach, i.e. recognition followed by detection, so the accuracy of the speech recognizer has a significant impact on keyword detection. First, this paper uses deep learning techniques to improve performance in the recognition stage. Because the two stages are only loosely coupled, the paper then studies the use of a non-uniform minimum classification error (MCE) criterion to adjust the parameters of deep neural network based acoustic models. Further, the paper uses an adaptive boosting (AdaBoost) strategy to adjust the keyword weights dynamically. The results show that non-uniform MCE can improve the performance of STD.
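The abstract does not give the exact formulation, so the following is only a minimal sketch of the general idea behind a non-uniform MCE loss with boosted keyword weights. All names (`misclassification_measure`, `nonuniform_mce`, `boost_keyword_weights`) and the parameters `eta`, `gamma`, and `step` are hypothetical, and the weight update is a simplified AdaBoost-style rule rather than the paper's own. In a real system the loss would be backpropagated through the deep neural network; here only the loss and the weights are computed.

```python
import math

def misclassification_measure(scores, label, eta=1.0):
    """Return d; d > 0 means the correct class loses to its competitors."""
    competitors = [s for k, s in enumerate(scores) if k != label]
    m = max(competitors)
    # Smoothed maximum of the competitor scores (log-mean-exp, stabilised by m).
    mean_exp = sum(math.exp(eta * (s - m)) for s in competitors) / len(competitors)
    anti_score = m + math.log(mean_exp) / eta
    return anti_score - scores[label]

def mce_loss(d, gamma=1.0):
    """Sigmoid smoothing of the 0/1 classification error."""
    return 1.0 / (1.0 + math.exp(-gamma * d))

def nonuniform_mce(frames, weights, gamma=1.0):
    """Weighted average MCE loss.

    frames  : list of (scores, label) pairs, one per frame
    weights : per-frame error costs (assumed larger for keyword frames)
    """
    total = sum(
        w * mce_loss(misclassification_measure(s, y), gamma)
        for (s, y), w in zip(frames, weights)
    )
    return total / sum(weights)

def boost_keyword_weights(frames, weights, is_keyword, step=0.5):
    """AdaBoost-style update: raise the cost of keyword frames still misclassified."""
    updated = []
    for (s, y), w, kw in zip(frames, weights, is_keyword):
        wrong = misclassification_measure(s, y) > 0
        updated.append(w * math.exp(step) if (kw and wrong) else w)
    return updated

# Toy data: frame 0 is a correctly classified non-keyword frame,
# frame 1 is a misclassified keyword frame.
frames = [([2.0, 0.0, 0.0], 0), ([0.0, 1.0, 0.0], 0)]
weights = [1.0, 1.0]
is_keyword = [False, True]

new_weights = boost_keyword_weights(frames, weights, is_keyword)
uniform_loss = nonuniform_mce(frames, weights)
boosted_loss = nonuniform_mce(frames, new_weights)
```

With uniform weights every frame error costs the same; after the update, the misclassified keyword frame contributes more to the loss, which is the sense in which training becomes focused on keywords.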
Source
《智能计算机与应用》
2015, No. 5, pp. 15-17, 21 (4 pages in total)
Intelligent Computer and Applications
Funding
National Natural Science Foundation of China (91120303)