Abstract
The language model is an important component of a speech recognition system, and the n-gram model is currently the mainstream technique. However, the n-gram model has two notable shortcomings: it poorly captures long-distance dependencies within a sentence, and it suffers from severe data sparsity; both are key factors limiting model performance. To address these defects, researchers have proposed the recurrent neural network (RNN) modeling technique, which has achieved good results in English language modeling. In this work, the RNN method is applied to Chinese language modeling according to the characteristics of the Chinese language, and a model combination method is proposed to exploit the advantages of both models. Experimental results show that, compared with the traditional n-gram language model, the perplexity (PPL) of the RNN-trained Chinese language model decreases, and the recognition error rate on Chinese telephone-channel speech is also reduced; after combining the two models, the recognition error rate drops further.
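The model combination described in the abstract is commonly realized as a linear interpolation of the two models' per-word probabilities, with perplexity as the evaluation metric. A minimal sketch follows; the interpolation weight `lam` and the toy probabilities are illustrative assumptions, not values from the paper:

```python
import math

def interpolate(p_ngram, p_rnn, lam=0.5):
    # Linear interpolation of two language models:
    # P(w|h) = lam * P_rnn(w|h) + (1 - lam) * P_ngram(w|h)
    return lam * p_rnn + (1 - lam) * p_ngram

def perplexity(probs):
    # PPL = exp(-(1/N) * sum_i log P(w_i | h_i))
    n = len(probs)
    return math.exp(-sum(math.log(p) for p in probs) / n)

# Toy per-word probabilities assigned by each model to a 4-word sentence
p_ng  = [0.10, 0.05, 0.20, 0.08]
p_rnn = [0.12, 0.09, 0.15, 0.10]
p_mix = [interpolate(a, b) for a, b in zip(p_ng, p_rnn)]

print(perplexity(p_ng), perplexity(p_rnn), perplexity(p_mix))
```

In practice the weight `lam` is tuned on a held-out set; a lower PPL for the interpolated model indicates the two models contribute complementary information.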
Source
Technical Acoustics (《声学技术》)
CSCD
Peking University Core Journal (北大核心)
2015, No. 5, pp. 431-436 (6 pages)
Funding
National Natural Science Foundation of China (60872113)
Anhui Provincial Natural Science Foundation (1208085MF94, 1308085QF99)
Keywords
speech recognition
recurrent neural network
language model
model combination