Speech recognition of isolated words based on convolution neural networks (cited by: 20)
Abstract: To reduce the number of trainable parameters and the training time while improving the recognition accuracy for isolated words, a method is proposed that applies convolutional neural networks (CNNs) to speech recognition. The network's local receptive fields, weight sharing, and pooling greatly compress the size of the trained model while preserving recognition performance. The influence of the number and size of the convolution kernels in the convolutional layers, and of the pooling parameters in the pooling layers, on the recognition results is analyzed in depth. A dynamic time warping network normalizes the feature parameters of pronunciation units with different numbers of frames to a common frame count before they are fed to the network for recognition. Experiments on a self-built database show that, compared with a traditional deep neural network, the CNN improves recognition accuracy by 12%, making it an excellent speech recognition model.
Authors: HOU Yi-min (侯一民); LI Yong-ping (李永平) (School of Automation Engineering, Northeast Electric Power University, Jilin 132012, China)
Source: Computer Engineering and Design (《计算机工程与设计》), Peking University Core Journal, 2019, No. 6, pp. 1751-1756 (6 pages)
Funding: Jilin Province Science and Technology Development Program (20150414051GH)
Keywords: convolutional neural networks; speech recognition; local receptive field; weight sharing; pooling
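
The abstract attributes the smaller model to weight sharing and pooling. The sketch below is only a rough illustration of that argument, not the authors' network. It counts the parameters of one fully connected layer and one convolutional layer on the same input, then shows how pooling shrinks the feature map. All sizes (40 frames, 39 features, 512 hidden units, 32 filters of 5x5, 2x2 pooling) are hypothetical and were chosen for illustration; the record does not state the paper's configuration.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


def conv2d_params(in_channels: int, n_filters: int, kernel_h: int, kernel_w: int) -> int:
    """Weights plus biases of a 2-D convolutional layer.

    Each filter is reused at every position (weight sharing), so the count
    does not depend on the height or width of the input.
    """
    return n_filters * in_channels * kernel_h * kernel_w + n_filters


def conv_output_size(size: int, kernel: int) -> int:
    """Output length along one axis for stride 1 and no padding."""
    return size - kernel + 1


def pooled_size(size: int, pool: int) -> int:
    """Output length along one axis for non-overlapping pooling."""
    return size // pool


# Hypothetical input: every utterance normalised to 40 frames of 39 features.
frames, features = 40, 39

dense = dense_params(frames * features, 512)  # hypothetical 512 hidden units
conv = conv2d_params(1, 32, 5, 5)             # hypothetical 32 filters, 5x5

# Feature-map size after the convolution, then after 2x2 pooling.
map_h = conv_output_size(frames, 5)
map_w = conv_output_size(features, 5)
pooled_h = pooled_size(map_h, 2)
pooled_w = pooled_size(map_w, 2)

print("fully connected layer parameters:", dense)
print("convolutional layer parameters:", conv)
print("feature map:", map_h, "x", map_w, "-> pooled:", pooled_h, "x", pooled_w)
```

Because the input must have a fixed shape for either layer, utterances of different lengths first have to be brought to the same number of frames, which is the role the abstract assigns to the dynamic time warping step.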
