Speech Emotion Recognition Algorithm Based on Multi-Task Learning and Recurrent Neural Network (cited by: 18)
Abstract: With the rapid development of machine learning, many researchers use neural networks to tackle problems in speech recognition. However, for reasons such as limited training data, conventional neural network classifiers commonly suffer from problems such as generalization error. To address this, multi-task learning, a form of transfer learning, has been introduced into this line of research. This paper proposes a speech emotion recognition algorithm based on multi-task learning and a recurrent neural network (MTL-RNN). Speaker emotion recognition is the main task, gender recognition and speaker identity recognition are auxiliary tasks, and the three tasks are trained in parallel within one network. The model shares network parameters and learns shared features through shared RNN layers, and learns task-specific features through attribute-dependent layers, with the aim of improving classification performance. Experimental results show that the proposed MTL-RNN algorithm achieves good recognition performance on both Chinese and Arabic speech, and in scenarios with both fewer and more speakers.
Authors: Feng Tianyi (冯天艺); Yang Zhen (杨震). Affiliations: Key Lab of Broadband Wireless Communication and Sensor Network Technology, Ministry of Education, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210003, China; National Local Joint Engineering Research Center for Communications and Network Technology, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210003, China.
Source: Journal of Signal Processing (《信号处理》), indexed in CSCD and the Peking University Core Journals list; 2019, No. 7, pp. 1133-1140 (8 pages).
Funding: National High Technology Research and Development Program of China ("863" Program), grant no. 2006AA010102.
Keywords: speech emotion recognition; multi-task learning; recurrent neural network
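
The shared-layer and task-specific-layer structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no layer sizes, cell type, features, or loss weighting, so everything concrete here is an assumption made for illustration. That includes the class name `MultiTaskRNN`, the plain tanh (Elman) recurrence, the single linear softmax head per task standing in for the attribute-dependent layers, the 13-dimensional input frames, the class counts, and the loss weights. The sketch only runs a forward pass and computes a weighted sum of per-task cross-entropy losses, which is one common way to train a main task together with auxiliary tasks.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product for list-based matrices."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class MultiTaskRNN:
    """Hypothetical model: one shared recurrent layer, one output head per task."""

    def __init__(self, n_features, n_hidden, task_classes, seed=0):
        rng = random.Random(seed)
        # Shared parameters: every task reads the same recurrent encoding.
        self.w_in = make_matrix(n_hidden, n_features, rng)
        self.w_rec = make_matrix(n_hidden, n_hidden, rng)
        # Task-specific parameters (assumed stand-in for attribute-dependent layers).
        self.heads = {task: make_matrix(n, n_hidden, rng)
                      for task, n in task_classes.items()}

    def encode(self, frames):
        """Run the shared tanh recurrence over all frames; return the last state."""
        hidden = [0.0] * len(self.w_rec)
        for frame in frames:
            from_input = matvec(self.w_in, frame)
            from_state = matvec(self.w_rec, hidden)
            hidden = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
        return hidden

    def forward(self, frames):
        """Return a class-probability list for each task."""
        hidden = self.encode(frames)
        return {task: softmax(matvec(w, hidden)) for task, w in self.heads.items()}


def multitask_loss(probs, labels, weights):
    """Weighted sum of per-task cross-entropy losses (weights are assumed)."""
    return sum(weights[t] * -math.log(probs[t][labels[t]]) for t in probs)


# Usage with synthetic data: 20 frames of 13 assumed acoustic features each.
rng = random.Random(1)
frames = [[rng.gauss(0.0, 1.0) for _ in range(13)] for _ in range(20)]
model = MultiTaskRNN(13, 8, {"emotion": 6, "gender": 2, "speaker": 4})
probs = model.forward(frames)
loss = multitask_loss(
    probs,
    labels={"emotion": 2, "gender": 1, "speaker": 0},
    weights={"emotion": 1.0, "gender": 0.3, "speaker": 0.3},  # main task weighted highest
)
print({task: len(p) for task, p in probs.items()}, round(loss, 4))
```

Because all three heads read the same hidden state, the gradients of the gender and speaker losses would also update the shared recurrent weights during training. This is the mechanism by which auxiliary tasks can regularize the main emotion task when training data is limited.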