Abstract
This paper briefly describes five speech features: Mel-frequency cepstral coefficients (MFCC), linear prediction coefficients (LPC), prosodic features, formant frequencies, and zero crossings with peak amplitudes (ZCPA), and applies them to emotional speech recognition. Based on the recognition results, a correlation analysis along the three dimensions of the PAD emotion model yields a weight coefficient for each feature. The recognition results are then fused and mapped into the three-dimensional PAD emotion space, giving the PAD values of the emotional speech. These PAD values allow emotional speech to be described and analyzed in terms of continuous emotion theory, and the quantitative approach reveals where the emotion categories lie in the emotion space and how they relate to one another.
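The abstract does not give the fusion formula, so the following is only a minimal sketch of one way correlation-derived weights could be used to fuse per-feature results into a single PAD value. Everything in it is an assumption made for illustration: the function names `pearson`, `dimension_weights`, and `fuse`, the choice of Pearson correlation, the clipping of negative correlations to zero, and the normalization of weights so that they sum to 1 within each dimension. It is not the authors' published procedure.

```python
from math import sqrt

DIMS = ("P", "A", "D")  # the three dimensions of the PAD emotion model


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def dimension_weights(predictions, reference):
    """Assumed weighting scheme: per dimension, weight each feature by its
    non-negative correlation with the reference PAD values, normalized to sum to 1.

    predictions: {feature_name: [(p, a, d), ...]}, one tuple per utterance
    reference:   [(p, a, d), ...] reference PAD annotations for the same utterances
    """
    weights = {}
    for d, dim in enumerate(DIMS):
        ref = [r[d] for r in reference]
        corr = {
            name: max(0.0, pearson([u[d] for u in preds], ref))
            for name, preds in predictions.items()
        }
        total = sum(corr.values())
        if total == 0:
            # No feature correlates positively: fall back to uniform weights.
            weights[dim] = {name: 1.0 / len(corr) for name in corr}
        else:
            weights[dim] = {name: c / total for name, c in corr.items()}
    return weights


def fuse(per_feature_pad, weights):
    """Weighted sum of the per-feature PAD estimates for a single utterance."""
    return tuple(
        sum(weights[dim][name] * pad[d] for name, pad in per_feature_pad.items())
        for d, dim in enumerate(DIMS)
    )
```

Under these assumptions, `dimension_weights` would be computed once on annotated utterances, and `fuse` would then combine the PAD estimates produced by the individual feature-based recognizers for a new utterance. A feature that tracks the reference well on one dimension but poorly on another receives a different weight on each, which is the reason for keeping the weights per dimension.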
Source
《微电子学与计算机》
CSCD
Peking University Core Journal
2016, No. 9, pp. 128-131, 136 (5 pages)
Microelectronics & Computer
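Funding
National Natural Science Foundation of China (61371193)
Shanxi Province Youth Science and Technology Research Fund (2013021016-2)
Shanxi Province Research Project for Returned Overseas Scholars (2013-034)
Shanxi Province Graduate Innovation Project (2015BY24)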
Keywords
speech feature
emotional speech recognition
PAD emotion model
correlation analysis