Abstract
Existing speech emotion features represent emotional information incompletely. To address this, the paper introduces phase space reconstruction theory into feature extraction for emotional speech. By analyzing the geometric properties of the reconstructed phase space under different emotional states, five nonlinear geometric features based on the contour described by the trajectory are extracted as new emotional speech feature parameters, and a feature parameter optimization method based on the mapping between emotions and features is proposed. First, four emotions (happy, sad, neutral, and angry) from the Berlin German emotional speech database are selected as experimental samples. Second, the nonlinear geometric features and the nonlinear attribute features (minimum delay time, correlation dimension, Kolmogorov entropy, largest Lyapunov exponent, and Hurst exponent) are extracted from the emotional speech signals. Finally, a linear support vector machine (SVM) classifies the emotional speech according to the experimental design. The results show that the nonlinear geometric features hold a strong advantage over the nonlinear attribute features in emotional speech recognition, and that, when the two feature types are combined, the feature parameter optimization method yields the optimal nonlinear feature set, which verifies the practicality of the method.
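The abstract does not spell out how the phase space is reconstructed. A common way to do it is time-delay embedding, sketched below under stated assumptions: the function name `delay_embed`, the embedding dimension `dim=3`, the delay `tau=20`, and the sine-wave stand-in for a speech frame are all illustrative choices, not values taken from the paper.

```python
from math import sin, pi


def delay_embed(signal, dim, tau):
    """Build delay vectors (x[i], x[i + tau], ..., x[i + (dim - 1) * tau]).

    Hypothetical helper for illustration; the paper's own reconstruction
    parameters are not given in the abstract.
    """
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    n_vectors = len(signal) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("signal too short for this dim and tau")
    return [
        tuple(signal[i + k * tau] for k in range(dim))
        for i in range(n_vectors)
    ]


# Toy stand-in for one speech frame: a 100 Hz sine sampled at 8 kHz (assumed).
frame = [sin(2 * pi * 100 * n / 8000) for n in range(400)]

# Each tuple is one point of the trajectory in the reconstructed phase space.
trajectory = delay_embed(frame, dim=3, tau=20)
```

Geometric descriptors of the kind the abstract mentions would then be computed from the shape traced by `trajectory`; which five descriptors the authors use is not stated here, so none is implemented.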
Source
《西安电子科技大学学报》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2017, No. 6, pp. 162-168 (7 pages)
Journal of Xidian University
Funding
National Natural Science Foundation of China (Grant No. 61371193)
Keywords
phase space reconstruction
nonlinear geometric features
feature parameter optimization
speech emotion recognition