Abstract
This paper proposes a method for analyzing speaker-state factors in whispered speech using global spectral parameters. First, based on the results of whisper listening experiments, Arousal-Valence factors are introduced to measure the speaker's state on three levels. Second, spectral parameters are extracted from the whispered speech under a sinusoidal model and an auditory model and, together with other short-term spectral features, are tracked over time; the global statistics of each parameter are then computed and used as features to classify the whispering speaker's state. Experimental results show that the global spectral parameters from the sinusoidal and auditory models raise the accuracy of the speaker-state factor classification system to 90%. The classification method and the state-factor description scheme offer an effective approach to analyzing speaker state in whispered speech.
Source
《声学学报》
EI
CSCD
PKU Core Journal
2014, No. 2, pp. 281-288 (8 pages)
Acta Acustica
Funding
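Supported by the National Natural Science Foundation of China (61071215, 61271359, 61372146) and the Jiangsu Provincial Graduate Research and Innovation Program for Regular Universities (05KJB510113).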