Abstract
This paper presents a method based on auditory perception features for detecting sentence accent in English more effectively. The duration normalization is improved according to phoneme characteristics. Following the properties of auditory perception, semitone and loudness features are introduced, and each syllable's normalized maximum value replaces its mean value; with these changes the system reaches an accuracy of 78.7% with a miss (deletion) rate of 9.37%. Building on this, a prominence model based on the masking effect is proposed, which raises the accuracy to 83.4% and lowers the miss rate to 5.72%. Experiments show that the prominence model agrees better with human auditory perception and that its performance approaches the agreement rate between human annotators (about 86%). The system is also text-independent and speaker-independent.
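As a rough illustration of the semitone feature and the "syllable maximum instead of mean" idea, the sketch below converts F0 values to semitones with the standard relation 12·log2(f/f_ref) and takes the per-syllable maximum. The function names, the reference frequency, the frame and syllable layout, and the normalization (subtracting the utterance mean) are all assumptions made for this example; the abstract does not specify how the paper normalizes the feature.

```python
import math

def hz_to_semitones(f0_hz, ref_hz):
    """Standard Hz-to-semitone conversion relative to a reference frequency."""
    return 12.0 * math.log2(f0_hz / ref_hz)

def syllable_semitone_peaks(f0_frames, syllables, ref_hz=100.0):
    """Return one normalized semitone peak per syllable.

    f0_frames: per-frame F0 in Hz, with 0 marking unvoiced frames (assumed layout).
    syllables: list of (start, end) frame indices, end exclusive (assumed layout).
    Normalization here is an assumption: subtract the utterance-level mean.
    """
    voiced = [hz_to_semitones(f, ref_hz) for f in f0_frames if f > 0]
    if not voiced:
        return [None] * len(syllables)
    utterance_mean = sum(voiced) / len(voiced)

    peaks = []
    for start, end in syllables:
        st = [hz_to_semitones(f, ref_hz) for f in f0_frames[start:end] if f > 0]
        # Use the syllable maximum rather than its mean value.
        peaks.append(max(st) - utterance_mean if st else None)
    return peaks

if __name__ == "__main__":
    frames = [100.0, 200.0, 0.0, 100.0, 100.0]
    print(syllable_semitone_peaks(frames, [(0, 2), (2, 5)]))
```

A syllable whose peak stands well above the utterance mean would be a candidate for accent under this simplified view; the paper's prominence model, which accounts for the masking effect, is not reproduced here.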
Source
《清华大学学报(自然科学版)》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2010, No. 4, pp. 613-617 (5 pages)
Journal of Tsinghua University (Science and Technology)
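Funding
National Natural Science Foundation of China (60776800)
National High Technology Research and Development Program of China ("863" Program) (2006AA010101, 2007AA04Z223, 2008AA02Z414)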
Keywords
speech signal processing
sentence accent
auditory perception
prominence model
duration
loudness
semitone