
Speech endpoint detection based on EMD and cross-entropy (cited by: 3)
Abstract: Speech endpoint detection algorithms based on Empirical Mode Decomposition (EMD) lose accuracy in complex noise environments and cannot adapt to changing conditions. To address this, the paper proposes a new speech endpoint detection algorithm that combines EMD with cross-entropy. The algorithm relies on a property of EMD: the probability distribution of white noise across the Intrinsic Mode Functions (IMFs) is fixed and independent of the noise amplitude. Because this distribution differs from that of a speech signal, cross-entropy is used to measure the difference between speech frames and noise frames. The cross-entropy feature and the EMD energy feature are complementary, so they are combined into a single decision criterion for endpoint detection. A self-updating detection threshold tracks changes in noise energy, which keeps detection accurate as the environment changes. Simulation results indicate that the method is effective and superior at low Signal-to-Noise Ratio (SNR) and under non-stationary noise.
Authors: XUE Juntao (薛俊韬); WENG Yuru (翁玉茹); ZHANG Jun (张军). School of Electrical Engineering & Automation, Tianjin University, Tianjin 300072, China
Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2016, No. 20, pp. 149-153, 166 (6 pages)
Funding: Tianjin Science and Technology Plan Project (No. 13ZXCXGX40400, No. 13ZXCXGX40500); Tianjin Binhai New Area Science and Technology Plan Project (No. 2012-XJR21017)
Keywords: endpoint detection; Empirical Mode Decomposition (EMD); cross-entropy; adaptive threshold; low Signal-to-Noise Ratio (SNR)
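
Illustrative sketch (not taken from the paper): the abstract names three ingredients, a per-frame probability distribution over IMF energies, a cross-entropy measure against the noise distribution, and a self-updating threshold, but gives no formulas. The Python below is one possible reading under stated assumptions. It assumes EMD has already been run and each frame is supplied as a list of per-IMF energies; it uses the relative-entropy (Kullback-Leibler) form as a stand-in for the paper's cross-entropy; and the function names, the OR decision rule, the exponential-smoothing update, and all numeric constants are hypothetical.

```python
import math


def energy_distribution(imf_energies):
    """Normalize one frame's per-IMF energies into a probability distribution."""
    total = sum(imf_energies)
    if total <= 0.0:
        n = len(imf_energies)
        return [1.0 / n] * n  # silent frame: fall back to a uniform distribution
    return [e / total for e in imf_energies]


def relative_entropy(p, q, eps=1e-12):
    """D(p || q) = sum_i p_i * log(p_i / q_i); zero when p equals q.

    Assumption: used here in place of the paper's cross-entropy feature.
    """
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def detect(frames, noise_frames=2, div_thresh=0.1, energy_ratio=3.0, alpha=0.9):
    """Label each frame as speech (True) or noise (False).

    Assumes the first `noise_frames` frames contain only noise.
    All thresholds are hypothetical values chosen for illustration.
    """
    lead = frames[:noise_frames]
    # Reference noise distribution and noise energy from the leading frames.
    noise_dist = energy_distribution([sum(col) for col in zip(*lead)])
    noise_energy = sum(sum(f) for f in lead) / len(lead)

    labels = []
    for f in frames:
        energy = sum(f)
        div = relative_entropy(energy_distribution(f), noise_dist)
        # Combine the distribution feature with the energy feature.
        is_speech = div > div_thresh or energy > energy_ratio * noise_energy
        if not is_speech:
            # Self-update: let the energy threshold follow the noise level.
            noise_energy = alpha * noise_energy + (1.0 - alpha) * energy
        labels.append(is_speech)
    return labels


if __name__ == "__main__":
    noise = [4.0, 2.0, 1.0, 0.5]
    louder_noise = [8.0, 4.0, 2.0, 1.0]   # same shape, twice the energy
    speech_like = [1.0, 2.0, 8.0, 9.0]    # energy concentrated in other IMFs
    print(detect([noise, noise, louder_noise, speech_like, noise]))
```

In this toy input the louder noise frame has the same normalized distribution as the reference, so its divergence is essentially zero. This mirrors the amplitude-independence property the abstract describes, while the speech-like frame is flagged because its energy is spread differently across the IMFs.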

