Research on Speech Enhancement Based on Pitch Tracking (基于基音跟踪的语音增强研究)

Abstract: In mobile communication, speech recognition, voice-based interaction, and related fields, captured speech signals are often mixed with noise that itself has a harmonic structure, so speech enhancement has significant practical value. Most speech energy is concentrated in voiced segments, and voiced speech has a harmonic structure. Exploiting the approximate sparsity of real sound mixtures in the time-frequency domain, this paper proposes a speech enhancement algorithm based on pitch tracking. The algorithm uses pitch features to restore the harmonic structure of the speech as far as possible while suppressing noise energy, thereby improving the speech signal-to-noise ratio (SNR). First, the mixed sound stream is segmented and its voiced segments are extracted. Next, multi-pitch extraction is performed on each voiced segment, and Viterbi decoding is used to find the dominant pitch. A back-propagation (BP) neural network then decides whether the dominant pitch belongs to a human voice. Finally, a comb filter reconstructs the voiced speech or suppresses the interfering sound. Simulation experiments show that the algorithm can extract speech from audio mixed with music and background noise, with an average SNR gain of 8 dB.
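The Viterbi step in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's cost function, so everything here is an assumption made for illustration: each frame carries a list of `(f0_hz, salience)` candidates, the path score is the sum of log saliences, and `jump_penalty` is a hypothetical weight that penalizes pitch jumps measured in octaves. This is not the authors' implementation.

```python
import math


def viterbi_dominant_pitch(frames, jump_penalty=4.0):
    """Pick one pitch candidate per frame so the track is salient and smooth.

    frames: list of frames; each frame is a non-empty list of
            (f0_hz, salience) candidates with f0_hz > 0 and salience > 0.
    jump_penalty: hypothetical weight on pitch jumps (score units per octave).
    Returns the chosen f0 in Hz for every frame.
    """
    if not frames:
        return []
    # score[i]: best accumulated score of any path ending at candidate i.
    score = [math.log(s) for _, s in frames[0]]
    backptr = []  # backptr[t-1][j]: best predecessor of candidate j in frame t
    for t in range(1, len(frames)):
        prev, cur = frames[t - 1], frames[t]
        new_score, ptr = [], []
        for f_cur, s_cur in cur:
            # Transition cost grows with the pitch jump in octaves.
            options = [
                score[i] - jump_penalty * abs(math.log2(f_cur / f_prev))
                for i, (f_prev, _) in enumerate(prev)
            ]
            best = max(range(len(options)), key=options.__getitem__)
            new_score.append(options[best] + math.log(s_cur))
            ptr.append(best)
        score = new_score
        backptr.append(ptr)
    # Backtrack from the best final candidate.
    j = max(range(len(score)), key=score.__getitem__)
    path = [j]
    for ptr in reversed(backptr):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return [frames[t][i][0] for t, i in enumerate(path)]


# Made-up candidates: in the middle frame the 335 Hz candidate is more
# salient, but the smooth track near 200 Hz has the better overall score.
example_frames = [
    [(200.0, 0.9), (330.0, 0.5)],
    [(202.0, 0.6), (335.0, 0.7)],
    [(205.0, 0.8), (331.0, 0.4)],
]
print(viterbi_dominant_pitch(example_frames))  # [200.0, 202.0, 205.0]
```

A frame-by-frame maximum would pick 335 Hz in the middle frame. The continuity penalty is what keeps the decoded track on a single source, which is the property a dominant-pitch tracker needs before a later stage judges whether that pitch is speech.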

Authors: CAI Liang (蔡良); XIA Xiuyu (夏秀渝); LU Xiong (陆雄); SUN Wenhui (孙文慧) (College of Electronic and Information Engineering, Sichuan University, Chengdu 610065, China)

Affiliation: College of Electronic and Information Engineering, Sichuan University

Source: Journal of Chengdu University of Information Technology (成都信息工程大学学报), 2019, No. 1, pp. 1-6 (6 pages)

Keywords: speech enhancement; Viterbi algorithm; pitch tracking; multi-pitch extraction

Classification number: TN912.35 [Electronics and Telecommunications: Communication and Information Systems]
