Abstract
In mobile communication, speech recognition, voice-based interaction, and related fields, captured speech signals are often mixed with noise that has a harmonic structure, so speech enhancement has significant practical value. Most speech energy is concentrated in voiced segments, and voiced speech has a harmonic structure. Based on the approximate sparsity of real mixed sounds in the time-frequency domain, this paper proposes a speech enhancement algorithm based on pitch tracking, which uses pitch features to restore the harmonic structure of speech as far as possible while suppressing noise energy, thereby improving the speech signal-to-noise ratio (SNR). First, the mixed sound stream is segmented and the voiced segments are extracted. Next, multi-pitch extraction is performed on each voiced segment, and Viterbi decoding is used to find the dominant pitch. A back-propagation (BP) neural network then decides whether the dominant pitch belongs to a human voice. Finally, a comb filter reconstructs the voiced speech or suppresses the interfering sound. Simulation experiments show that the algorithm can extract speech from audio mixed with music and background noise, with an average speech SNR gain of 8 dB.
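The dominant-pitch step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the cost functions, so the emission score (log of normalized candidate salience), the transition cost (a penalty proportional to the pitch jump in octaves), and the names `track_dominant_pitch` and `jump_penalty` are all assumptions made for illustration. The input is a set of K pitch candidates per frame, such as a multi-pitch extractor might produce.

```python
import numpy as np

def track_dominant_pitch(candidates, saliences, jump_penalty=2.0):
    """Choose one pitch candidate per frame with Viterbi decoding.

    candidates:   (T, K) candidate F0 values in Hz for each frame.
    saliences:    (T, K) strictly positive strength of each candidate.
    jump_penalty: assumed cost per octave of pitch change between frames.
    """
    candidates = np.asarray(candidates, dtype=float)
    saliences = np.asarray(saliences, dtype=float)
    T, K = candidates.shape
    log_f0 = np.log2(candidates)
    # Emission score: log of the salience normalized within each frame.
    emit = np.log(saliences / saliences.sum(axis=1, keepdims=True))

    score = np.empty((T, K))
    back = np.zeros((T, K), dtype=int)
    score[0] = emit[0]
    for t in range(1, T):
        # trans[i, j]: score for moving from candidate i (frame t-1)
        # to candidate j (frame t); large jumps are penalized.
        trans = -jump_penalty * np.abs(log_f0[t - 1][:, None] - log_f0[t][None, :])
        total = score[t - 1][:, None] + trans
        back[t] = total.argmax(axis=0)
        score[t] = total.max(axis=0) + emit[t]

    # Backtrack from the best final state to recover the full path.
    path = np.empty(T, dtype=int)
    path[-1] = score[-1].argmax()
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return candidates[np.arange(T), path]
```

With a penalty of this kind, a smooth contour is preferred over a slightly more salient candidate that would require a large pitch jump, which is the usual reason for decoding the whole sequence instead of picking the strongest candidate frame by frame.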
Authors
蔡良
夏秀渝
陆雄
孙文慧
CAI Liang; XIA Xiuyu; LU Xiong; SUN Wenhui (College of Electronic and Information Engineering, Sichuan University, Chengdu 610065, China)
Source
《成都信息工程大学学报》
2019, Issue 1, pp. 1-6 (6 pages)
Journal of Chengdu University of Information Technology
Keywords
speech enhancement
Viterbi algorithm
pitch tracking
multi-pitch extraction