Abstract
Speech denoising based on nonnegative matrix factorization (NMF) can raise the signal-to-noise ratio (SNR) of the recovered speech, but it also introduces speech distortion, which degrades the performance of speaker verification systems in noisy environments. This paper proposes a speech denoising method based on nonnegative matrix factorization with partial constraints (PCNMF), with the aim of improving the robustness of speaker verification under unknown and non-stationary noise. PCNMF constructs the speech dictionary and the noise dictionary separately, subject to the partition constraints. Because a speech dictionary obtained by conventional training on speech usually contains some noise components, PCNMF instead generates the spectra of the pitch and its harmonics from a mathematical model and uses these spectra to imitate the formant structure of the human voice when synthesizing the dictionary, which keeps the speech dictionary clean. To reduce the loss of noise information caused by conventional noise dictionary construction, PCNMF applies framing and the short-time Fourier transform (STFT) to noise samples separated online, and then generates the noise dictionary as linear combinations of the resulting spectrum frames. The evaluation covers multiple noise types, and the results show that PCNMF effectively improves the robustness of the speaker verification system; in particular, under unknown and non-stationary noise its equal error rate is on average 5.2% lower than that of the baseline (Multi-Condition) system.
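The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the cost function, the form of the partition constraints, or the formant model, so everything below is an assumption. The function names (`harmonic_atoms`, `noise_atoms`, `nmf_activations`, `denoise`) are hypothetical, the 1/h harmonic decay is a stand-in for a formant envelope, the updates are the standard multiplicative rule for the Euclidean cost with the dictionary held fixed, and the inputs are assumed to be magnitude spectrograms already obtained by framing and STFT.

```python
import numpy as np

def harmonic_atoms(f0_list, n_bins, sr, n_harmonics=10, width_hz=40.0):
    """One synthetic speech atom per pitch: Gaussian peaks at f0 and its harmonics."""
    freqs = np.linspace(0.0, sr / 2.0, n_bins)
    atoms = []
    for f0 in f0_list:
        atom = np.zeros(n_bins)
        for h in range(1, n_harmonics + 1):
            fh = h * f0
            if fh > sr / 2.0:
                break
            # Assumption: 1/h decay stands in for a real formant envelope.
            atom += (1.0 / h) * np.exp(-0.5 * ((freqs - fh) / width_hz) ** 2)
        atoms.append(atom / (atom.sum() + 1e-12))
    return np.stack(atoms, axis=1)

def noise_atoms(noise_mag, n_atoms, rng):
    """Noise atoms as random nonnegative linear combinations of noise spectrum frames."""
    weights = rng.random((noise_mag.shape[1], n_atoms))
    W = noise_mag @ weights
    return W / (W.sum(axis=0, keepdims=True) + 1e-12)

def nmf_activations(V, W, n_iter=200, eps=1e-12):
    """Estimate H >= 0 for V ~ W @ H with W fixed (multiplicative updates, Euclidean cost)."""
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1]))
    WtV = W.T @ V
    WtW = W.T @ W
    for _ in range(n_iter):
        H *= WtV / (WtW @ H + eps)
    return H

def denoise(V, W_speech, W_noise):
    """Split V over the joint dictionary and keep the speech part via a soft mask."""
    W = np.concatenate([W_speech, W_noise], axis=1)
    H = nmf_activations(V, W)
    k = W_speech.shape[1]
    S = W_speech @ H[:k]   # speech estimate
    N = W_noise @ H[k:]    # noise estimate
    mask = S / (S + N + 1e-12)
    return mask * V
```

Keeping both dictionaries fixed and updating only the activations is what lets the speech and noise parts be separated by row blocks of `H`; the soft mask then guarantees the enhanced magnitude never exceeds the noisy input.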
Authors
ZHANG Er-hua (张二华)
WANG Ming-he (王明合)
TANG Zhen-min (唐振民)
School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
Source
Acta Electronica Sinica (《电子学报》)
EI
CAS
CSCD
Peking University Core Journals
2019, No. 6, pp. 1244-1250 (7 pages)
Funding
National Natural Science Foundation of China (No. 61473154)
Keywords
speech processing
speaker verification
nonnegative matrix factorization
additive noise