Abstract
To remove the dependence of traditional speech enhancement algorithms on assumptions about the speech and noise signals, and to improve enhancement performance, this paper builds on deep neural network (DNN) based speech enhancement and proposes a time-frequency mask estimation method that combines noise classification with convolutional neural networks (CNN). The algorithm takes into account that noisy signals mixing several types of noise degrade the prediction accuracy of a trained CNN to varying degrees. It therefore identifies the noise type through noise classification, adapts the time-frequency mask estimation to speech corrupted by each type of noise, and uses voice activity detection (VAD) to post-refine the predicted mask. Experimental results show that the proposed algorithm achieves a larger signal-to-noise ratio (SNR) gain in a variety of noise environments.
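The pipeline summarized in the abstract (classify the noise, pick a mask estimator for that noise type, post-refine the mask with VAD, apply it) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the names `energy_vad`, `enhance`, `classify_noise`, and `mask_estimators` are hypothetical, the per-noise CNNs are replaced by arbitrary callables, and both the energy-threshold VAD and the rule of flooring the mask in non-speech frames are assumptions, since the abstract does not specify them.

```python
import numpy as np


def energy_vad(noisy_mag, threshold_db=-40.0):
    """Assumed stand-in VAD: a frame counts as speech if its energy is
    within `threshold_db` of the loudest frame in the utterance."""
    frame_energy = np.sum(noisy_mag ** 2, axis=0)  # shape: (frames,)
    rel = frame_energy / (frame_energy.max() + 1e-12)
    energy_db = 10.0 * np.log10(rel + 1e-12)
    return energy_db > threshold_db


def enhance(noisy_mag, classify_noise, mask_estimators, vad=energy_vad, floor=0.0):
    """Hypothetical pipeline over a magnitude spectrogram (freq_bins, frames).

    classify_noise:  callable returning a noise-type key (stand-in classifier).
    mask_estimators: dict mapping noise-type key -> callable that returns a
                     time-frequency mask (stand-in for a per-noise CNN).
    """
    # 1. Noise classification selects which estimator to use.
    noise_type = classify_noise(noisy_mag)
    # 2. Noise-specific mask estimation, clipped to the valid range [0, 1].
    mask = np.clip(mask_estimators[noise_type](noisy_mag), 0.0, 1.0)
    # 3. Assumed VAD post-refinement: floor the mask in non-speech frames.
    speech_frames = vad(noisy_mag)
    mask = np.where(speech_frames[np.newaxis, :], mask, floor)
    # 4. Apply the refined mask to the noisy magnitudes.
    return mask * noisy_mag, noise_type
```

In practice each entry of `mask_estimators` would wrap a model trained on speech corrupted by one noise type, and the enhanced magnitudes would be combined with the noisy phase before an inverse short-time Fourier transform.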
Authors
凌佳佳
袁晓兵
LING Jia-jia; YUAN Xiao-bing (Science and Technology on Microsystem Laboratory, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China; School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China; University of Chinese Academy of Sciences, Beijing 100049, China)
Source
《电子设计工程》
2018, No. 17, pp. 30-34 (5 pages)
Electronic Design Engineering
Keywords
speech enhancement
time-frequency mask
convolutional neural networks
noise classification