A separation method of singing and accompaniment combining discriminative training deep neural network (cited by: 2)

Abstract: To address the difficulty of separating singing from accompaniment in music signals, an improved music separation method based on a discriminatively trained deep neural network (DNN) was proposed. First, building on the DNN model, an improved objective function that accounts for both the reconstruction errors and the discriminative information between singing and accompaniment was used for discriminative training. Then, an additional layer was added to the DNN model to introduce time-frequency masking, which optimizes the estimated accompaniment of the song, and the corresponding time-domain signal was obtained by the inverse Fourier transform. Finally, the influence of different parameters on separation performance was examined, and the method was compared with existing music separation methods. The experimental results showed that the improved objective function and the introduction of time-frequency masking significantly improved the separation performance of the DNN, and that separation performance improved by about 4 dB over other existing music separation methods, verifying that the proposed method is an effective music separation algorithm.
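
The two ideas named in the abstract, a discriminative objective and a time-frequency masking layer, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract gives no formulas, so the loss below uses one form that is common in the DNN separation literature (reconstruction error minus a weighted cross-source error), and the mask is a plain soft ratio mask. The function names, the weight `gamma`, and the flat-list representation of magnitude spectra are all hypothetical.

```python
def soft_mask(est_voice, est_accomp, mixture, eps=1e-12):
    """Apply a soft time-frequency mask to the mixture magnitudes.

    Each argument is a flat list of non-negative magnitudes, one value per
    time-frequency bin. The two returned estimates add up to the mixture
    in every bin.
    """
    voice, accomp = [], []
    for v, a, x in zip(est_voice, est_accomp, mixture):
        m = v / (v + a + eps)  # share of this bin assigned to the voice
        voice.append(m * x)
        accomp.append((1.0 - m) * x)
    return voice, accomp


def squared_error(p, q):
    """Sum of squared differences between two equal-length lists."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def discriminative_loss(est_voice, est_accomp, ref_voice, ref_accomp, gamma=0.05):
    """Reconstruction error minus gamma times the cross-source error.

    The cross term rewards estimates that stay far from the *other* source.
    gamma is an assumed weight, not a value taken from the paper.
    """
    recon = squared_error(est_voice, ref_voice) + squared_error(est_accomp, ref_accomp)
    cross = squared_error(est_voice, ref_accomp) + squared_error(est_accomp, ref_voice)
    return recon - gamma * cross


# Usage: two time-frequency bins of raw network output, then masking.
voice, accomp = soft_mask([3.0, 1.0], [1.0, 1.0], [1.0, 2.0])
loss = discriminative_loss(voice, accomp, [0.8, 1.0], [0.2, 1.0])
```

Because the mask is applied to the mixture itself, the two masked estimates always sum to the mixture. Setting `gamma` to 0 reduces the objective to a plain reconstruction loss.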

Authors: ZHANG Tianqi, XIONG Mei, ZHANG Ting, YANG Qiang

Affiliation: Chongqing Key Laboratory of Signal and Information Processing

Source: Chinese Journal of Acoustics (声学学报（英文版）), CSCD, 2019, No. 2, pp. 227-239 (13 pages)

Funding: Supported by the National Natural Science Foundation of China (61671095, 61371164, 61702065, 61701067, 61771085), the Project of Key Laboratory of Signal and Information Processing of Chongqing (CSTC2009CA2003), the Chongqing Graduate Research and Innovation Project (CYS17219), and the Research Project of Chongqing Educational Commission (KJ130524, KJ1600427, KJ1600429).

Keywords: DNN; separation of singing and accompaniment; discriminative training; deep neural network

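Classification: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]; TN912.3 [Automation and Computer Technology: Control Science and Engineering]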