
Single-channel speech separation based on pitch state and interframe correlation
(基于基频状态和帧间相关性的单通道语音分离算法)

Cited by: 1
Abstract: A single-channel speech separation algorithm based on pitch state and interframe correlation is proposed. First, the pitches of the two simultaneously active speakers are tracked from the mixture over time and encoded as pitch states; on this basis, adaptive source-individual dictionaries are generated to distinguish the source frames by pitch. Secondly, a frequent pattern mining method is used to find frequent 1-itemsets as atoms, reducing the sizes of the dictionaries generated for the sources whose pitch state is 1. Thirdly, based on the separated sources obtained by the orthogonal matching pursuit (OMP) algorithm, mixed frames with poor separation performance are detected; each is added to the shifted separated source frame that is most correlated with it among all the shifted waveforms of adjacent separated frames, and the soft mask method is adopted to perform a second separation. Experimental results show that the proposed algorithm outperforms two classical separation methods in terms of signal-to-noise ratio (SNR), and that the frequent pattern mining step greatly reduces the computational cost of separation.
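The sparse-decomposition step the abstract describes rests on orthogonal matching pursuit over a pitch-adapted dictionary. The following is a minimal OMP sketch, not the paper's implementation: it assumes a dictionary matrix `D` with unit-norm columns and a mixture frame `x`, and all names and dimensions are illustrative.

```python
import numpy as np

def omp(D, x, n_iter):
    """Orthogonal matching pursuit: greedily approximate x as a sparse
    combination of dictionary atoms (the columns of D)."""
    residual = x.astype(float).copy()
    support = []
    coef = np.zeros(D.shape[1])
    sol = np.zeros(0)
    for _ in range(n_iter):
        # pick the atom most correlated with the current residual
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx not in support:
            support.append(idx)
        # least-squares refit of x on all atoms selected so far
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    coef[support] = sol
    return coef
```

In the paper's pipeline, each mixture frame would be decomposed over the concatenation of the two source-individual dictionaries, so every recovered atom is attributable to one speaker and the per-source partial reconstructions yield the first-pass separation.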
Source: Journal of Southeast University (Natural Science Edition), 2014, No. 6: 1099-1104 (6 pages). Indexed in: EI, CAS, CSCD, Peking University Core.
Funding: National Natural Science Foundation of China (61302152, 61201345, 61271240); Open Project of the Beijing Key Laboratory of Modern Information Science and Network Technology (XDXX1308).
Keywords: speech separation; sparse decomposition; orthogonal matching pursuit (OMP); pitch frequency; data mining
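The second-pass correction described in the abstract applies a soft (Wiener-style) mask built from the two source estimates. A minimal sketch under stated assumptions: `S1` and `S2` are magnitude-spectrum estimates of the two speakers, `X` is the mixture spectrum, and the names are illustrative rather than the paper's.

```python
import numpy as np

def soft_mask_separate(S1, S2, X, eps=1e-12):
    """Wiener-style soft mask: each time-frequency bin of the mixture X
    is shared between the sources in proportion to their estimated
    energies; eps guards against division by zero in silent bins."""
    m1 = S1**2 / (S1**2 + S2**2 + eps)
    return m1 * X, (1.0 - m1) * X
```

Because the two masked spectra sum back to `X`, no mixture energy is lost; in the paper's pipeline this correction follows the step that reinforces poorly separated frames with their most-correlated shifted neighbours.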

References (11)

1. Schmidt M N, Olsson R K. Single-channel speech separation using sparse non-negative matrix factorization [C/OL]//International Conference on Spoken Language Processing. Pittsburgh, PA, USA, 2006. http://eprints.pascal-network.org/archive/00002722/01/imm4511_01.pdf
2. Cooke M P, Barker J, Cunningham S P, et al. An audio-visual corpus for speech perception and automatic speech recognition [J]. The Journal of the Acoustical Society of America, 2006, 120(5): 2421-2424.
3. Wohlmayr M, Stark M, Pernkopf F. A probabilistic interaction model for multipitch tracking with factorial hidden Markov models [J]. IEEE Transactions on Audio, Speech, and Language Processing, 2011, 19(4): 799-810.
4. Weiss R J, Ellis D P W. Speech separation using speaker-adapted eigenvoice speech models [J]. Computer Speech & Language, 2010, 24(1): 16-29.
5. Guo H Y, Yang Z, Zhu W P. A new single-channel speech separation method based on sparse decomposition [J]. Acta Electronica Sinica, 2012, 40(4): 762-768. (in Chinese)
6. Stark M, Wohlmayr M, Pernkopf F. Source-filter-based single-channel speech separation using pitch information [J]. IEEE Transactions on Audio, Speech, and Language Processing, 2011, 19(2): 242-255.
7. Virtanen T. Monaural sound source separation by non-negative matrix factorization with temporal continuity and sparseness criteria [J]. IEEE Transactions on Audio, Speech, and Language Processing, 2007, 15(3): 1066-1074.
8. Moussallam M, Richard G, Daudet L. Audio source separation informed by redundancy with greedy multiscale decompositions [C]//Proceedings of the 20th European Signal Processing Conference. Bucharest, Romania, 2012: 2644-2648.
9. Shao Y, Srinivasan S, Jin Z, et al. A computational auditory scene analysis system for speech segregation and robust speech recognition [J]. Computer Speech & Language, 2010, 24(1): 77-93.
10. Pati Y C, Rezaiifar R, Krishnaprasad P S. Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition [C]//1993 Conference Record of the Twenty-Seventh Asilomar Conference on Signals, Systems and Computers. Pacific Grove, CA, USA, 1993: 40-44.
