Abstract
A single-channel speech separation algorithm based on pitch state and interframe correlation is proposed. First, the pitches of the two simultaneously active speakers are tracked from the mixture over time and encoded as pitch states. Adaptive, source-specific dictionaries are then constructed from the encoded pitch states, so that pitch information distinguishes the sources at the dictionary level. Second, a frequent pattern mining method extracts the frequent 1-itemsets of the dictionaries whose pitch state is 1 and uses them as atoms, which reduces the dictionary size. Third, starting from the sources separated by the orthogonal matching pursuit (OMP) algorithm, mixed frames with poor separation performance are detected. Each such frame is superimposed with the shifted adjacent separated frame that is most correlated with it, and a soft mask method is applied to perform a second, corrective separation. Simulation results show that the proposed algorithm achieves a higher signal-to-noise ratio (SNR) than two existing classical separation methods, and that the frequent pattern mining step greatly reduces the computational cost.
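The OMP-based first separation and the soft-mask correction described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the dictionaries `D1` and `D2` are hypothetical pre-built, unit-norm dictionaries (the pitch-state coding, the frequent-pattern pruning, and the interframe-correlation search are all omitted), the function names are invented for illustration, and the mask is assumed to be the common power-ratio (Wiener-style) soft mask, which the abstract does not specify.

```python
import numpy as np


def omp(D, y, n_nonzero):
    """Greedy orthogonal matching pursuit.

    Approximates y with at most n_nonzero columns (atoms) of D.
    Assumes the columns of D have unit norm.
    """
    y = np.asarray(y, dtype=float)
    residual = y.copy()
    support = []                      # indices of selected atoms
    sol = np.zeros(0)
    for _ in range(n_nonzero):
        corr = np.abs(D.T @ residual)
        if support:
            corr[support] = 0.0       # never reselect an atom
        k = int(np.argmax(corr))
        if corr[k] < 1e-12:           # residual already explained
            break
        support.append(k)
        # Re-fit all selected atoms jointly (the "orthogonal" step).
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
    coef = np.zeros(D.shape[1])
    if support:
        coef[support] = sol
    return coef


def separate_frame(mix, D1, D2, n_nonzero=4):
    """Decompose one mixed frame over [D1 | D2] and split by dictionary."""
    D = np.hstack([D1, D2])
    coef = omp(D, mix, n_nonzero)
    n1 = D1.shape[1]
    return D1 @ coef[:n1], D2 @ coef[n1:]


def soft_mask(mix_spec, est1_spec, est2_spec, eps=1e-12):
    """Assumed power-ratio soft mask applied to the mixture spectrum."""
    p1 = np.abs(est1_spec) ** 2
    p2 = np.abs(est2_spec) ** 2
    m1 = p1 / (p1 + p2 + eps)
    return m1 * mix_spec, (1.0 - m1) * mix_spec
```

In this sketch a frame is first split with `separate_frame`; for frames judged to be poorly separated, the refined source estimates would be transformed to the frequency domain and passed to `soft_mask` together with the mixture spectrum. Because the two masks sum to one, the corrected outputs always add back up to the mixture.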
Source
Journal of Southeast University (Natural Science Edition)
EI
CAS
CSCD
PKU Core (Peking University core journal list)
2014, No. 6, pp. 1099-1104 (6 pages)
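Funding
National Natural Science Foundation of China (Grant Nos. 61302152, 61201345, 61271240)
Open Project of the Beijing Key Laboratory of Advanced Information Science and Network Technology (Grant No. XDXX1308)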
Keywords
speech separation
sparse decomposition
orthogonal matching pursuit (OMP)
pitch frequency
data mining