Abstract
Speaker clustering addresses the problem of grouping the speech utterances in a recording by the identity of their speaker. This paper proposes a clustering algorithm based on a combination of two distance measures, the Generalized Likelihood Ratio (GLR) and the Normalized Cross Likelihood Ratio (NCLR). The algorithm first extracts Mel Frequency Cepstral Coefficient (MFCC) features from each speech segment and models each segment with a Gaussian Mixture Model (GMM). It then clusters the segments with a hierarchical scheme driven by the combined GLR and NCLR distance. The Bayesian Information Criterion (BIC) serves as the termination criterion. Experimental results show that the algorithm improves overall system performance and handles unsupervised speaker clustering well: combining the two distance measures improves system performance by 6% over using either measure alone. In addition, by changing how inter-cluster distances are updated, clustering runs 6 times faster than the traditional GMM clustering method.
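The two distance measures named in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not from the paper: each segment is modeled by a single full-covariance Gaussian rather than a GMM, the function names (`fit_gauss`, `loglik`, `glr`, `nclr`) are hypothetical, and the formulas follow the commonly used definitions of GLR and NCLR. The abstract does not say how the two measures are combined or how the inter-cluster distances are updated, so neither is shown.

```python
import numpy as np

def fit_gauss(X):
    """ML fit of one full-covariance Gaussian (a stand-in for the paper's GMM).

    X has shape (frames, dims) with dims >= 2.
    """
    mean = X.mean(axis=0)
    # A small diagonal term keeps the covariance invertible.
    cov = np.cov(X, rowvar=False, bias=True) + 1e-6 * np.eye(X.shape[1])
    return mean, cov

def loglik(X, model):
    """Total log-likelihood of the frames X under a Gaussian model."""
    mean, cov = model
    d = X.shape[1]
    diff = X - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return float(-0.5 * np.sum(d * np.log(2 * np.pi) + logdet + maha))

def glr(Xa, Xb):
    """GLR distance: log-likelihood gained by two separate models over one shared model."""
    Xab = np.vstack([Xa, Xb])
    return (loglik(Xa, fit_gauss(Xa)) + loglik(Xb, fit_gauss(Xb))
            - loglik(Xab, fit_gauss(Xab)))

def nclr(Xa, Xb):
    """NCLR distance: per-frame log-likelihood lost when each segment is scored by the other's model."""
    Ma, Mb = fit_gauss(Xa), fit_gauss(Xb)
    return ((loglik(Xa, Ma) - loglik(Xa, Mb)) / len(Xa)
            + (loglik(Xb, Mb) - loglik(Xb, Ma)) / len(Xb))
```

In a hierarchical scheme of the kind the abstract describes, the pair of clusters with the smallest distance would be merged at each step until the BIC-based stopping rule fires. In this sketch, both distances are small for segments drawn from the same distribution and grow as the distributions diverge.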
Source
《小型微型计算机系统》
CSCD
Peking University Core Journals (北大核心)
2015, Issue 10, pp. 2369-2373 (5 pages)
Journal of Chinese Computer Systems
Funding
Supported by the National High Technology Research and Development Program of China (863 Program), grant 2014AA015104
Keywords
speaker clustering
Generalized Likelihood Ratio (GLR)
Normalized Cross Likelihood Ratio (NCLR)
Bayesian Information Criterion (BIC)