Speaker Clustering Algorithm Based on Two Distance Metrics (结合两种距离测度的说话人聚类算法)

Cited by: 1

Abstract: Speaker clustering addresses the problem of grouping the speech segments in a recording by the identity of the speaker. This paper proposes a clustering algorithm that combines two distance metrics, the Generalized Likelihood Ratio (GLR) and the Normalized Cross Likelihood Ratio (NCLR). The algorithm first extracts Mel Frequency Cepstral Coefficient (MFCC) features from each speech segment and models them with a Gaussian Mixture Model (GMM). It then clusters the segments with a hierarchical scheme based on the combined GLR and NCLR distance. The Bayesian Information Criterion (BIC) serves as the termination criterion. Experimental results show that the algorithm improves overall system performance and handles unsupervised speaker clustering well: combining the two metrics improves system performance by 6% over using either metric alone. In addition, an improved way of updating the inter-cluster distances makes clustering 6 times faster than the traditional GMM clustering method.
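The pipeline in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: each segment is modelled by a single full-covariance Gaussian instead of a GMM, MFCC frames are already available as NumPy arrays of shape (frames, dimensions), and all function names and the `lam` penalty weight are hypothetical. The abstract does not state how GLR and NCLR are fused, so the two distances are shown separately and the merge loop takes either one as a parameter. The loop also recomputes every pairwise distance at each step, which is not the faster update scheme the abstract mentions.

```python
import numpy as np

def fit_gaussian(x, eps=1e-6):
    """Mean and full covariance of frames x; eps keeps the covariance invertible."""
    mu = x.mean(axis=0)
    centered = x - mu
    cov = centered.T @ centered / len(x) + eps * np.eye(x.shape[1])
    return mu, cov

def avg_loglik(x, mu, cov):
    """Average per-frame log-likelihood of x under N(mu, cov)."""
    dim = x.shape[1]
    diff = x - mu
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return float(np.mean(-0.5 * (dim * np.log(2 * np.pi) + logdet + maha)))

def glr_distance(x, y):
    """-log GLR: likelihood lost by modelling x and y with one shared model."""
    z = np.vstack([x, y])
    ll_x = len(x) * avg_loglik(x, *fit_gaussian(x))
    ll_y = len(y) * avg_loglik(y, *fit_gaussian(y))
    ll_z = len(z) * avg_loglik(z, *fit_gaussian(z))
    return ll_x + ll_y - ll_z

def nclr_distance(x, y):
    """NCLR: length-normalized loss when each segment is scored by the other's model."""
    mx, my = fit_gaussian(x), fit_gaussian(y)
    return (avg_loglik(x, *mx) - avg_loglik(x, *my)
            + avg_loglik(y, *my) - avg_loglik(y, *mx))

def delta_bic(x, y, lam=1.0):
    """Positive values favor keeping x and y as separate clusters."""
    dim = x.shape[1]
    n_params = dim + dim * (dim + 1) / 2  # mean plus symmetric covariance
    penalty = 0.5 * lam * n_params * np.log(len(x) + len(y))
    return glr_distance(x, y) - penalty

def cluster(segments, distance=glr_distance, lam=1.0):
    """Bottom-up merging of the closest pair until BIC rejects the merge."""
    clusters = [np.asarray(s) for s in segments]
    labels = [[i] for i in range(len(segments))]
    while len(clusters) > 1:
        pairs = [(distance(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        _, i, j = min(pairs)
        if delta_bic(clusters[i], clusters[j], lam) > 0:
            break  # the closest pair looks like two speakers, so stop
        clusters[i] = np.vstack([clusters[i], clusters[j]])
        labels[i] += labels[j]
        del clusters[j], labels[j]
    return labels
```

Calling `cluster(segments)` returns one list of segment indices per hypothesized speaker. Passing `distance=nclr_distance` changes only the order in which pairs are considered; the stopping test in this sketch always uses the GLR-based BIC difference.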

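Authors: Chen Yuetong (陈玥同), Liu Xueliang (刘学亮)

Affiliation: School of Computer and Information, Hefei University of Technology (合肥工业大学计算机与信息学院)

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2015, No. 10, pp. 2369-2373 (5 pages)

Funding: Supported by the National High Technology Research and Development Program of China (863 Program), grant 2014AA015104

Keywords: speaker clustering; Generalized Likelihood Ratio (GLR); Normalized Cross Likelihood Ratio (NCLR); Bayesian Information Criterion (BIC)

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]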
