Abstract
In speaker verification systems based on the total variability factor (i-vector), speaker-discriminative information must be further extracted from the i-vector representation of each utterance to improve performance. Drawing on the idea of the anchor model, this paper proposes a modeling method based on deep belief networks. The method analyzes and models the complex variability contained in i-vectors layer by layer, and extracts the speaker-related information through non-linear transformations. On the telephone-training, telephone-test condition of the NIST SRE 2008 core test, the method achieves equal error rates (EER) of 4.96% for male trials and 6.18% for female trials. Fusing it with a linear discriminant analysis based system further reduces the EER to 4.74% and 5.35%, respectively.
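The equal error rate quoted in the abstract is the operating point at which the miss rate (target trials rejected) equals the false-alarm rate (non-target trials accepted). The following is a minimal sketch of how such a figure could be estimated from trial scores. It is not the paper's evaluation code: the function name `equal_error_rate` and the example scores are hypothetical, and the threshold sweep is a simple approximation.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER (as a fraction in [0, 1]) by sweeping thresholds.

    Hypothetical helper for illustration; a trial is accepted when its
    score is greater than or equal to the threshold.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # Miss rate: target trials whose score falls below the threshold.
        miss = sum(s < t for s in target_scores) / len(target_scores)
        # False-alarm rate: non-target trials scoring at or above the threshold.
        false_alarm = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - false_alarm)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (miss + false_alarm) / 2
    return eer


# Made-up scores, for illustration only (not data from the paper).
targets = [0.9, 0.8, 0.7, 0.3]
nontargets = [0.6, 0.4, 0.2, 0.1]
print(equal_error_rate(targets, nontargets))  # 0.25
```

With few trials the two error rates may never coincide exactly, so the sketch reports their average at the closest threshold. Evaluation toolkits typically interpolate along the error curve instead.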
Source
《模式识别与人工智能》
EI
CSCD
Peking University Core Journals
2013, No. 12, pp. 1089-1095 (7 pages)
Pattern Recognition and Artificial Intelligence
Funding
National Natural Science Foundation of China (No. 61273264)
National 973 Program preliminary research special project (No. 2012CB326405)