Abstract
To investigate the role of the Gaussian Mixture Model (GMM) in speaker recognition, a GMM-based speaker recognition system is designed. The system consists of four modules: audio signal pre-processing, speech activity detection, speaker modeling, and audio signal recognition. The first three modules form the model training part of the system, and the last module forms its speech recognition part. The GMM-based speech activity detector in the second module is the novel contribution of this work. Several tunable parameters and the recognition error rate of the system were tested on audio-visual meeting recordings from the Augmented Multi-party Interaction (AMI) corpus. Simulation results show that, with the help of the speech activity detector and several filtering algorithms, the system's recognition accuracy on audio signals containing overlapping speech can reach 83.02%.
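The train-then-recognize split described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes one GMM per speaker fitted with plain EM, uses synthetic 1-D values as a stand-in for real multi-dimensional acoustic features, and omits pre-processing and speech activity detection. All names (`fit_gmm_1d`, `identify`, `spk_a`, `spk_b`) are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(xs, k=2, iters=50):
    """Fit a k-component 1-D GMM with plain EM; returns (weights, means, variances)."""
    n = len(xs)
    xs_sorted = sorted(xs)
    # Initialise means at evenly spaced quantiles and share the global variance.
    means = [xs_sorted[int((i + 0.5) * n / k)] for i in range(k)]
    mean_all = sum(xs) / n
    var_all = sum((x - mean_all) ** 2 for x in xs) / n + 1e-6
    variances = [var_all] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each sample.
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            s = sum(p) or 1e-300
            resp.append([pj / s for pj in p])
        # M-step: re-estimate weights, means, and (floored) variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            variances[j] = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj + 1e-6
    return weights, means, variances


def avg_log_likelihood(xs, model):
    """Average per-frame log-likelihood of the samples under one GMM."""
    weights, means, variances = model
    total = 0.0
    for x in xs:
        p = sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
        total += math.log(max(p, 1e-300))
    return total / len(xs)


def identify(xs, speaker_models):
    """Return the speaker whose GMM assigns the highest likelihood to the samples."""
    return max(speaker_models, key=lambda name: avg_log_likelihood(xs, speaker_models[name]))


def synth(rng, centres, n):
    """Synthetic stand-in for one speaker's features (assumption, not real audio)."""
    return [rng.gauss(rng.choice(centres), 1.0) for _ in range(n)]


rng = random.Random(0)
# Training part: one model per (hypothetical) speaker.
speaker_models = {
    "spk_a": fit_gmm_1d(synth(rng, [0.0, 4.0], 300)),
    "spk_b": fit_gmm_1d(synth(rng, [10.0, 15.0], 300)),
}
# Recognition part: score unseen frames against every speaker model.
test_frames = synth(rng, [10.0, 15.0], 100)
print(identify(test_frames, speaker_models))
```

In a fuller system the same likelihood comparison could, in principle, also separate speech from non-speech frames, which is one plausible reading of a GMM-based speech activity detector; the abstract does not specify how the paper's detector is built.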
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journal (北大核心)
2011, No. 11, pp. 114-117 (4 pages)
Funding
Natural Science Foundation of Gansu Province (No. 1010RJZA046)
Graduate Supervisor Fund of the Education Department of Gansu Province (No. 0914ZTB003)
Keywords
Gaussian Mixture Model (GMM)
speech activity detection
recognition error rate