Abstract
With the development of hidden Markov model (HMM) based speech synthesis, impostors can easily generate synthetic speech that carries a target speaker's characteristics, which poses a serious threat to existing speaker recognition systems. To address this problem, this paper analyzes, from a statistical perspective, how natural speech and synthetic speech differ in the real cepstrum, and proposes a speaker recognition system that is robust against synthetic speech. Preliminary experimental results show that, compared with a conventional speaker recognition system, the proposed system reduces the false accept rate (FAR) for synthetic speech from 99.2% to 0 while leaving the equal error rate (EER) for natural speech unchanged.
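For readers unfamiliar with the feature, the real cepstrum named in the abstract is conventionally defined as the inverse discrete Fourier transform of the log magnitude spectrum of a speech frame. The following is a minimal sketch of that textbook definition only; it is not the authors' detection method, which this record does not describe. The function names `dft` and `real_cepstrum` and the `floor` parameter are hypothetical, and a naive O(N^2) transform is used so the example needs only the Python standard library.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def real_cepstrum(frame, floor=1e-12):
    """Real cepstrum: inverse DFT of the log magnitude spectrum.

    `floor` (an assumption of this sketch) keeps log() away from zero
    when a spectral bin has zero magnitude.
    """
    n = len(frame)
    log_mag = [math.log(max(abs(v), floor)) for v in dft(frame)]
    # For a real-valued frame the log magnitude spectrum is real and even,
    # so its inverse DFT is real and equals the real part of the forward
    # DFT divided by n.
    return [v.real / n for v in dft(log_mag)]
```

As a sanity check, a scaled unit impulse such as `[2.0, 0, 0, 0, 0, 0, 0, 0]` has a flat magnitude spectrum of 2, so its real cepstrum is `ln 2` at index 0 and zero elsewhere. In practice a speech signal would first be split into windowed frames, and an FFT routine would replace the naive transform.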
Source
Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》)
EI
CSCD
PKU Core (北大核心)
2011, No. 6, pp. 743-747 (5 pages)
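Funding
Supported by the National Natural Science Foundation of China (No. 60970161)
and the Fundamental Research Funds for the Central Universities (No. XD2100060001)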
Keywords
Speaker Recognition
Synthetic Speech
Real Part of Cepstrum