Abstract
A speaker recognition algorithm based on harmonic spectrum reconstruction of voiced speech is proposed. Using the structural characteristics of the short-time spectrum of voiced speech together with pitch information, the algorithm reconstructs the harmonic spectrum of voiced segments by sub-band weighting, compensating for the noise-induced mismatch between training and testing conditions. Perceptual linear predictive (PLP) cepstral coefficients are extracted from the reconstructed voiced spectrum and combined with pitch to form the speech feature vector of a given speaker, and each speaker is modeled with a Gaussian mixture model. Simulation results indicate that the proposed voiced-spectrum reconstruction compensates effectively for noise across many types of noisy speech. For text-independent speaker recognition, it clearly improves recognition accuracy in noisy environments, especially at low SNR, without noticeably degrading accuracy for clean speech or at high SNR.
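The final decision step named in the abstract, scoring feature vectors against per-speaker Gaussian mixture models, can be illustrated with a minimal sketch. It assumes diagonal-covariance components and a maximum average log-likelihood decision rule, neither of which is stated in the abstract. The function names, speaker labels, and toy two-dimensional vectors (standing in for PLP cepstra plus pitch) are hypothetical, and the harmonic spectrum reconstruction and model training are not shown.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of frames under one speaker's GMM.

    gmm is a list of (weight, mean, var) tuples, one per mixture component.
    """
    total = 0.0
    for x in frames:
        # Log-sum-exp over components keeps the sum numerically stable.
        terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        top = max(terms)
        total += top + math.log(sum(math.exp(t - top) for t in terms))
    return total / len(frames)

def identify_speaker(frames, models):
    """Return the speaker whose GMM gives the highest average log-likelihood."""
    return max(models, key=lambda name: gmm_log_likelihood(frames, models[name]))

# Hypothetical, hand-set models: two components each, features = [cepstrum, pitch in Hz].
models = {
    "speaker_a": [(0.5, [0.0, 100.0], [1.0, 25.0]), (0.5, [1.0, 110.0], [1.0, 25.0])],
    "speaker_b": [(0.5, [3.0, 200.0], [1.0, 25.0]), (0.5, [4.0, 210.0], [1.0, 25.0])],
}
# Made-up test frames that lie close to speaker_a's components.
test_frames = [[0.2, 102.0], [0.9, 108.0], [0.5, 105.0]]
best = identify_speaker(test_frames, models)
```

In a real system the mixture parameters would be estimated from each speaker's training features (typically with the EM algorithm) rather than set by hand as they are here.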
Source
《东南大学学报(自然科学版)》
EI
CAS
CSCD
Peking University Core Journals
2008, No. 6, pp. 935-941 (7 pages)
Journal of Southeast University (Natural Science Edition)
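Funding
National Basic Research Program of China (973 Program), Grant No. 2002CB312102
Natural Science Research Program of Jiangsu Higher Education Institutions, Grant No. 07KJD510110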
Keywords
speaker recognition
spectrum reconstruction
perceptual linear predictive cepstrum coefficient
noise compensation
spectral flatness measure