Abstract
Directly concatenating Linear Prediction Coefficients (LPC) and Mel Frequency Cepstrum Coefficients (MFCC) for speaker recognition increases both the feature dimension and the computational load. To address this, a feature extraction method that integrates the LPC parameters into the MFCC computation is proposed. First, the LPC coefficients of the speech signal are calculated and the LPC power spectrum is derived from them. Second, the LPC power spectrum is passed through a triangular filter bank and the logarithm of the filter outputs is taken. Finally, a discrete cosine transform is applied to the log outputs, yielding a new feature, the Linear Prediction Mel Frequency Cepstrum Coefficient (LPMFCC). LPMFCC combines the vocal tract characteristics of LPC with the auditory characteristics of MFCC. Although it adds one computation step, it does not increase the feature dimension, and the computational cost remains relatively low. Experimental results show that, on clean speech, the speaker recognition rate with LPMFCC is 18.57% and 10% higher than with LPC and MFCC, respectively; under various noise conditions, the improvements are 13.22% and 4.55%, respectively.
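The three steps in the abstract can be sketched in Python with NumPy. This is a minimal illustration and not the authors' implementation: the abstract does not specify the LPC estimation method, window, sampling rate, LPC order, number of filters, FFT size, or number of cepstral coefficients, so the autocorrelation method with Levinson-Durbin recursion, the Hamming window, and all default values below are assumptions, and the function names are hypothetical.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """LPC via the autocorrelation method and Levinson-Durbin (assumed method).
    Returns (a, err) with a[0] == 1 and err the prediction error power."""
    n = len(frame)
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0] + 1e-12  # small floor avoids division by zero on silent frames
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err = max(err * (1.0 - k * k), 1e-12)
    return a, err

def lpc_power_spectrum(a, err, n_fft):
    """Step 1: LPC power spectrum err / |A(e^{jw})|^2 on n_fft // 2 + 1 bins."""
    A = np.fft.rfft(a, n_fft)
    return err / (np.abs(A) ** 2 + 1e-12)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced uniformly on the mel scale."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mels = np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):       # rising edge
            fbank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):      # falling edge
            fbank[m - 1, k] = (right - k) / (right - center)
    return fbank

def dct_ii(x, n_out):
    """Unnormalized DCT-II, keeping the first n_out coefficients."""
    n = len(x)
    k = np.arange(n_out)[:, None]
    m = np.arange(n)[None, :]
    return np.cos(np.pi * k * (2 * m + 1) / (2 * n)) @ x

def lpmfcc(frame, sample_rate, lpc_order=12, n_filters=24, n_ceps=12, n_fft=512):
    """LPMFCC for one frame. All defaults are illustrative assumptions."""
    windowed = frame * np.hamming(len(frame))           # assumed window
    a, err = lpc_coefficients(windowed, lpc_order)
    power = lpc_power_spectrum(a, err, n_fft)           # step 1
    energies = mel_filterbank(n_filters, n_fft, sample_rate) @ power
    log_energies = np.log(energies + 1e-12)             # step 2
    return dct_ii(log_energies, n_ceps)                 # step 3 (keeps c0)

if __name__ == "__main__":
    # Synthetic 25 ms frame at 16 kHz: a 200 Hz tone plus a little noise.
    rng = np.random.default_rng(0)
    t = np.arange(400) / 16000.0
    frame = np.sin(2 * np.pi * 200.0 * t) + 0.1 * rng.standard_normal(400)
    print(lpmfcc(frame, 16000).shape)
```

The output length is set by `n_ceps` alone, so changing `lpc_order` does not change the feature dimension. This matches the abstract's point that LPMFCC avoids the dimension growth of concatenating LPC and MFCC vectors.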
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journal (北大核心)
2015, Issue A02, pp. 242-244 (3 pages)
Funding
National Natural Science Foundation of China (60972147)
Keywords
speaker recognition
Mel frequency cepstrum coefficient
linear prediction coefficient
vector quantization (VQ)
Gaussian mixture model (GMM)