Abstract
MFCC (Mel Frequency Cepstrum Coefficient) parameters are extracted frame by frame from digital speech signals. The Mel frequency scale is based on the characteristics of human hearing and has a nonlinear correspondence with frequency in Hz; MFCCs are spectral features computed from the Hz spectrum using this relationship. The MFCC of each frame is treated as a vector and quantized with two methods, self-organizing feature map neural network vector quantization and LBG vector quantization, and the two are compared experimentally. Simulation results show that the codebook produced by the self-organizing feature map algorithm has a smaller quantization error and a smaller codebook size than that of the LBG algorithm, which saves storage space. The experimental results indicate that the proposed method is practical to a certain degree.
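The nonlinear Mel-to-Hz correspondence mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which mapping the authors use; the widely used form mel = 2595 · log10(1 + f/700) is assumed here, and the function names, filter count, and frequency range are hypothetical choices made only for illustration.

```python
import math
from typing import List


def hz_to_mel(f_hz: float) -> float:
    """Map a frequency in Hz to mel (assumed formula, not taken from the paper)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m: float) -> float:
    """Inverse of hz_to_mel under the same assumed formula."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_center_frequencies(n_filters: int, f_low: float, f_high: float) -> List[float]:
    """Centre frequencies in Hz of n_filters points equally spaced on the mel axis."""
    m_low, m_high = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (m_high - m_low) / (n_filters + 1)
    return [mel_to_hz(m_low + step * (i + 1)) for i in range(n_filters)]


if __name__ == "__main__":
    # Hypothetical settings: 10 filters covering 0 to 4000 Hz.
    centers = mel_center_frequencies(10, 0.0, 4000.0)
    print([round(c) for c in centers])
```

Points that are equally spaced in mel become progressively farther apart in Hz, which is the nonlinearity an MFCC filter bank relies on when it summarizes the Hz spectrum.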
Source
Computer Technology and Development (《计算机技术与发展》)
2016, No. 9, pp. 175-177, 182 (4 pages)
Funding
Natural Science Foundation of Qinghai Province (2013-Z-920)