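Abstract
This paper proposes a method for exploiting inter-frame correlation at the feature extraction stage. For each frame, the feature vectors of its n preceding and n following frames, together with that of the frame itself (2n+1 frames in total), are concatenated into one large feature vector. This large vector is reduced in dimension with the Karhunen-Loeve transform, and the transformed data serve as the feature vector of the current frame in subsequent training and recognition. The method was used for syllable training and recognition in a CDCPM-based speech recognition system, and the experimental results show that the Karhunen-Loeve transform performs well in feature extraction that takes inter-frame correlation into account and has broad application prospects.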
We present a novel method for incorporating temporal correlation into a speech recognition system in the feature extraction phase. Temporal correlation is considered useful for recognition because the speech features of the present frame are highly informative about the features of neighboring frames. In this paper, we combine the current frame with its adjacent frames to obtain a new feature vector, and then apply the Karhunen-Loeve transform to this vector to reduce its dimension. Finally, we take the result as the feature vector of the present frame. We have carried out many experiments on this idea in a speech recognition system based on CDCPM. The results of speaker-independent Chinese syllable recognition experiments show that applying the Karhunen-Loeve transform in the feature extraction phase, while taking temporal correlation into account, is very useful and worthy of further research.
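The stacking and dimension-reduction steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the 12-dimensional random "features", the choices n = 2 and k = 12, and the edge padding at utterance boundaries are all assumptions made for the example, and the CDCPM training and recognition stage is not shown.

```python
import numpy as np

def stack_context(frames, n):
    """Concatenate each frame with its n preceding and n following frames.

    frames: array of shape (T, d). Returns an array of shape (T, (2n+1)*d).
    Assumption: boundaries are padded by repeating the first/last frame.
    """
    T, _ = frames.shape
    padded = np.pad(frames, ((n, n), (0, 0)), mode="edge")
    # Block k of each output row holds the frame at offset (k - n).
    return np.hstack([padded[k:k + T] for k in range(2 * n + 1)])

def fit_klt(stacked, k):
    """Estimate a Karhunen-Loeve basis: the top-k covariance eigenvectors."""
    mean = stacked.mean(axis=0)
    cov = np.cov(stacked - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]       # indices of the k largest
    return mean, eigvecs[:, order]

def apply_klt(stacked, mean, basis):
    """Project centered stacked vectors onto the retained basis."""
    return (stacked - mean) @ basis

# Illustrative data only: 200 frames of hypothetical 12-dimensional features.
rng = np.random.default_rng(0)
frames = rng.standard_normal((200, 12))

stacked = stack_context(frames, n=2)            # (200, 60)
mean, basis = fit_klt(stacked, k=12)            # basis: (60, 12)
reduced = apply_klt(stacked, mean, basis)       # (200, 12)
```

In practice the basis would be estimated once on training data and then reused unchanged for test utterances, so that training and recognition see features in the same transformed space.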
Source
《模式识别与人工智能》
EI
CSCD
PKU Core (Peking University Core Journals)
1998, Issue 4, pp. 396-402 (7 pages)
Pattern Recognition and Artificial Intelligence
Keywords
Speech recognition
KL transform
Feature extraction
Speech signal processing
KL Transform, Central Distance Normal Distribution, Central Distance Continuous Probability Model, Continuous HMM