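Abstract
For HMMs that take state duration into account, an improved Viterbi algorithm is designed on the basis of nonlinear dynamic programming, and a speech recognition procedure that combines the Viterbi algorithm with K-means clustering is given. Finally, recognition experiments on 50 Chinese syllables are carried out with a standard HMM and with an HMM that accounts for state duration, each with its corresponding Viterbi algorithm. The results show that, when state duration is considered and the improved Viterbi algorithm is applied, training is somewhat slower, but recognition speed is almost the same and the misrecognition rate is clearly lower.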
Vaseghi's treatment of state duration [3] is, in our opinion, unreasonable in one important respect, which is rather involved. Section 2 describes how this respect should be changed to make it reasonable. Like Vaseghi, we use Eq. (7) to calculate the transition probability a_ij(d_i), but we differ from Vaseghi on how a_ij(d_i) should be used when state duration is taken into account. More importantly, Vaseghi took the duration of a given state s_i at a given time to be a fixed value, whereas we consider that the speech vector can move along any of many possible paths, so the state duration can take many different values. Our view requires Eqs. (10) through (17) in Section 2 to be applied in full. Section 3 introduces the training and recognition process that combines the improved Viterbi algorithm with K-means clustering. Finally, recognition experiments on 50 Chinese syllables are carried out with the standard and the improved Viterbi algorithms respectively. The results show that, with the improved algorithm, training is slower, but recognition speed is almost the same and the recognition error rate is noticeably reduced.
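The idea that a state's duration is not fixed but depends on the path taken can be illustrated with a small sketch. This is not the paper's algorithm: Eq. (7) and Eqs. (10) through (17) are not reproduced in this abstract, so the sketch simply assumes a caller-supplied function `log_trans(i, j, d)` standing in for log a_ij(d_i), and keeps one Viterbi score per (state, duration) pair so that every possible duration is tracked. The names `duration_viterbi`, `log_trans`, `log_emit`, and `max_dur`, and the toy numbers, are all hypothetical.

```python
import math

NEG_INF = float("-inf")

def duration_viterbi(log_init, log_trans, log_emit, max_dur):
    """Viterbi search over (state, duration) pairs.

    log_init[i]        : log probability of starting in state i
    log_trans(i, j, d) : assumed stand-in for log a_ij(d_i); d is the number
                         of consecutive frames already spent in state i
    log_emit[t][i]     : log likelihood of frame t under state i
    max_dur            : longest duration tracked (longer stays are disallowed)
    Returns (best log score, best state path).
    """
    T, N = len(log_emit), len(log_init)
    # delta[i][d-1]: best score of a path now in state i with duration d
    delta = [[NEG_INF] * max_dur for _ in range(N)]
    for i in range(N):
        delta[i][0] = log_init[i] + log_emit[0][i]
    back = []  # one table of (previous state, previous duration) per frame
    for t in range(1, T):
        new = [[NEG_INF] * max_dur for _ in range(N)]
        ptr = [[None] * max_dur for _ in range(N)]
        for i in range(N):
            for d in range(1, max_dur + 1):
                score = delta[i][d - 1]
                if score == NEG_INF:
                    continue
                if d < max_dur:  # stay in i: duration grows to d + 1
                    cand = score + log_trans(i, i, d) + log_emit[t][i]
                    if cand > new[i][d]:
                        new[i][d], ptr[i][d] = cand, (i, d)
                for j in range(N):  # leave i for j: duration resets to 1
                    if j == i:
                        continue
                    cand = score + log_trans(i, j, d) + log_emit[t][j]
                    if cand > new[j][0]:
                        new[j][0], ptr[j][0] = cand, (i, d)
        delta = new
        back.append(ptr)
    # Pick the best final (state, duration) pair, then trace back.
    best, bi, bd = NEG_INF, 0, 1
    for i in range(N):
        for d in range(1, max_dur + 1):
            if delta[i][d - 1] > best:
                best, bi, bd = delta[i][d - 1], i, d
    if best == NEG_INF:
        return NEG_INF, []  # no path with non-zero probability
    path, i, d = [bi], bi, bd
    for ptr in reversed(back):
        i, d = ptr[i][d - 1]
        path.append(i)
    path.reverse()
    return best, path

# Toy two-state, left-to-right model (made-up numbers, for illustration only):
# state 0 may not be left before it has lasted 2 frames.
def toy_log_trans(i, j, d):
    if i == 0:
        if d < 2:
            return 0.0 if j == 0 else NEG_INF
        return math.log(0.5)
    return 0.0 if j == 1 else NEG_INF

toy_log_emit = [[math.log(0.9), math.log(0.1)]] + [[math.log(0.2), math.log(0.8)]] * 3
score, path = duration_viterbi([0.0, NEG_INF], toy_log_trans, toy_log_emit, max_dur=4)
print(path, score)
```

In the toy run, frame 1 fits state 1 better, but the duration-dependent transition keeps the path in state 0 until it has lasted two frames. Tracking every duration multiplies the search space by `max_dur`, which makes this naive version slower than standard Viterbi; it is a readable baseline only, and the abstract's cost figures refer to the authors' own formulation, not to this sketch.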
Source
《西北工业大学学报》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2000, No. 4, pp. 595-599 (5 pages)
Journal of Northwestern Polytechnical University