Abstract
This paper establishes a simplified model for studying the number of states in the Hidden Markov Model (HMM) and examines the choice of state number from the viewpoint of information theory, arriving at three conclusions about the information entropy of an HMM. The results show that the information entropy of an HMM consists of an intrinsic entropy and an additional entropy, where the additional entropy is in turn composed of a positive additional entropy and a negative overlapping additional entropy. For a given degree of overlap, the additional entropy tends to zero as the number of states increases, so the information entropy of the HMM gradually approaches the intrinsic entropy. In view of this trend, the paper concludes that in speech recognition more HMM states are not always better, and that about six states per HMM is the preferred choice for isolated Chinese syllables.
Source
《计算机工程与应用》
CSCD
Peking University Core Journal (北大核心)
2000, No. 1, pp. 67-69, 133 (4 pages)
Computer Engineering and Applications
Keywords
Speech Recognition
Hidden Markov Model
Information Entropy