Abstract
Modern Chinese syntax, unlike English syntax, is markedly complex in two respects: a complete rule set is hard to obtain, and parsing a whole sentence at once yields many ambiguous structures that are hard to eliminate. A feasible and effective alternative is a divide-and-conquer strategy that splits the parsing task into smaller tasks at different levels and parses the sentence layer by layer. The basic idea is as follows. First, a multi-layer Markov model performs phrase chunking, segmenting the sentence into noun chunks, verb chunks, and other phrase chunks. The CYK parsing algorithm is then run on these chunks to analyze the dependency relations between them, which finally yields a syntactic analysis of the complete sentence. Shallow parsing simplifies the rule set of the CYK algorithm and reduces the difficulty of syntactic parsing to some extent.
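The second stage described in the abstract, running CYK over chunk labels instead of individual words, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the chunk labels `NP` and `VP`, the toy grammar in `BINARY_RULES`, and the function name `cyk_recognize`. The sketch is a plain CYK recognizer in Chomsky normal form. It does not build the dependency relations the paper extracts, and it assumes the chunking stage has already produced the label sequence.

```python
from itertools import product

# Hypothetical toy grammar in Chomsky normal form over chunk labels.
# NP = noun chunk, VP = verb chunk; these rules are illustrative only.
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("VP", "NP"): {"VP"},
}


def cyk_recognize(chunks, start="S"):
    """Return True if the chunk-label sequence derives `start`."""
    n = len(chunks)
    if n == 0:
        return False
    # table[i][j] holds the labels that can span chunks[i : i + j + 1].
    table = [[set() for _ in range(n)] for _ in range(n)]
    # Spans of length 1: each chunk label stands for itself.
    for i, label in enumerate(chunks):
        table[i][0].add(label)
    # Longer spans: combine two adjacent sub-spans using the binary rules.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            for split in range(1, span):
                left = table[i][split - 1]
                right = table[i + split][span - split - 1]
                for a, b in product(left, right):
                    table[i][span - 1] |= BINARY_RULES.get((a, b), set())
    return start in table[0][n - 1]
```

Because the input is a short sequence of chunk labels, not a long sequence of words, the table is smaller and the grammar needs fewer rules, which is the simplification the abstract attributes to shallow parsing. For example, `cyk_recognize(["NP", "VP", "NP"])` accepts a noun chunk followed by a verb chunk and its object under this toy grammar.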
Source
《计算机应用》
CSCD
Peking University Core Journals (北大核心)
2011, No. 5, pp. 1335-1338, 1446 (5 pages)
Journal of Computer Applications
Keywords
shallow parsing
Hidden Markov Model (HMM)
parse tree
dependency relation