
Automatic Parsing of Chinese Functional Chunks (汉语功能块自动分析)
Abstract: Chinese functional chunks describe the basic skeleton of a sentence and form an important bridge between syntactic structure and semantic description. They are defined as a series of non-overlapping, non-nested skeleton segments of a sentence, representing the implicit grammatical relations between the sentence-level predicates and their arguments. This paper proposes two statistical models for parsing the four main functional chunks of a sentence (subject, predicate, object, and adverbial) and simulates them with different machine learning methods. The chunk boundary detection model builds SVM-based sub-models for detecting SP (subject-predicate) and PO (predicate-object) boundaries. The sequence labeling model formulates chunking as a sequence labeling problem and is based on the CRF algorithm. By introducing revision rules, the two are merged into a combined parsing model that exploits the complementarity of their results. The combined model is reported to reach recognition performance above 80% overall, with best F-scores of 82.93%, 86.58%, 78.46%, and 86.64% for subject, predicate, object, and adverbial chunks, respectively. Experimental results show that machine learning algorithms based on local lexical context can accurately identify most functional chunks from different perspectives, and that complex clauses and serial verb structures are the main recognition difficulties.
Authors: Qiang Zhou (周强), Yingze Zhao (赵颖泽)
Source: Journal of Chinese Information Processing (《中文信息学报》), CSCD, Peking University Core Journal (北大核心), 2007, No. 5, pp. 18-24 (7 pages)
Funding: National Natural Science Foundation of China (6057318560 520130299)
Keywords: computer application; Chinese information processing; Chinese functional chunk; boundary recognition model; sequence labeling model; model merging
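The abstract describes chunking as a sequence labeling problem over non-overlapping, non-nested chunks and reports per-type F-scores. The sketch below illustrates that formulation only; it is not the authors' implementation. The label names (`SUBJ`, `PRED`, `OBJ`, `ADV`), the B-/I-/O tagging scheme, the exact-match scoring, and the five-word example are all assumptions made for illustration, since the record does not state the paper's actual tag set or evaluation script.

```python
from typing import List, Tuple

# A chunk is (start, end_exclusive, label). Labels are hypothetical names
# for the four chunk types named in the abstract.
Span = Tuple[int, int, str]


def spans_to_bio(n_words: int, spans: List[Span]) -> List[str]:
    """Encode non-overlapping chunks as one B-/I-/O tag per word."""
    tags = ["O"] * n_words
    for start, end, label in spans:
        if not (0 <= start < end <= n_words):
            raise ValueError(f"span out of range: {(start, end, label)}")
        if any(tag != "O" for tag in tags[start:end]):
            raise ValueError("chunks must not overlap or nest")
        tags[start] = "B-" + label
        for i in range(start + 1, end):
            tags[i] = "I-" + label
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode a tag sequence back into chunks."""
    spans: List[Span] = []
    start, label = None, None
    # A trailing "O" sentinel closes any chunk still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and tag[2:] == label
        if start is not None and not continues:
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            # A stray I- tag with no open chunk is treated as a chunk start.
            start, label = i, tag[2:]
    return spans


def chunk_f_score(gold: List[Span], pred: List[Span], label: str) -> float:
    """Exact-match F-score for one chunk type (an assumed scoring rule)."""
    gold_set = {s for s in gold if s[2] == label}
    pred_set = {s for s in pred if s[2] == label}
    correct = len(gold_set & pred_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical five-word sentence: subject, adverbial, predicate, two-word object.
gold = [(0, 1, "SUBJ"), (1, 2, "ADV"), (2, 3, "PRED"), (3, 5, "OBJ")]
# Hypothetical system output that cuts the object chunk one word short.
pred = [(0, 1, "SUBJ"), (1, 2, "ADV"), (2, 3, "PRED"), (3, 4, "OBJ")]

gold_tags = spans_to_bio(5, gold)
```

Under exact-match scoring, a chunk with one wrong boundary counts as a miss, so the truncated object chunk in `pred` scores zero for `OBJ` while the other three types are unaffected. A CRF or SVM would be trained to predict tag sequences like `gold_tags` from word and context features; that training step is outside this sketch.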

