Abstract
A non-projective structure arises when the word nodes of a dependency tree are out of order with respect to the word sequence of the original sentence. Such structures strongly affect syntactic parsers and are also of considerable interest for linguistic theory. Sentences containing non-projective structures have been found in dependency treebanks and graph banks for many of the world's languages, and related research has been carried out on them. In Chinese, however, non-projective structures have received little attention, and because corpus construction has followed the projectivity principle, such structures have gone unannotated. Based on the concept-aligned Chinese abstract meaning representation (AMR) corpus, this paper finds that 31.62% of 10,149 sentences contain non-projective structures. The three main types are modal raising, topicalization, and constituent separation. The paper also proposes corresponding automatic analysis schemes to improve Chinese AMR parsing.
Authors
闻媛
宋丽
吴泰中
李斌
周俊生
曲维光
WEN Yuan; SONG Li; WU Taizhong; LI Bin; ZHOU Junsheng; QU Weiguang (School of Chinese Language and Literature, Nanjing Normal University, Nanjing, Jiangsu 210097, China; School of Computer Science and Technology, Nanjing Normal University, Nanjing, Jiangsu 210023, China; Fujian Provincial Key Laboratory of Information Processing and Intelligent Control, Minjiang University, Fuzhou, Fujian 350121, China)
Source
Journal of Chinese Information Processing (《中文信息学报》)
CSCD
PKU Core Journals (北大核心)
2018, Issue 12, pp. 31-40 (10 pages)
Funding
National Social Science Fund of China (18BYY127)
Keywords
abstract meaning representation
concept-to-word alignment
non-projective
semantic parsing
Chinese information processing