Abstract
People usually understand natural language at the discourse level. As research at the lexical and sentence levels has matured, the focus of natural language processing research has shifted to the discourse level. The main task of discourse analysis is to analyze, as a whole, the structure of a text and the semantic relations between its constituent units, and to use context to understand the text. Depending on the purpose of the analysis, discourse units and their relations can be represented as different basic discourse structures, and studying these structures and their relations supports text understanding at different levels. At present, there are few studies on the inherent regularities of Chinese discourse, and there is no theoretical and methodological framework for analyzing and deeply understanding it effectively, which seriously restricts discourse-level research and applications. This paper focuses on the two most basic properties of discourse, cohesion and coherence. From three aspects of discourse structure analysis (theoretical research, resource construction, and computational models), it examines rhetorical structure (reflecting discourse coherence) and topic structure (reflecting discourse cohesion), summarizes the state of research on discourse understanding in China and abroad, and presents the main open problems and research trends.
Authors
孔芳
王红玲
周国栋
KONG Fang; WANG Hong-Ling; ZHOU Guo-Dong (Laboratory for Natural Language Processing, School of Computer Science and Technology, Soochow University, Suzhou 215006, China; Jiangsu Key Laboratory of Computer Information Processing Technology, Suzhou 215006, China)
Source
《软件学报》
EI
CSCD
Peking University Core Journals (北大核心)
2019, No. 7, pp. 2052-2072 (21 pages)
Journal of Software
Funding
National Natural Science Foundation of China (61751206, 61876118, 61673290)
Keywords
natural language understanding
discourse analysis
discourse rhetorical structure
discourse topic structure