Abstract
In Chinese, separable words (“LIHECI”, 离合词) are words whose components (two or more characters) are only loosely bound: other constituents can be inserted between the components, separating the word, while the basic meaning it expresses stays the same. This paper presents a detailed statistical analysis of Chinese “LIHECI” based on a large-scale corpus and proposes a processing strategy for them, which has been successfully adopted in the BT863 Chinese-English machine translation system.
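As a rough illustration of the phenomenon described above, the following Python sketch finds joined and separated occurrences of a two-character word in raw text and tallies them. It is not the method of the paper or of BT863: the helper names `find_occurrences` and `split_stats`, the punctuation set, the `max_gap` limit, and the example word 洗澡 ("take a bath") are all assumptions made for illustration.

```python
import re

# Assumption: inserted material never crosses punctuation or whitespace.
PUNCT = ",。!?;:、,.!?;:"


def find_occurrences(word, text, max_gap=4):
    """Return (inserted, matched) pairs for a two-character word.

    `inserted` is the material found between the two characters;
    it is the empty string when the word appears unseparated.
    """
    if len(word) != 2:
        raise ValueError("this sketch only handles two-character words")
    head, tail = word
    gap_char = "[^" + re.escape(PUNCT) + r"\s]"
    # Lazy quantifier: prefer the nearest tail character after each head.
    pattern = re.compile(
        re.escape(head) + "(" + gap_char + "{0,%d}?)" % max_gap + re.escape(tail)
    )
    return [(m.group(1), m.group(0)) for m in pattern.finditer(text)]


def split_stats(word, sentences, max_gap=4):
    """Count (joined, separated) occurrences of `word` over sentences."""
    joined = separated = 0
    for sentence in sentences:
        for inserted, _ in find_occurrences(word, sentence, max_gap):
            if inserted:
                separated += 1
            else:
                joined += 1
    return joined, separated


if __name__ == "__main__":
    sample = ["他洗了一个澡。", "他每天洗澡。", "洗完澡再睡。"]
    print(find_occurrences("洗澡", sample[0]))
    print(split_stats("洗澡", sample))
```

A surface pattern like this will over-match, since any two characters that happen to occur within `max_gap` of each other count as a hit. A real system would presumably need word segmentation and grammatical constraints on what may be inserted.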
Source
Journal of the China Society for Scientific and Technical Information (《情报学报》)
CSSCI
PKU Core Journal (北大核心)
1999, No. 4, pp. 303-307 (5 pages)
Funding
National Natural Science Foundation of China
National 863 High-Tech Program of China
Keywords
machine translation
corpus
Chinese “LIHECI” (汉语离合词, Chinese separable words)