Abstract
This paper reports relation extraction experiments on the Chinese corpus of ACE2005 (Automatic Content Extraction 2005), designed around the requirements of Chinese relation extraction. After extracting basic lexical, entity, syntactic, and grammatical features for Chinese, it proposes a feature combination method and trains a support vector machine (SVM) classifier on the combined features. The F-scores for relation detection and for the six major relation types improve by 1.36% and 3.97%, reaching 72.77 and 61.03 respectively. The paper also analyzes the contribution of each group of combined features. The experimental results indicate that combined word and entity features contribute substantially to Chinese relation extraction.
Source
Microelectronics & Computer (《微电子学与计算机》)
Indexed in: CSCD; Peking University Core Journals (北大核心)
2010, No. 4, pp. 198-200, 204 (4 pages)
Funding
National "863" Program project (2006AA01Z147)
Natural Science Foundation of Jiangsu Province (60673041)