
Mining Chinese comparative sentences and relations based on CRF algorithm (基于CRF算法的汉语比较句识别和关系抽取)

Cited by: 22
Abstract: Comparative sentences are a common way of expressing relations between objects, and they are valuable for text mining, especially sentiment analysis (opinion mining). Research on Chinese comparative sentences, which covers both identifying comparative sentences and extracting comparative relations, is still a new topic. To identify Chinese comparative sentences, this paper builds on previous work by using SVM as the classifier with keywords and class sequential rules (CSR) as features, then uses the CRF algorithm to extract entities and adds the entity information as further features. This markedly improves the precision, recall, and F-measure of comparative sentence identification, which reach up to 96.55%, 88.63%, and 92.43% respectively. To extract comparative relations, the paper defines a set of rules on top of the entities extracted by the CRF algorithm and uses them to extract the comparative subject and the comparative object (the benchmark of the comparison), with good results. Extraction of the comparative subject performs better than extraction of the comparative object.
Source: Application Research of Computers (《计算机应用研究》), CSCD, PKU Core Journal, 2010, No. 6, pp. 2061-2064 (4 pages)
Funding: National Natural Science Foundation of China (Grant No. 60773087)
Keywords: comparative sentence; comparative relation; CRF model; comparative subject; comparative object
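
The abstract describes extracting the comparative subject and comparative object by applying rules to entities already found by CRF, but it does not list those rules. The following is a minimal sketch of what one such rule could look like, not the paper's method. It assumes that tokens arrive already tagged as entities (`ENT`) or not (`O`), that the keyword list `COMPARATIVE_KEYWORDS` is a small hypothetical stand-in for the paper's feature words, and that the nearest entity on each side of the keyword fills the two roles. The function name `extract_comparison` and the sample sentence are illustrative only.

```python
from typing import Dict, List, Optional, Tuple

Token = Tuple[str, str]  # (word, tag); tag "ENT" marks an entity, "O" anything else

# Hypothetical keyword list; the paper's actual feature words are not given here.
COMPARATIVE_KEYWORDS = {"比", "不如", "不及"}


def extract_comparison(tokens: List[Token]) -> Optional[Dict[str, str]]:
    """Return subject/keyword/object for the first comparative keyword
    that has an entity on both sides, or None if there is no such keyword."""
    for i, (word, _tag) in enumerate(tokens):
        if word not in COMPARATIVE_KEYWORDS:
            continue
        # Nearest entity to the left of the keyword: assumed comparative subject.
        subject = next((w for w, t in reversed(tokens[:i]) if t == "ENT"), None)
        # Nearest entity to the right of the keyword: assumed comparative object.
        obj = next((w for w, t in tokens[i + 1:] if t == "ENT"), None)
        if subject is not None and obj is not None:
            return {"subject": subject, "keyword": word, "object": obj}
    return None


if __name__ == "__main__":
    # Illustrative sentence: "Phone A is cheaper than phone B."
    sentence = [("手机A", "ENT"), ("比", "O"), ("手机B", "ENT"), ("便宜", "O")]
    print(extract_comparison(sentence))
```

A nearest-entity rule like this is easy to state and inspect, but it ignores word order variations and sentences in which one side of the comparison is omitted, so a real rule set would likely need more cases.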











