Chinese Entity Mention Detection Based on Multi-level Feature Integration (基于多层次特征集成的中文实体指代识别)
Abstract: Entity Mention Detection (EMD) is the task of recognizing all mentions of entities in a text, covering named, nominal, and pronominal mentions. This paper proposes a Chinese entity mention detection method based on multi-level feature integration. It exploits the ability of Conditional Random Fields (CRFs) to integrate features, combining features at several levels (characters, pinyin, words and their parts of speech, various named-entity lists, and frequency statistics) to improve recognition performance. The EMD subtasks are organized as a three-stage pipeline in which three different CRF classifiers label the attributes of each mention sequentially in a predefined order. A system based on this method took part in the Chinese EMD evaluation of the 2007 Automatic Content Extraction program (NIST ACE07), where its ACE Value ranked second.
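Source: Journal of Chinese Information Processing (《中文信息学报》), 2007, No. 5, pp. 126-130 (5 pages). Indexed in CSCD and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (60473140); National High Technology Research and Development Program of China, 863 Program (2006AA01Z154); Program for New Century Excellent Talents in University, Ministry of Education (NCET-05-0287); National 985 Project (985-2-DB-C03).
Keywords: computer application; Chinese information processing; entity mention detection; multi-task labeling; conditional random fields; ACE evaluation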
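
To make the idea of multi-level feature integration concrete, the following is a minimal sketch, not the authors' implementation. It assumes the text has already been segmented and POS-tagged, and builds one feature dictionary per character that combines character-level, word-level, and name-list features, which is the kind of input a CRF sequence labeler typically consumes. The function name `extract_features`, the feature keys, the `B/I/E/S` position scheme, and the toy name list are all hypothetical; pinyin and frequency features are omitted because they need external resources.

```python
from typing import Dict, List, Set, Tuple


def extract_features(
    tokens: List[Tuple[str, str]], name_list: Set[str]
) -> List[Dict[str, str]]:
    """Build one feature dict per character from (word, POS) tokens.

    Illustrative only: the feature names and the B/I/E/S position
    scheme are assumptions, not taken from the paper.
    """
    # Flatten words into characters, keeping each character's word context.
    rows = []
    for word, pos in tokens:
        for k, ch in enumerate(word):
            if len(word) == 1:
                where = "S"  # single-character word
            elif k == 0:
                where = "B"  # first character of the word
            elif k == len(word) - 1:
                where = "E"  # last character of the word
            else:
                where = "I"  # inside the word
            rows.append((ch, word, pos, where))

    features = []
    for i, (ch, word, pos, where) in enumerate(rows):
        features.append({
            # Character level: the character and its immediate neighbours.
            "char": ch,
            "prev_char": rows[i - 1][0] if i > 0 else "<BOS>",
            "next_char": rows[i + 1][0] if i + 1 < len(rows) else "<EOS>",
            # Word level: containing word, its POS tag, position in word.
            "word": word,
            "pos": pos,
            "pos_in_word": where,
            # List level: whether the containing word is in a name list.
            "in_name_list": str(word in name_list),
        })
    return features


# Toy input: a segmented, POS-tagged phrase and a one-entry name list.
tokens = [("北京", "ns"), ("的", "u"), ("大学", "n")]
feats = extract_features(tokens, name_list={"北京"})
```

Each dictionary in `feats` corresponds to one character. In a pipeline like the one the abstract describes, a separate labeler per stage could consume such features, with the labels from earlier stages added as further features for later ones.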