Combining Multiple Models and a High-Confidence Dictionary for Event Nugget Detection (cited by: 2)
Abstract: This paper proposes an event nugget detection method that combines multiple models with a high-confidence dictionary. High-confidence dictionary features are added to a maximum entropy model and to a conditional random fields (CRF) model, and the outputs of the two models are then fused, with the aim of improving the recall and the overall performance of trigger recognition. For event realis recognition, the method additionally considers features such as the surface (token-position) distance and the dependency-path distance between the trigger and a negation or speculation word. Experimental results show that, compared with using the maximum entropy model alone, the proposed method improves F1 by 6.43% on trigger (event nugget) recognition and by 1.69% on event realis recognition.
Authors: CHEN Yadong; HONG Yu; WANG Xiaobin; YANG Xuerong; YAO Jianmin; ZHU Qiaoming (Provincial Key Laboratory of Computer Information Processing Technology, Soochow University, Suzhou 215006)
Source: Acta Scientiarum Naturalium Universitatis Pekinensis (《北京大学学报(自然科学版)》), 2017, No. 3, pp. 412-420 (9 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: Supported by the National Natural Science Foundation of China (61373097, 61272259, 61272260).
Keywords: event nugget detection; maximum entropy model; conditional random fields; high-confidence dictionary
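
The abstract describes the pipeline only at a high level, so the following Python sketch is an illustration under stated assumptions rather than the authors' implementation. It assumes a union-style fusion rule (a token is kept as a trigger if either model proposes it, and the higher-confidence label wins on disagreement), a binary dictionary-lookup feature, and a token-distance feature to the nearest negation or speculation cue. The names `dictionary_feature`, `fuse`, `token_distance`, and the `"O"` label are hypothetical; the dependency-path distance is omitted because it would need a parser.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical label for "not a trigger"; the record does not give the tag set.
NONE = "O"

# One model's output for one token: (label, confidence).
Prediction = Tuple[str, float]


def dictionary_feature(token: str, trigger_dict: Dict[str, str]) -> Dict[str, object]:
    """Lookup feature that could be added to either model's feature set."""
    label = trigger_dict.get(token.lower())
    return {"in_dict": label is not None, "dict_label": label or NONE}


def fuse(maxent: List[Prediction], crf: List[Prediction]) -> List[str]:
    """Assumed union-style fusion of two per-token prediction sequences.

    A trigger proposed by either model is kept, which favors recall.
    If both models propose a trigger with different types, the label
    with the higher confidence is used.
    """
    fused = []
    for (m_label, m_conf), (c_label, c_conf) in zip(maxent, crf):
        if m_label == NONE:
            fused.append(c_label)
        elif c_label == NONE:
            fused.append(m_label)
        else:
            fused.append(m_label if m_conf >= c_conf else c_label)
    return fused


def token_distance(trigger_idx: int, cue_indices: List[int]) -> Optional[int]:
    """Token distance from the trigger to the nearest negation/speculation cue."""
    if not cue_indices:
        return None
    return min(abs(trigger_idx - i) for i in cue_indices)
```

A union rule is only one plausible way to trade precision for recall; a weighted vote or a confidence threshold would fit the same interface.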