
Utilizing Glyph Feature and Iterative Learning for Named Entity Recognition in Finance Text

Cited by: 15
Abstract: To address named entity recognition in Chinese financial text, this paper starts from the characteristics of Chinese characters themselves and presents a neural network model that combines glyph features and iterative learning with bidirectional long short-term memory networks and conditional random fields. The model is fully end-to-end and involves no feature engineering: it encodes the Wubi representation of Chinese characters for information enhancement, and uses an iterative learning strategy to continually refine the model's overall predictions. Because existing named entity recognition research lacks high-quality annotated corpus resources for the financial domain, the authors manually annotated a large-scale financial named entity corpus, HITSZ-Finance, comprising 31,210 sentences and 4 entity types. A series of experiments on HITSZ-Finance all demonstrate the effectiveness of the model.
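As a rough illustration of the Wubi-based glyph feature mentioned in the abstract, the sketch below turns each character's Wubi code into a fixed-length list of letter indices, the kind of input an embedding layer could consume alongside ordinary character embeddings. The abstract does not describe the authors' actual encoder, so everything here is an assumption: the names `TOY_WUBI`, `wubi_ids`, and `encode_sentence` are hypothetical, and the lookup table is a toy that covers only a few Wubi 86 key-name characters (whose code is a single key repeated four times).

```python
from typing import Dict, List

# Toy Wubi 86 lookup (illustrative only): key-name characters, whose code
# is one key repeated four times. A real system would load a full Wubi
# dictionary instead of this hand-written table.
TOY_WUBI: Dict[str, str] = {
    "金": "qqqq",
    "人": "wwww",
    "月": "eeee",
    "大": "dddd",
    "木": "ssss",
    "水": "iiii",
    "火": "oooo",
    "土": "ffff",
    "王": "gggg",
}

MAX_CODE_LEN = 4  # Wubi codes have at most four keystrokes
PAD_ID = 0        # index reserved for padding and for unknown characters


def wubi_ids(char: str) -> List[int]:
    """Map one character to MAX_CODE_LEN letter indices (a=1 ... z=26)."""
    code = TOY_WUBI.get(char, "")  # unknown characters get an empty code
    ids = [ord(c) - ord("a") + 1 for c in code[:MAX_CODE_LEN]]
    # Right-pad so every character yields a list of the same length.
    return ids + [PAD_ID] * (MAX_CODE_LEN - len(ids))


def encode_sentence(sentence: str) -> List[List[int]]:
    """Encode every character of a sentence as Wubi letter indices."""
    return [wubi_ids(ch) for ch in sentence]
```

For example, `encode_sentence("金木")` yields one four-element index list per character; in a BiLSTM-CRF tagger these indices would typically be embedded and combined with the character embedding before the recurrent layer, although the abstract does not state how the authors combine them.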
Authors: LIU Yuhan (刘宇瀚); LIU Changjian (刘常健); XU Ruifeng (徐睿峰); LUO Wangda (骆旺达); CHEN Yi (陈奕); JI Zhongsheng (吉忠晟); YING Nengtao (应能涛) (School of Computer Science, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong 518055, China)
Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2020, Issue 11, pp. 74-83 (10 pages)
Funding: National Natural Science Foundation of China (61632011, 61876053); Shenzhen Basic Research Project (JCYJ20180507183527919, JCYJ20180507183608379); Shenzhen Key Technology R&D Project (JSGG20170817140856618).
Keywords: named entity recognition in the financial domain; Chinese corpus; deep learning
