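Research on the Method of Shallow Syntactic Analysis Based on Word Combination

Original title: 基于词性合并的浅层句法分析方法研究 (literally, "Research on a Shallow Syntactic Analysis Method Based on Part-of-Speech Merging")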
Abstract: The Internet has become a massive open knowledge base, and how to extract the valuable information it contains is an active research topic. Webpages, as the carrier of Internet information, have distinctive characteristics, such as varied forms and the presence of page titles. Extracting information from webpage text and structuring it is the foundation of knowledge base construction. This paper processes webpage information through main-text extraction, pronoun resolution, and text information extraction, and proposes a shallow syntactic analysis method based on part-of-speech merging that adapts better to the content of the text.
Author: LIU Li (刘利), Luzhou Vocational and Technical College, Luzhou 646005, Sichuan
Source: Computer & Telecommunication (《电脑与电信》), 2018, No. 8, pp. 18-20 (3 pages)
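Funding: Luzhou Vocational and Technical College institute-level research project (No. K-1716); Luzhou Federation of Social Sciences project (No. LZ18A031)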
Keywords: text information; knowledge base construction; information extraction; part-of-speech merging