Research and Implementation of an Intelligent Information Acquisition System (cited by: 3)
Abstract: Information acquisition systems generally require users to configure collection rules by hand, and they return large volumes of unprocessed results. To simplify operation and deliver the required results directly, an intelligent information acquisition system is proposed. The system targets expert information: given an expert's name, affiliation, and field keywords, it uses a search engine to find web pages related to that expert, normalizes the HTML documents, and extracts the main text of each page. Expert information is then identified automatically by natural language processing and by computing semantic relevance. Test results show that the system's collection precision is about 83.87%.
Source: Acta Armamentarii (《兵工学报》), 2009, Issue S1, pp. 130-134 (5 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Keywords: computer application technology; information acquisition; intelligent; main text extraction; web page recognition
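
The pipeline outlined in the abstract (extract the main text of a candidate page, then score how relevant it is to the expert being sought) can be illustrated with a minimal Python sketch. The abstract does not describe the paper's actual normalization, main-text extraction, or semantic relevance methods, so everything here is an assumption made for illustration: the longest text block stands in for main-text extraction, plain term overlap stands in for semantic relevance, and the names `BlockTextParser`, `extract_main_text`, `relevance`, `is_expert_page`, and the `threshold` value are hypothetical. The search-engine step is omitted; pages are passed in as HTML strings.

```python
from html.parser import HTMLParser


class BlockTextParser(HTMLParser):
    """Collects text per block-level element, skipping script/style content."""

    BLOCK_TAGS = {"p", "div", "td", "li", "article", "section"}
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.blocks = []      # finished text blocks
        self._buf = []        # text fragments of the block being read
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def flush(self):
        # Collapse whitespace and store the block if it is non-empty.
        text = " ".join("".join(self._buf).split())
        if text:
            self.blocks.append(text)
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self.flush()

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self.flush()

    def handle_data(self, data):
        if not self._skip_depth:
            self._buf.append(data)


def extract_main_text(html):
    """Assumed heuristic: treat the longest text block as the page's main text."""
    parser = BlockTextParser()
    parser.feed(html)
    parser.close()
    parser.flush()
    return max(parser.blocks, key=len, default="")


def relevance(text, name, affiliation, keywords):
    """Assumed stand-in for semantic relevance: fraction of query terms found."""
    terms = [name, affiliation, *keywords]
    lowered = text.lower()
    hits = sum(1 for term in terms if term.lower() in lowered)
    return hits / len(terms)


def is_expert_page(html, name, affiliation, keywords, threshold=0.6):
    """Accept a page when its main text is relevant enough (threshold is arbitrary)."""
    main_text = extract_main_text(html)
    return relevance(main_text, name, affiliation, keywords) >= threshold
```

A real system would likely replace the term-overlap score with a proper semantic similarity measure and tune the threshold on labeled pages; the sketch only shows how the stages fit together.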
