Web Information Extraction in the Health Field (cited by: 6)
Abstract: To address the application of Web information extraction (WIE) technology in the health field, a Web information extraction method based on Web Harvest is proposed. Extraction rules for health entities are designed by analyzing the structure of different health websites, and an algorithm based on Web Harvest is implemented to automatically extract health entities and their attributes. The extracted entities and attributes are checked for consistency and then stored in a relational database. Attribute values in the database that implicitly contain health entities are then processed with the Ansj natural language processing tool for entity recognition, from which the relationships between health entities are extracted. The average F-measure reaches 99.9% in the health entity extraction experiments and 80.51% in the entity relationship extraction experiments. The experimental results indicate that the health information extracted by the proposed technique is of high quality and credibility.
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2016, No. 1, pp. 163-170 (8 pages)
Funding: National Natural Science Foundation of China (61073057)
Keywords: information extraction; health information extraction; consistency check; entity recognition; entity relationship extraction
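
The relationship extraction step in the abstract (recognizing entities inside stored attribute values, then scoring the result with an F-measure) can be illustrated with a minimal sketch. This is not the authors' implementation: plain substring matching stands in for Ansj-based entity recognition, and the record layout, the `Relation` type, the function names, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    source: str      # entity whose attribute text was scanned
    attribute: str   # attribute in which the mention was found
    target: str      # entity mentioned inside the attribute value


def extract_relations(records, entity_names):
    """Scan every attribute value for mentions of other known entities.

    Assumption: substring matching replaces the Ansj segmentation and
    entity recognition used in the paper.
    """
    relations = set()
    for record in records:
        for attribute, text in record["attributes"].items():
            for name in entity_names:
                # Skip self-references; keep mentions of other entities.
                if name != record["name"] and name in text:
                    relations.add(Relation(record["name"], attribute, name))
    return relations


def f_measure(predicted, gold):
    """Balanced F-measure: harmonic mean of precision and recall."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical records standing in for rows read from the relational database.
records = [
    {"name": "influenza",
     "attributes": {"symptoms": "fever, cough and headache",
                    "treatment": "rest and oseltamivir"}},
    {"name": "fever",
     "attributes": {"description": "a raised body temperature"}},
    {"name": "oseltamivir",
     "attributes": {"indication": "used to treat influenza"}},
]
entity_names = {record["name"] for record in records}

# Hypothetical gold-standard relations for scoring the toy example.
gold = {
    Relation("influenza", "symptoms", "fever"),
    Relation("influenza", "symptoms", "cough"),
    Relation("influenza", "treatment", "oseltamivir"),
    Relation("oseltamivir", "indication", "influenza"),
}

predicted = extract_relations(records, entity_names)
print(sorted((r.source, r.attribute, r.target) for r in predicted))
print(f_measure(predicted, gold))
```

In this toy data the relation to "cough" is missed because "cough" is not a stored entity, which lowers recall. This suggests why relationship extraction would tend to score below entity extraction when it depends on which entities were already harvested.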
