Extraction algorithm for place names and addresses in web page text based on the place name and address gene (地名地址基因的网页文本地名地址提取算法)

Cited by: 4
Abstract: Web page text contains a wealth of spatial information in the form of place names and addresses, but because these are described in random and diverse ways, the information is difficult to identify quickly and accurately. After analyzing how place names and addresses are composed in web page text, and taking their event attributes into account, this paper proposes an information extraction method based on a library of "place name and address genes". An extraction rule tree is built from extraction factors such as event relevance, character length, and word frequency of the place name and address, and is used to obtain the target place names and addresses. Tests on real data show that the method is more targeted for place name and address extraction and improves both efficiency and accuracy.
Authors: DU Zhongbo (杜中波); LIU Xin (刘新); SONG Tingting (宋婷婷); LIANG Bing (梁冰); ZHOU Xinyu (周新宇). Affiliations: College of Geomatics, Shandong University of Science and Technology, Qingdao, Shandong 266590, China; Key Laboratory of Fundamental Geographic Information and Digital Technology of Shandong Province, Shandong University of Science and Technology, Qingdao, Shandong 266590, China; Chinese Academy of Surveying and Mapping, Beijing 100036, China; Urban Planning Management Information Center of Beijing Xicheng District, Beijing 100035, China.
Source: Science of Surveying and Mapping (《测绘科学》), indexed in CSCD and the Peking University Core Journals list, 2019, No. 4, pp. 196-202 (7 pages)
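Funding: Special Scientific Research Fund for Public Welfare in Surveying, Mapping and Geoinformation (201512020); Basic Scientific Research Fund of the Chinese Academy of Surveying and Mapping (7771607); Xicheng District Science and Technology Project (SD2015-25)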
Keywords: place name and address gene; web page information; event attributes; rule tree