
Deep Web Query Based on Minimum Executable Pattern
Abstract This paper introduces the concept of the minimum executable pattern (MEP) and, building on it, proposes an MEP generation algorithm and an MEP-based adaptive query method for the Deep Web. The method generalizes the query interface from a single text box to a set of MEPs: each query is jointly determined by one MEP and a keyword vector matching that MEP, and the method adaptively generates the next query that is optimal in expectation until a stop condition is met. This overcomes the "data island" problem caused by the limited query capability of current Deep Web query methods. Experiments on six real-world Deep Web sites show that the method offers stronger query capability and wider applicability than existing methods.
Source 《中国科技论文在线》 CAS, 2010, No. 2, pp. 97-105 (9 pages)
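Funding National Natural Science Foundation of China (60825202, 60803079); National High-Tech Research and Development Program of China (863 Program) (2008AA01Z131); Program for New Century Excellent Talents in University (NECT-08-0433); Specialized Research Fund for the Doctoral Program of Higher Education (2009021110060)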
Keywords Deep Web; minimum executable pattern; adaptive query
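
The abstract describes the query loop only at a high level, so the following is a minimal Python sketch of the general idea, not the paper's algorithm. It assumes an MEP can be modeled as a tuple of attribute names that must be filled together, and a query as an MEP plus a matching keyword vector. The in-memory list standing in for a Deep Web site, the function names `execute`, `candidates`, and `adaptive_crawl`, the frequency-based choice of the next query, and the query budget used as the stop condition are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

Record = Dict[str, str]
MEP = Tuple[str, ...]                  # attributes that must be filled together
Query = Tuple[MEP, Tuple[str, ...]]    # an MEP plus a matching keyword vector


def execute(db: List[Record], query: Query) -> List[int]:
    """Toy stand-in for submitting a form: return ids of matching records."""
    mep, keywords = query
    return [i for i, rec in enumerate(db)
            if all(rec.get(a) == k for a, k in zip(mep, keywords))]


def candidates(db: List[Record], harvested: Set[int],
               meps: List[MEP]) -> Dict[Query, int]:
    """Derive candidate queries from harvested records and count how many
    harvested records each candidate matches."""
    counts: Dict[Query, int] = {}
    for i in sorted(harvested):
        for mep in meps:
            if all(a in db[i] for a in mep):
                q = (mep, tuple(db[i][a] for a in mep))
                counts[q] = counts.get(q, 0) + 1
    return counts


def adaptive_crawl(db: List[Record], meps: List[MEP], seed: Query,
                   max_queries: int = 10) -> Tuple[Set[int], List[Query]]:
    """Issue queries one at a time, choosing each next query from the
    records harvested so far, until no candidate is left or the budget
    (an assumed stop condition) is used up."""
    harvested: Set[int] = set()
    issued: List[Query] = []
    query: Optional[Query] = seed
    while query is not None and len(issued) < max_queries:
        harvested.update(execute(db, query))
        issued.append(query)
        pool = {q: c for q, c in candidates(db, harvested, meps).items()
                if q not in issued}
        # Assumed heuristic: values frequent among harvested records are
        # guessed to be frequent in the full database as well.
        query = max(pool, key=pool.get) if pool else None
    return harvested, issued
```

In this toy model, allowing more patterns can only widen the set of reachable records: a crawl restricted to a single-attribute pattern stops as soon as the values it has seen stop linking to new records, while a second pattern can bridge to records that share a different attribute value. A record that shares no value with any harvested record stays unreachable under either setting.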

