Research of the Deep Web Query Interface Integration Based on the Frequent Structure
Abstract: As the Web keeps growing, massive amounts of information are hidden in online databases that users can reach only through query interfaces; this portion of the Web is called the Deep Web. Integrating Deep Web data from the same domain is therefore necessary, and integrating the query interfaces is a key subproblem. Query interface integration consists of two steps, schema matching and schema integration; this paper focuses on determining the attribute layout of the integrated query interface. The large number of Deep Web query interfaces, together with their dynamic and heterogeneous nature, makes this problem challenging. The structure of each query interface is modeled as a tree, and the integrated query interface tree is built by mining frequent pattern subtrees so that it satisfies the structural and ordering constraints among attributes to the greatest extent possible. The algorithm has low time complexity and scales well. Experiments integrating query interfaces from eight domains demonstrate its effectiveness.
Source: Science Technology and Engineering (《科学技术与工程》), PKU Core Journal, 2014, No. 18, pp. 81-88, 93 (9 pages)
Funding: Supported by the Guizhou Province Joint Fund Project (grant nos. 黔科合J字LKQS[2013]29号 and 黔科合J字LKQS[2013]13号)
Keywords: frequent structure; query interface; attribute layout; pattern subtree; query interface tree
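
The abstract's idea of deriving an integrated attribute layout from frequently occurring structure can be illustrated with a heavily simplified sketch. This is not the paper's algorithm: it assumes each query interface is only a two-level tree (an ordered list of attribute groups), and it swaps full frequent-subtree mining for counting frequent sibling pairs (a stand-in for structural constraints) and precedence pairs (a stand-in for ordering constraints). The names `mine_pairs`, `integrate`, and `min_support`, and the sample book-search attributes, are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_pairs(interfaces):
    """Count how often two attributes share a group, and in which order."""
    sibling, before = Counter(), Counter()
    for groups in interfaces:
        for group in groups:
            # combinations() preserves input order, so a precedes b here.
            for a, b in combinations(group, 2):
                sibling[frozenset((a, b))] += 1
                before[(a, b)] += 1
    return sibling, before


def integrate(interfaces, min_support):
    """Group attributes that are frequent siblings, then order each group."""
    sibling, before = mine_pairs(interfaces)
    attrs = sorted({a for groups in interfaces for g in groups for a in g})
    parent = {a: a for a in attrs}  # union-find forest over attributes

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    # Merge attributes whose co-occurrence in one group is frequent enough.
    for pair, count in sibling.items():
        if count >= min_support:
            a, b = tuple(pair)
            parent[find(a)] = find(b)

    clusters = {}
    for a in attrs:
        clusters.setdefault(find(a), []).append(a)

    def rank(a, members):
        # More "precedes" wins means an earlier position in the group.
        return -sum(before[(a, b)] > before[(b, a)] for b in members if b != a)

    return [sorted(m, key=lambda a: (rank(a, m), a)) for m in clusters.values()]


# Hypothetical source interfaces from one domain (book search).
interfaces = [
    [["title", "author"], ["price_min", "price_max"]],
    [["author", "title"], ["price_min", "price_max"], ["isbn"]],
    [["title", "author", "isbn"], ["price_min", "price_max"]],
]
layout = integrate(interfaces, min_support=2)
print(layout)
```

With `min_support=2`, `title` and `author` end up in one group because they are siblings in all three sample interfaces, while `isbn` stays on its own because it shares a group with them only once. A real frequent-subtree miner would additionally handle deeper nesting and labels on internal nodes, which this sketch leaves out.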
