
Integration of Multi-Source Web Objects and Relational Data (cited by: 1)

Integrating Web objects extracted from multiple sites into relational database
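Abstract (translated from the Chinese) A semantic labeling approach for sequence data is used to solve the schema matching problem between heterogeneous data sources, so that heterogeneous Web objects extracted from multiple websites can be integrated into a relational database. Building on the linear-chain conditional random field, a combined conditional random field model that can stack multiple higher-order chains is proposed. The model can be trained on a joint sample set made up of manually labeled data and relational database records, which reduces the dependence on tedious manual labeling. In addition, stacking higher-order chains on the linear-chain model lets it handle long-range dependencies among state variables effectively. Experiments and analysis on real data sets from several domains show that the proposed method significantly improves field-labeling performance on heterogeneous Web data.
Abstract (English) This paper studies the problem of integrating heterogeneous semi-structured Web objects into a relational database. A generalized sequential learning model, the Combined Conditional Random Fields, is presented for solving the schema matching problem between pairs of heterogeneous Web data sources. The proposed model can learn from both manually labeled training data and unlabeled database records, thereby reducing the dependence on tediously labeled samples. It also provides a novel way to incorporate the two-dimensional neighborhood dependencies between Web data elements. Moreover, a constrained Viterbi algorithm is implemented to infer labels under the imposed label constraints for optimal data integration. Experimental results on a large number of Web pages from diverse domains show that the proposed method improves matching accuracy significantly.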
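Source Journal of Xidian University (《西安电子科技大学学报》; indexed in EI, CAS, CSCD, PKU Core), 2007, No. 1, pp. 126-130, 153 (6 pages)
Funding National ministry-level pre-research project (41101050108); Xidian University Doctoral Student Innovation Fund (05013)
Keywords Web data integration; schema matching; combined conditional random fields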
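The abstract mentions a constrained Viterbi algorithm for label inference but gives no details of it. The following is a minimal, generic sketch of the idea for a first-order (linear-chain) model, not the authors' implementation: some positions are forced to a given label, and the decoder finds the best label path consistent with those constraints. The function name `constrained_viterbi`, the score tables, and the two-label toy example are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence

NEG_INF = float("-inf")


def constrained_viterbi(
    emission: Sequence[Sequence[float]],
    transition: Sequence[Sequence[float]],
    constraints: Dict[int, int],
) -> List[int]:
    """Return the highest-scoring label path that respects the constraints.

    emission[t][s]   : log-score of label s at position t (hypothetical input)
    transition[p][s] : log-score of moving from label p to label s
    constraints      : {position: required label}; other positions are free
    """
    n_steps = len(emission)
    n_states = len(transition)
    if n_steps == 0:
        return []

    def allowed(t: int, s: int) -> bool:
        # A position without a constraint accepts every label.
        return constraints.get(t, s) == s

    score = [[NEG_INF] * n_states for _ in range(n_steps)]
    back = [[0] * n_states for _ in range(n_steps)]

    for s in range(n_states):
        if allowed(0, s):
            score[0][s] = emission[0][s]

    for t in range(1, n_steps):
        for s in range(n_states):
            if not allowed(t, s):
                continue  # forbidden labels keep a score of -inf
            best_prev = max(
                range(n_states),
                key=lambda p: score[t - 1][p] + transition[p][s],
            )
            score[t][s] = (
                score[t - 1][best_prev] + transition[best_prev][s] + emission[t][s]
            )
            back[t][s] = best_prev

    # Trace the best path backwards from the best final label.
    last = max(range(n_states), key=lambda s: score[-1][s])
    path = [last]
    for t in range(n_steps - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Toy example: two labels (0 and 1); emissions prefer label 0 everywhere,
# and switching labels is penalised.
emission = [[0.0, -2.0], [0.0, -2.0], [0.0, -2.0]]
transition = [[0.0, -5.0], [-5.0, 0.0]]

free_path = constrained_viterbi(emission, transition, {})        # [0, 0, 0]
forced_path = constrained_viterbi(emission, transition, {1: 1})  # [1, 1, 1]
```

In the toy example, forcing the middle position to label 1 changes the labels of its neighbours as well, because the switching penalty makes a uniform path cheaper than switching twice. That propagation of a known label to nearby fields is the general reason for decoding under constraints rather than overwriting labels after the fact.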
