期刊文献+

异构信息网上的可达性查询 被引量:4

Reachability Query in Heterogeneous Information Networks
下载PDF
导出
摘要 随着图数据规模的爆炸式增长,其形式也越来越复杂.异构信息网可建模成包含多种类型的顶点和多种类型的边的图.例如,文献数据库、在线购物网站等.首次研究异构信息网上的可达性查询问题.利用不同类型顶点之间的关系,查询2个顶点满足路径模式的可达性,该问题的时间复杂度是多项式的.然而在大规模的网络上,每次查询遍历一遍网络的时间开销也是不能容忍的.现有的可达性查询问题主要分为2类:k跳可达性查询和带有标签约束的可达性查询.但是这2种问题的算法都不能用于解决异构信息网上的可达性查询问题.因此,为了实现高效的在线查询,提出一种新的索引结构,通过路径模式的分解,预先计算部分路径模式的可达信息.当在线查询到来时,在路径模式的偏序图上,快速找到索引结构中存在的路径子模式,高效地计算查询结果.在真实和人工数据集上进行了大量实验,验证了算法的有效性. With the size of graph data increasing explosively, the form of graph data is much more complicated. Heterogeneous information networks can be modeled as graphs, which contain multiple types of nodes and multiple types of edges, e.g., bibliographic database, online shopping website and knowledge graphs. Reachability query in heterogeneous information networks is investigated in this paper. By using different types of relationships between nodes, the reachability of nodes is queried while satisfying specific path schema. This problem is polynomial. However, the time costing can’t be tolerant by scanning the big network for answering one query. The existing reachability work can be classified into two categories: K-hop reachability query and label-constraint reachability query. But these techniques can’t be used for processing path-based reachability query in heterogeneous information networks. Therefore, in order to respond online queries efficiently, a novel index structure is proposed, which decomposes path schemas and precomputes the reachability of nodes in sub-path schemas. Online query is computed efficiently by decomposing the query path schema and using the reachability of the indexes. A path schema decomposition strategy is developed by searching the partial order graph of path schemas in order to minimize the query time. Experiments on real world and synthetic data demonstrate the effectiveness of algorithms for reachability query in heterogeneous information networks.
出处 《计算机研究与发展》 EI CSCD 北大核心 2016年第2期479-491,共13页 Journal of Computer Research and Development
基金 国家"九七三"重点基础研究发展计划基金项目(2012CB316200) 国家自然科学基金重点项目(61033015) 国家自然科学基金重大项目(61190115 60933001) 国家自然科学基金面上项目(61173023)~~
关键词 异构信息网 查询处理 可达性 路径模式 索引 heterogeneous information network query processing reachability path schema index
  • 相关文献

参考文献23

  • 1富丽贞,孟小峰.大规模图数据可达性索引技术:现状与展望[J].计算机研究与发展,2015,52(1):116-129. 被引量:16
  • 2Agrawal R, Borgida A, Jagadish H V. Efficient management of transitive relationships in large data and knowledge bases [C]//Proc of ACM SIGMOD Int Conf on Management of Data. New York: ACM, 1989:253-262. 被引量:1
  • 3Jin R, Xiang Y, Ruan N, et al. 3 hop: A high compression indexing scheme for reachability query [C]//Proc of ACM SIGMOD Int Conf on Management of Data. New York: ACM, 2009:813-826. 被引量:1
  • 4Jin R, Xiang Y, Ruan N, et al. Efficiently answering reaehability queries on very large directed graphs [C]//Proc of ACM SIGMOD Int Conf on Management of Data. New York: ACM, 2008:595-608. 被引量:1
  • 5Wang Haixun, He Hao, Yang Jun, et al. Dual labeling: Answering graph reaehability queries in constant time [C] // Proc of the 22nd Int Conf on Data Engineering. Piscataway, NJ: IEEE, 2006: 75-75. 被引量:1
  • 6Cheng J, Shang Z, Cheng H, et al. K-reach: Who is in your small world [J]. Proceedings of the VLDB Endowment, 2012, 5(11): 1292-1303. 被引量:1
  • 7Cohen E, Halperin E, Kaplan H, ct al. Reaehability and distance queries via 2-hop labels [J]. SlAM Journal on Computing, 2003, 32(5): 1338-1355. 被引量:1
  • 8Wei F. TEDI: Efficient shortest path query answering on graphs[C]//Proc of Int Conf on Management of Data. New York: ACM, 2010: 99-110. 被引量:1
  • 9Cheng J, Yu J X. On-Line exact shortest distance query processing [C] //Proc of the 12th Int Conf on Extending Database Technology. New York: ACM, 2009:481-492. 被引量:1
  • 10Xiao Y H, Wu W T, Pei J, et al. Efficiently indexing shortest paths by exploiting symmetry in graphs [C] //Proc of the 12th Int Conf on Extending Database Technology. New York: ACM, 2009:493-504. 被引量:1

二级参考文献52

  • 1Yildirim H.Scalable reachability indexing for very largeGraphs[D].Troy,NY:Computer Science,RensselaerPolytechnic Institute,2011. 被引量:1
  • 2Quin L.Extensible Markup Language XML(XML)[S],Cambridge:The World Wide Web Consortium(W3C),2013. 被引量:1
  • 3Hawke S,Herman I.Resource Description Framework(RDF)[S].Cambridge:The World Wide Web Consortium(W3C),2013. 被引量:1
  • 4Wang H,Li J,Luo J,et al.Hash-base subgraph queryprocessing method for graph-structured XML documents[J].Proceedings of the VLDB Endowment,2008,1(1):478-489. 被引量:1
  • 5Cheng J,Yu J.On-line exact shortest distance queryprocessing[C]//Proc of ACM EDBTO9.New York:ACM,2009:481-492. 被引量:1
  • 6Wei F.丁EDI:Efficient shortest path query answering ongraphs[C]//Proc of ACM SIGMOD10.New York:ACM,2010:99-110. 被引量:1
  • 7Cormcn T,Leiserson C,Rivest R,et al.Introduction toAigoriihms[M].Cambridge:MIT Press,2001:595-601. 被引量:1
  • 8Boag S,Chamberlin D.An XML Query Language(XQuery),Version 2.0[S].Cambridge:The World WideWeb Consortium(W3C),2013. 被引量:1
  • 9Harris G.An Query Language for RDF(SPARQL),Version 1.1[S].Cambridge:The World Wide WebConsortium C W3C),2011. 被引量:1
  • 10Jagadish H.A compression technique to materialize transitiveclosure[J].ACM Trans on Database System,1990,15(4):558-598. 被引量:1

共引文献15

同被引文献24

引证文献4

二级引证文献18

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部