摘要
相比于确定图上的相似性连接,不确定图上的相似性连接通常具有更大的实际应用价值以及计算复杂性。文中研究了基于MapReduce分布式编程框架的不确定图上的相似性连接问题,提出了基于概率和的Map方剪枝和Reduce方剪枝的两种剪枝策略。Map方剪枝策略在映射过程中过滤掉了不可能具有相似图的不确定图。Reduce方剪枝策略用于减少约减过程中的候选图对。基于这两种剪枝策略,文中提出了一种基于MapReduce框架的不确定图上的相似性连接算法MUGSJoin。实验结果证明,该算法与同类算法相比具有更好的性能和可扩展性。
Compared to the similarity join on the deterministic graph,the similarity join on the uncertain graph usually has greater practical application value and higher computational complexity.This paper studied the similarity join on uncertain graph databases based on MapReduce distributed programming framework,and proposed two pruning strategies using the existence probability,namely Map Side Pruning and Reduce Side Pruning.The Map Side Pruning can filter out the uncertain graphs at the Map side which have no chance to be similar with any other uncertain graph at the Map side.The Reduce Side Pruning can reduce the candidate pairs at Reduce side in the process of reduction.Based on the above pruning approaches,this paper proposed a similarity join algorithm MUGSJoin on uncertain graph databases based on MapReduce.The experiment results show that the proposed approach has much better performance and scala-bility than the similar algorithms.
作者
缪丰羽
王宏志
阮群生
MIAO Feng-yu;WANG Hong-zhi;RUAN Qun-sheng(College of Information&Mechanical and Electrical Engineering,Ningde Normal University,Ningde,Fujian 352100,China;School of Computer Science and Technology,Harbin Institute of Technology,Harbin 150001,China;College of Software,Xiamen University,Xiamen,Fujian 361000,China)
出处
《计算机科学》
CSCD
北大核心
2018年第12期298-307,共10页
Computer Science
基金
国家自然科学基金重点项目(U1509216)
国家自然科学基金面上项目(61472099
61602129)
国家重点研发计划项目(2016YFB1000703)
福建省自然科学基金面上项目(2018J01555)
福建省教育厅中青年项目(JAT170653)资助