Funding: Supported by the Fundamental Research Funds for the Central Universities (No. 2020JS005).
Abstract: Motif-based graph local clustering (MGLC) is a popular method for graph mining tasks due to its various applications. However, the traditional two-phase approach of precomputing motif weights before performing local clustering loses locality and is impractical for large graphs. While some attempts have been made to address the efficiency bottleneck, there is still no applicable algorithm for large-scale graphs with billions of edges. In this paper, we propose a purely local and index-free method called Index-free Triangle-based Graph Local Clustering (TGLC*) to solve the MGLC problem w.r.t. a triangle. TGLC* directly estimates the Personalized PageRank (PPR) vector using random walks with the desired triangle-weighted distribution and produces the clustering result using a standard sweep procedure. We demonstrate TGLC*'s scalability through theoretical analysis and its practical benefits through a novel visualization layout. TGLC* is the first algorithm to solve the MGLC problem without precomputing the motif weight. Extensive experiments on seven real-world large-scale datasets show that TGLC* is applicable and scalable for large graphs.
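The two steps named in this abstract (PPR estimation by triangle-weighted random walks, then a sweep) can be illustrated with a minimal Python sketch. It is not the TGLC* algorithm itself: the abstract does not say how the triangle-weighted step is sampled, so the rejection sampler below is an assumption, as are the function names (`triangle_step`, `estimate_ppr`, `sweep`), the defaults `alpha=0.2` and `n_walks=2000`, and the use of ordinary edge conductance in the sweep as a simplification of motif conductance. The graph is assumed to be simple and undirected, stored as a dict of neighbor sets.

```python
import random
from collections import defaultdict


def triangle_step(adj, nbrs, u, rng, max_tries=50):
    """Pick a neighbor v of u with probability proportional to the number of
    triangles containing edge (u, v), using rejection sampling (an assumed
    technique). Returns None if no triangle is found within max_tries."""
    if not nbrs[u]:
        return None
    for _ in range(max_tries):
        v = rng.choice(nbrs[u])
        w = rng.choice(nbrs[u])
        if w in adj[v]:  # u, v, w are pairwise adjacent: a triangle
            return v
    return None


def estimate_ppr(adj, seed, alpha=0.2, n_walks=2000, rng=None):
    """Monte Carlo PPR estimate: each walk stops with probability alpha per
    step, and its end node receives 1 / n_walks of the mass."""
    rng = rng or random.Random(0)
    nbrs = {u: sorted(vs) for u, vs in adj.items()}
    ppr = defaultdict(float)
    for _ in range(n_walks):
        u = seed
        while rng.random() >= alpha:
            v = triangle_step(adj, nbrs, u, rng)
            if v is None:  # no triangle found: the walk ends here
                break
            u = v
        ppr[u] += 1.0 / n_walks
    return dict(ppr)


def sweep(adj, ppr):
    """Sweep over nodes sorted by degree-normalized PPR and return the prefix
    with the lowest edge conductance (a simplification, see lead-in)."""
    total_vol = sum(len(vs) for vs in adj.values())
    order = sorted(ppr, key=lambda v: ppr[v] / max(len(adj[v]), 1), reverse=True)
    in_set, vol, cut = set(), 0, 0
    best, best_phi = [], float("inf")
    for i, v in enumerate(order):
        inside = sum(1 for x in adj[v] if x in in_set)
        cut += len(adj[v]) - 2 * inside  # new cut edges minus edges now internal
        vol += len(adj[v])
        in_set.add(v)
        denom = min(vol, total_vol - vol)
        if denom > 0 and cut / denom < best_phi:
            best_phi, best = cut / denom, order[: i + 1]
    return best, best_phi
```

Because a neighbor is accepted only when it closes a triangle with a second sampled neighbor, edges that lie in no triangle are never traversed, and no per-edge motif weight is computed ahead of time.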
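Abstract: This paper proposes a method for estimating actual evapotranspiration (ET) over irrigated farmland in Hebei using data from FengYun-2C (FY-2C), China's first operational geostationary meteorological satellite, combined with Moderate Resolution Imaging Spectroradiometer (MODIS) products. Bands 1 and 2 of FY-2C are used to retrieve the regional land surface temperature, which is then combined with the 16-day MODIS composite vegetation index product (MOD13) to obtain the triangular space of land surface temperature versus vegetation index (Ts-NDVI). From the relationship in the Ts-NDVI space, an improved triangle algorithm yields the regional evaporative fraction (EF). Finally, the daily net radiation estimated from the MODIS surface reflectance product MCD43 is combined with EF, and the daily actual ET of the region is computed from the energy balance. Comparison with ground lysimeter observations shows that the estimated EF and daily ET are reasonable, with errors within an acceptable range. In addition, the temporal resolution of FY-2C is a strong advantage for estimating surface ET, which provides favorable conditions for obtaining multiple cloud-free ET maps.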
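The last step of this abstract, converting the evaporative fraction and daily net radiation into daily ET through the energy balance, can be sketched as below. The abstract gives no formula, so this is an illustrative assumption: the function name `daily_et_mm`, the choice to neglect the daily soil heat flux by default, and the latent heat of vaporization of 2.45 MJ/kg are not taken from the source. The improved triangle algorithm that produces EF is not reproduced here.

```python
# Latent heat of vaporization of water, MJ per kg (a common approximation
# near 20 degrees C; assumed here, not given in the abstract).
LATENT_HEAT_MJ_PER_KG = 2.45


def daily_et_mm(ef, rn_daily_mj_m2, g_daily_mj_m2=0.0):
    """Daily actual ET in mm/day from the evaporative fraction EF.

    ef              evaporative fraction, LE / (Rn - G), between 0 and 1
    rn_daily_mj_m2  daily net radiation in MJ per square metre per day
    g_daily_mj_m2   daily soil heat flux, assumed negligible by default
    """
    if not 0.0 <= ef <= 1.0:
        raise ValueError("evaporative fraction must lie in [0, 1]")
    available_energy = rn_daily_mj_m2 - g_daily_mj_m2
    latent_heat_flux = ef * available_energy  # MJ m^-2 day^-1
    # 1 kg of water spread over 1 square metre is a 1 mm layer.
    return latent_heat_flux / LATENT_HEAT_MJ_PER_KG
```

For example, under these assumptions an EF of 0.5 with a daily net radiation of 12.25 MJ m^-2 gives about 2.5 mm/day.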
Funding: Supported by the Fundamental Research Funds for the Central Universities (No. 2020JS005).
Abstract: Motif-based graph local clustering (MGLC) algorithms are generally designed with a two-phase framework, which computes the motif weight for each edge beforehand and then runs the local clustering algorithm on the weighted graph to output the result. Although correct, this framework has both practical and theoretical limitations and is less applicable in real interactive settings. This research develops a purely local and index-adaptive method, Index-adaptive Triangle-based Graph Local Clustering (TGLC+), to solve the MGLC problem w.r.t. the triangle. TGLC+ adaptively combines the approximate Monte Carlo method Triangle-based Random Walk (TRW) and the deterministic brute-force method Triangle-based Forward Push (TFP) to estimate the Personalized PageRank (PPR) vector without calculating the exact triangle-weighted transition probability, and then outputs the clustering result via the standard sweep procedure. This paper establishes the efficiency of TGLC+ through theoretical analysis and demonstrates its effectiveness through extensive experiments. To our knowledge, TGLC+ is the first to solve the MGLC problem without computing the motif weight beforehand, thus achieving better efficiency with comparable effectiveness. TGLC+ is suitable for large-scale and interactive graph analysis tasks, including visualization, system optimization, and decision-making.
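For readers unfamiliar with forward push, the following Python sketch shows the generic residual-pushing scheme for PPR applied to triangle weights. It should not be read as the paper's TFP: the abstract gives no details, and unlike the method described there, this sketch does compute exact triangle counts, although only lazily for nodes that the push actually reaches. The names `triangle_weights` and `triangle_forward_push` and the defaults `alpha=0.2` and `eps=1e-4` are assumptions, and the graph is assumed to be a dict of neighbor sets.

```python
from collections import deque


def triangle_weights(adj, u):
    """Triangle count of each edge (u, v), computed on demand for node u only.
    Edges that lie in no triangle are omitted."""
    weights = {}
    for v in adj[u]:
        common = len(adj[u] & adj[v])
        if common:
            weights[v] = common
    return weights


def triangle_forward_push(adj, seed, alpha=0.2, eps=1e-4):
    """Generic forward push on the triangle-weighted graph.
    Returns (p, r): the PPR estimate and the leftover residual.
    The invariant sum(p) + sum(r) == 1 holds throughout."""
    p, r = {}, {seed: 1.0}
    cache = {}
    queue, queued = deque([seed]), {seed}
    while queue:
        u = queue.popleft()
        queued.discard(u)
        if u not in cache:
            cache[u] = triangle_weights(adj, u)
        weights = cache[u]
        degree = sum(weights.values())
        residual = r.get(u, 0.0)
        if degree == 0:
            # No triangle at u: the walk cannot leave, keep all mass here.
            p[u] = p.get(u, 0.0) + residual
            r[u] = 0.0
            continue
        if residual <= eps * degree:
            continue  # below the push threshold
        p[u] = p.get(u, 0.0) + alpha * residual
        r[u] = 0.0
        for v, w in weights.items():
            r[v] = r.get(v, 0.0) + (1.0 - alpha) * residual * w / degree
            if v not in queued:
                queue.append(v)
                queued.add(v)
    return p, r
```

The returned estimate `p` could then be passed to a sweep procedure; a smaller `eps` gives a tighter estimate at the cost of more pushes.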