摘要
区分子图可以用来描述复杂的图数据结构和构建高效的图分类模型。提出了多样性度量的Top-K区分子图挖掘问题,避免了挖掘结果之间出现高度相关的子图模式,提高了区分子图模式的可用性。通过组合图结构相似性与支持集相似性约束,给出图模式的多样性度量标准。提出两个高效算法Greedy-TopK和LeapTopK挖掘多样性度量的Top-K区分子图。Greedy-TopK算法采用两阶段的增量式贪婪方法快速挖掘K个区分子图模式。Leap-TopK算法通过在挖掘过程中限制扩展结构相似的图模式,实现了跳跃搜索子图模式空间。实验结果表明,Leap-TopK算法的效率明显优于Greedy-TopK算法;在可用性方面,利用Leap-TopK算法与Greedy-TopK算法挖掘结果构建的图分类器具有相似的分类精度,且都优于传统区分子图挖掘算法产生的结果。
Discriminative subgraph can be used to characterize complex graph structures and construct efficient graph classification model.This paper proposes the Top-K discriminative subgraph mining problem based on diversity measure.The diversity measure can be used to mine low correlation subgraph patterns in the mining result,which can enhance the usefulness of the discriminative subgraph patterns.By exploiting the graph structure similarity and support set similarity restraints,this paper introduces the criterion of graph pattern diversity measure.Then this paper proposes two efficient algorithms,Greedy-TopK and Leap-TopK,for the problem.Greedy-TopK algorithm employs two step strategies to incrementally and greedily mine K discriminative subgraphs.By limiting the structure similarity graph pattern extension in the discriminative subgraph mining process,Leap-TopK algorithm can leap the graph pattern searching space.Extensive experimental results demonstrate that Leap-TopK algorithm is more efficient than Greedy-TopK algorithm.Besides,when the mining results of discriminative subgraphs are considered,the classification accuracies of the two algorithms are almost the same.But they are all superior to the traditional discriminative subgraph mining algorithm in terms of usefulness.
作者
王章辉
赵宇海
王国仁
李源
WANG Zhanghui;ZHAO Yuhai;WANG Guoren;LI Yuan(School of Computer Science and Engineering, Northeastern University, Shenyang 110819, China)
出处
《计算机科学与探索》
CSCD
北大核心
2017年第9期1379-1388,共10页
Journal of Frontiers of Computer Science and Technology
基金
国家自然科学基金(Nos.61272182
61332014
61173029)
国家自然科学重点项目(No.U1401256)
中央高校基本科研业务费专项资金(No.N150402002)~~
关键词
图挖掘
图分类
子图模式
区分子图
多样性
graph mining
graph classification
subgraph pattern
discriminative subgraph
diversity