Abstract
Existing overlapping community detection methods usually consider only the topological structure of nodes and ignore node attribute information, so important structure in the data is missed. To address this problem, this paper proposes an overlapping community detection algorithm based on node topology and attribute similarity, which finds overlapping communities starting from a seed vertex. First, it computes the similarity between candidate nodes and the local community using cosine similarity, which improves the efficiency of local search. Second, it improves the method for computing the local modularity increment, so that the local search model converges toward the potential ground-truth communities. Third, it merges multiple detected local communities to compute a membership matrix, which yields the global overlapping community structure of the graph. Finally, the algorithm is compared experimentally with existing topology-based community detection algorithms on real-world data sets. The results show that the algorithm performs well on modularity and F1-measure and is better suited to sparse networks.
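Two of the steps named in the abstract, the cosine similarity between a candidate node and a local community and the membership matrix built from several local communities, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not say how a community's attributes are summarized or how membership values are computed, so the sketch assumes a community is represented by the mean (centroid) of its members' attribute vectors and that membership is binary. All function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length attribute vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here: a zero vector is similar to nothing
    return dot / (norm_u * norm_v)


def community_centroid(attributes, community):
    """Mean attribute vector of a community (an assumption of this sketch)."""
    members = [attributes[n] for n in community]
    dim = len(members[0])
    return [sum(vec[i] for vec in members) / len(members) for i in range(dim)]


def node_community_similarity(attributes, node, community):
    """Similarity between a candidate node and a local community."""
    centroid = community_centroid(attributes, community)
    return cosine_similarity(attributes[node], centroid)


def membership_matrix(nodes, local_communities):
    """Binary node-by-community matrix (assumed form): entry is 1 if the node
    belongs to that local community. A row with more than one 1 marks an
    overlapping node."""
    return [[1 if n in c else 0 for c in local_communities] for n in nodes]


# Toy data: attribute vectors for four nodes.
attributes = {"a": [1, 0], "b": [1, 0], "c": [0, 1], "d": [1, 0]}
local = {"a", "b"}

sim_d = node_community_similarity(attributes, "d", local)  # same attributes
sim_c = node_community_similarity(attributes, "c", local)  # orthogonal attributes

matrix = membership_matrix(["a", "b", "c"], [{"a", "b"}, {"b", "c"}])
```

In this toy example node `d` shares the community's attributes and node `c` does not, so a local search that ranks candidates by this score would consider `d` first. Node `b` appears in both local communities, which is what makes the resulting structure overlapping.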
Authors
Xu Jiashu (许加书)
Han Zhongyuan (韩忠愿)
Gu Huijian (顾惠健)
College of Information & Engineering, Nanjing University of Finance & Economics, Nanjing 210046, China
Source
《计算机应用研究》
CSCD
Peking University Core Journals (北大核心)
2016, No. 12, pp. 3615-3619 (5 pages)
Application Research of Computers
Keywords
community detection
node attribute
overlapping communities
membership matrix
modularity