Abstract
First, a transition probability matrix is built from the meta-paths and commuting matrices of the network, combining the first-order and second-order similarity of the nodes. A denoising auto-encoder then reduces the dimension of this matrix to obtain the node representations of the heterogeneous information network. Finally, the node representations are classified with a gradient boosting decision tree (GBDT) to obtain the classification accuracy under different training-set percentages, the clustering quality is evaluated with normalized mutual information (NMI), and t-SNE is used for visualization. Experiments were performed on the DBLP and AMiner datasets, comparing the proposed heterogeneous information network representation learning based on transition probability matrix (HINtpm) with DeepWalk, node2vec and metapath2vec. On the node classification task, HINtpm improved accuracy by up to 24% over DeepWalk, and it improved the clustering index NMI by up to 13% over DeepWalk.
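The first step described above can be illustrated with a small sketch. It assumes the standard definition of a commuting matrix as the product of the adjacency matrices along a meta-path, followed by row normalization to obtain transition probabilities. The toy matrices `W_AP` (author-paper) and `W_PC` (paper-conference), the meta-path author-paper-conference-paper-author, and the helper names are all hypothetical; the abstract does not state how first-order and second-order similarity are combined, so that step is not shown.

```python
import numpy as np


def commuting_matrix(*adjacency):
    """Multiply the adjacency matrices along a meta-path, in order."""
    result = np.asarray(adjacency[0])
    for w in adjacency[1:]:
        result = result @ np.asarray(w)
    return result


def row_normalize(counts):
    """Convert non-negative path counts into row-wise transition probabilities."""
    counts = np.asarray(counts, dtype=float)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows without any path instance stay all-zero instead of dividing by zero.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


# Hypothetical toy network: 3 authors, 3 papers, 2 conferences.
W_AP = np.array([[1, 1, 0],   # author 0 wrote papers 0 and 1
                 [0, 1, 0],   # author 1 wrote paper 1
                 [0, 0, 1]])  # author 2 wrote paper 2
W_PC = np.array([[1, 0],      # papers 0 and 1 appeared at conference 0
                 [1, 0],
                 [0, 1]])     # paper 2 appeared at conference 1

# Meta-path author-paper-conference-paper-author.
M = commuting_matrix(W_AP, W_PC, W_PC.T, W_AP.T)
P = row_normalize(M)
print(M)
print(P)
```

Each entry of `M` counts the path instances between two authors along the meta-path, and each non-empty row of `P` sums to 1. In the pipeline the abstract describes, a matrix of this kind would be the input to the denoising auto-encoder.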
Authors
ZHAO Ting-ting (赵廷廷)
WANG Zhe (王喆)
LU Yi-nan (卢奕南)
College of Computer Science and Technology, Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education, Jilin University, Changchun 130012, China
Source
《浙江大学学报(工学版)》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2019, No. 3, pp. 548-554 (7 pages)
Journal of Zhejiang University:Engineering Science
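Funding
National Natural Science Foundation of China (61472519)
Natural Science Foundation of the Jilin Provincial Department of Science and Technology (20180101036JC)
Jilin Province Science and Technology Development Plan (20180101054JC)
Keywords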
network representation learning
heterogeneous information network (HIN)
transition probability matrix
meta-path
node similarity
auto-encoder