Abstract
To address the performance degradation of paper citation prediction methods when features are sparse, this paper proposes a method based on heterogeneous feature fusion that uses fixed-length features, citation network features, and citation time series features simultaneously, which effectively improves prediction accuracy. First, a citation attribute network is defined for the paper citation prediction task to model the three types of heterogeneous features. Second, a paper citation prediction method oriented toward heterogeneous feature fusion is proposed: a graph neural network processes the fixed-length features and citation network features, a recurrent neural network processes the citation time series features, and a multi-head attention mechanism fuses the extracted heterogeneous features to predict the number of citations. Experiments on large-scale real-world datasets show that the proposed method makes effective use of multiple heterogeneous features and alleviates the data sparsity problem, with a root mean square error (RMSE) 0.31 lower than that of the best baseline method.
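As a rough illustration of two ideas named in the abstract, the sketch below fuses several feature vectors with attention weights and computes the RMSE metric. It is not the authors' implementation: it uses a single attention head in plain Python rather than a trained multi-head layer, and the function names `attention_fuse` and `rmse`, the query vector, and all toy numbers are assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(query, features):
    """Fuse equal-length feature vectors with scaled dot-product attention.

    Single-head simplification (assumption): each feature vector acts as
    both key and value, and `query` is a hypothetical query vector.
    """
    dim = len(query)
    scores = [
        sum(q * k for q, k in zip(query, feat)) / math.sqrt(dim)
        for feat in features
    ]
    weights = softmax(scores)
    fused = [
        sum(w * feat[i] for w, feat in zip(weights, features))
        for i in range(dim)
    ]
    return fused, weights


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Toy embeddings standing in for the three feature types (made-up values).
fixed_length_feat = [0.2, 0.1, 0.4]
network_feat = [0.5, 0.3, 0.1]
time_series_feat = [0.9, 0.7, 0.2]

fused, weights = attention_fuse(
    query=[1.0, 0.0, 0.0],
    features=[fixed_length_feat, network_feat, time_series_feat],
)
print("attention weights:", weights)
print("fused vector:", fused)
print("RMSE on toy counts:", rmse([3, 0, 12], [2, 1, 10]))
```

In a real model the query, keys, and values would come from learned projections, and the fused vector would feed a regression layer that outputs the predicted citation count.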
Authors
ZHU Danhao (朱丹浩)
HUANG Xiaoyu (黄肖宇)
Affiliations: Department of Criminal Science and Technology, Jiangsu Police Institute, Nanjing 210031, China; Department of Computer Information and Network Security, Nanjing 210031, China
Source
Journal of Data Acquisition and Processing (《数据采集与处理》)
CSCD
Peking University Core Journals (北大核心)
2022, No. 5, pp. 1134-1144 (11 pages)
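Funding
National Natural Science Foundation of China (71974094)
Jiangsu Provincial Social Science Foundation (19TQD002)
Natural Science Project of the Jiangsu Provincial Department of Education (21JHB520004)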
Keywords
citation prediction
recurrent neural network
graph neural network
heterogeneous features
attention