Abstract
Traditional link prediction models usually consider only the link information of the nodes in a network. However, the text that is widespread in social networks can improve the accuracy of link prediction, and using text content for this purpose is receiving increasing attention. Combining text context with network links, this paper proposes a link prediction model based on the hierarchical latent Dirichlet allocation (hLDA) topic model. The model first trains hLDA on the text data and extracts text similarity features from the converged topic tree. It then trains a support vector machine on these features to obtain a binary classifier, which can be used to predict the likelihood of a link between one text and another. Experimental results show that, compared with existing related models, the proposed model improves the accuracy of predicting links between documents in a text network.
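The feature-extraction step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which similarity features are taken from the topic tree, so the two features below (the number of topic nodes two documents share along their root-to-leaf paths, and that count normalized by path length) are assumptions, as are the names `shared_depth`, `pair_features`, and the toy `doc_paths` data. In a full pipeline, feature vectors like these, labeled by whether a link exists, would be passed to a support vector machine to obtain the binary classifier.

```python
from itertools import combinations


def shared_depth(path_a, path_b):
    """Count the leading topic nodes that two root-to-leaf paths have in common."""
    depth = 0
    for node_a, node_b in zip(path_a, path_b):
        if node_a != node_b:
            break
        depth += 1
    return depth


def pair_features(path_a, path_b):
    """Return [shared depth, shared depth / longer path length] for a document pair."""
    depth = shared_depth(path_a, path_b)
    max_len = max(len(path_a), len(path_b))
    return [depth, depth / max_len if max_len else 0.0]


# Hypothetical hLDA output: each document is assigned one path
# (root, level-2 topic, leaf topic) in a three-level topic tree.
doc_paths = {
    "doc1": (0, 1, 4),
    "doc2": (0, 1, 5),
    "doc3": (0, 2, 6),
}

# One feature vector per unordered document pair.
features = {
    (a, b): pair_features(doc_paths[a], doc_paths[b])
    for a, b in combinations(sorted(doc_paths), 2)
}
```

Documents whose paths diverge only at the leaf (here `doc1` and `doc2`) receive a larger shared depth than documents that share only the root, which is the kind of signal a classifier could use to separate linked from unlinked pairs.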
Source
《计算机与数字工程》
2017, No. 10, pp. 1990-1995 (6 pages)
Computer & Digital Engineering
Keywords
link prediction
hierarchical latent Dirichlet allocation
topic tree
text similarity feature
support vector machine