
Research on Search Ranking Methods for Sina Weibo
Abstract: This paper examines search ranking algorithms for microblogs based on the vector space model (VSM) and latent semantic analysis (LSA), using Sina Weibo as the case study. An experimental system was built, test data were obtained through the APIs provided by the Sina Weibo open platform, and a worked sample illustrates how VSM and LSA process the data. Sina Weibo's existing ranking methods usually fail to return satisfactory relevance-ordered results; like many microblogging platforms, they rely on simple string matching and descending post time. The proposed method segments posts into words, vectorizes them, and builds an "index term-post" matrix, on which latent semantic analysis is then performed. Measuring the relevance of a post to a query thus becomes computing the similarity between the post vector and the query vector, taken as the cosine between the two vectors after SVD decomposition. Processing of posts and queries is thereby reduced to vector operations in a low-dimensional vector space, and posts are listed in order of their relevance to the query. The experimental results indicate that the LSA-based ranking algorithm effectively improves retrieval of posts and places relevant posts at the top of the ranked list.
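The vector space step described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes posts and queries are already segmented into tokens, uses raw term counts as weights, and the names `build_term_post_matrix`, `cosine`, and `rank_posts` are hypothetical. The LSA step (truncated SVD of the matrix before comparing vectors) is deliberately left out.

```python
import math
from collections import Counter


def build_term_post_matrix(posts):
    """Return (vocab, matrix) where matrix[i][j] counts term i in post j."""
    vocab = sorted({term for post in posts for term in post})
    counts = [Counter(post) for post in posts]
    matrix = [[c[term] for c in counts] for term in vocab]
    return vocab, matrix


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_posts(posts, query):
    """Return [(post_index, score), ...] sorted by descending cosine similarity."""
    vocab, matrix = build_term_post_matrix(posts)
    q_counts = Counter(query)
    # Query terms outside the vocabulary are ignored (assumption of this sketch).
    q_vec = [q_counts[term] for term in vocab]
    scored = []
    for j in range(len(posts)):
        post_vec = [row[j] for row in matrix]  # column j of the term-post matrix
        scored.append((j, cosine(post_vec, q_vec)))
    # Highest score first; ties keep the lower post index first.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored


# Hypothetical pre-segmented posts and query, for illustration only.
posts = [
    ["weibo", "search", "ranking"],
    ["weather", "today", "sunny"],
    ["search", "engine", "ranking", "ranking"],
]
ranking = rank_posts(posts, ["search", "ranking"])
```

Here `ranking` orders the posts by relevance to the query instead of by post time. In an LSA variant, the post and query vectors would first be projected into the low-dimensional space obtained from a truncated SVD of the term-post matrix, and the cosine would be computed there.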
Source: Journal of Changzhou University (Natural Science Edition), CAS, 2013, No. 3, pp. 71-75 (5 pages)
Funding: National Natural Science Foundation of China (61003163); Jiangsu Provincial Department of Science and Technology project (BZ2010021)
Keywords: Weibo; vector space model; latent semantic analysis; search ranking