Abstract
Result diversification has recently become an active research area aimed at improving user satisfaction in Web and database search, text summarization, and recommender systems. To the best of our knowledge, almost all existing work considers only semantic diversification strategies. However, result diversification is a very complex optimization problem, and many other strategies may need to be considered, such as freshness, quality, and value. Moreover, the Web is a dynamic information space and users' query needs evolve over time, so many queries can be answered accurately only under a specific temporal pattern. In this paper we propose a novel multidimensional diversification method that combines the semantic and temporal dimensions to generate diversified search results. First, we give a general formal definition of the multidimensional diversification framework. Second, for a given query, we study how to compute the probability distribution of its temporal intents from document, word, and query frequency data. Third, we present a new evaluation measure designed specifically for temporal diversification. Finally, we construct a real-world dataset for the multidimensional diversification problem. Experiments show that our method significantly outperforms current mainstream baseline approaches, both on popular diversification measures and on the temporal diversification measure proposed in this paper.
Source
《计算机学报》
EI
CSCD
PKU Core (Peking University Core Journals)
2015, No. 10, pp. 2076-2091 (16 pages)
Chinese Journal of Computers
Funding
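National Natural Science Foundation of China (61272240, 61103151, 61173068)
Specialized Research Fund for the Doctoral Program of Higher Education of China (20110131110028)
Natural Science Foundation of Shandong Province (ZR2012FM037)
Research Award Fund for Outstanding Young and Middle-aged Scientists of Shandong Province (BS2012DX017)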
Keywords
multidimensional diversity
temporal intent
subtopic
semantic
time
social networks
social computing